Skip to content

Incorrect filtering for numeric values with different decimal precision #33206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
3 tasks
vddavydova opened this issue Apr 22, 2025 · 11 comments · Fixed by suraj-mandal/superset#1 · May be fixed by #33222 or #33230
Open
3 tasks

Incorrect filtering for numeric values with different decimal precision #33206

vddavydova opened this issue Apr 22, 2025 · 11 comments · Fixed by suraj-mandal/superset#1 · May be fixed by #33222 or #33230
Assignees
Labels
good first issue Good first issues for new contributors

Comments

@vddavydova
Copy link

Bug description

Summary

There is an issue with filtering numeric values in Superset when values have different numbers of digits after the decimal point.

Steps to Reproduce

  1. Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35, etc.).

Image

  1. First select an integer value (for example, 21) in the filter, and then fractional values.

Image

  1. The query returns only the rows where the value is exactly 21 (i.e., without any decimal places), excluding 21.8, 25.35, etc.

Image

Image

  1. Now, if you select a decimal value first (e.g., 21.8 or 25.35), the filtering works as expected, returning all matching rows regardless of decimal precision.

Image

Image

Expected Behavior

Filtering by numeric value should match all equivalent numeric representations (e.g., 21, 21.8, 25.35) regardless of how many decimal places are used.

Actual Behavior

Filtering is inconsistent depending on whether an integer or decimal value is selected first. Selecting an integer leads to an overly strict match that excludes decimal representations of the same value.

Databases Affected

This issue has been observed across all three databases tested:

  • ClickHouse
  • Impala
  • PostgreSQL

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

I don't know

Browser

Not applicable

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Copy link

dosubot bot commented Apr 22, 2025

The issue you're experiencing with numeric filtering in Superset, where selecting an integer value first results in an overly strict match that excludes decimal representations, might be related to how Superset handles numeric values and filters.

Superset has a NumericalRangeFilter component that manages numeric range inputs and ensures that the minimum value is strictly less than the maximum value [1]. Additionally, the getRangeExtraFormData function constructs filter clauses for numeric ranges, adding conditions for lower and upper bounds [2]. However, there might be a need to review how these components interact with different database backends like ClickHouse, Impala, and PostgreSQL, especially regarding decimal precision.

Furthermore, the handle_single_value function in superset/models/helpers.py checks if the value is a float or int and casts the string representation of a numeric value to a number for backward compatibility [3]. This might be relevant to your issue, as it suggests that numeric values are explicitly handled or transformed when applying filters.

Given the behavior you've described, it might be worth investigating whether the numeric filtering logic in Superset is correctly handling decimal precision across different database backends. You may want to explore the code related to numeric filtering and see if adjustments are needed to ensure consistent behavior regardless of the initial value selected.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

@geido
Copy link
Member

geido commented Apr 22, 2025

Feels like a good first issue to me

@geido geido added the good first issue Good first issues for new contributors label Apr 22, 2025
@suraj-mandal
Copy link

Can I work on this issue? I am working superset and would like to contribute to the product

@rusackas
Copy link
Member

@suraj-mandal go for it! I'll happily assign it to you if you want to call dibs on it :D Thank you for contributing!

@rusackas rusackas assigned suraj-mandal and unassigned kgabryje and geido Apr 23, 2025
@suraj-mandal
Copy link

Thanks, getting right into it!

@Sayan199
Copy link

Hi @suraj-mandal, Can you please include me as well? I would love to collaborate with you on this issue.

@suraj-mandal
Copy link

Sure

@Sayan199 Sayan199 linked a pull request Apr 23, 2025 that will close this issue
9 tasks
@suraj-mandal
Copy link

@dosu, can you tell me how to reproduce this issue?

Copy link

dosubot bot commented Apr 24, 2025

To reproduce the issue with numeric filtering in Superset, follow these steps:

  1. Set Up a Filter:

    • Add a filter for a numeric field that contains both integer and decimal values (e.g., 21, 21.8, 25.35).
  2. Select an Integer Value First:

    • In the filter, first select an integer value (e.g., 21).
    • Observe that the query returns only the rows where the value is exactly 21, excluding decimal values like 21.8 and 25.35.
  3. Select a Decimal Value First:

    • Now, select a decimal value first (e.g., 21.8 or 25.35).
    • Observe that the filtering works as expected, returning all matching rows regardless of decimal precision.

This issue has been observed across multiple databases, including ClickHouse, Impala, and PostgreSQL. The expected behavior is that filtering by numeric value should match all equivalent numeric representations, regardless of the number of decimal places.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

@suraj-mandal
Copy link

@dosu can you tell me the dataset that has been used here?

@suraj-mandal
Copy link

@Sayan199 , I tested your file in my local development server, and there are some formatting issues. Moreover, I tested it on my server, but I can see that sometimes inconsistencies are coming. Do check once!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good first issues for new contributors
Projects
None yet
6 participants