Skip to content

Fix query limit #5515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 30, 2025
Merged

Fix query limit #5515

merged 1 commit into from
Apr 30, 2025

Conversation

nickva
Copy link
Contributor

@nickva nickva commented Apr 24, 2025

Previously, we set a default limit that was an effective infinity (2^28). It seems back in the 32 bit days that was the Erlang's largest small integer [1]. However, that turned out to be too low and it surprised a user when it truncated their all_docs output skipping some of the data. Fix that by increasing the limit to a larger "infinity" (highest 64 bit Erlang small integer [1]).

We did have a "query_limit" config parameter to customize the limit, however that turned out to be broken and did not take effect when the user tried it for all_docs, so fix that as well. Test to ensure the limit gets reduced appropriately. To make the setting more user friendly, allow infinity as the value.

Also, in the case of all_docs, we validated args and applied the limit check twice: once in the coordinator and another time on each worker, which wasted CPU resources and made things a bit confusing. To fix that, remove the validation from the common worker code in couch_mrview and validate once: either on the coordinator side, or local (port 5986) callback, right in the http callback.

[1] https://www.erlang.org/doc/system/memory.html

Fix #5176

@nickva nickva force-pushed the fix-query-limit-for-all-docs branch 9 times, most recently from 80ef735 to 370c1c0 Compare April 26, 2025 04:27
@nickva nickva force-pushed the fix-query-limit-for-all-docs branch from 370c1c0 to a37253c Compare April 27, 2025 18:32
Copy link
Contributor

@jaydoane jaydoane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement, but I'm a little worried that you've moved validation around without explicitly testing that validation. Is it worth adding a couple more tests to tighten that up?

?MAX_VIEW_LIMIT
),
MaxLimit =
case config:get("query_server_config", LimitType, "infinity") of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice ergonomic improvement!

@nickva
Copy link
Contributor Author

nickva commented Apr 27, 2025

Good idea, Jay, to add a few more tests. I assumed we tested those somewhere else already but it's apparently not the case.

@nickva nickva force-pushed the fix-query-limit-for-all-docs branch 3 times, most recently from 876a7a9 to ef4354d Compare April 30, 2025 05:00
Previously, we set a default limit that was an effective infinity (2^28). It
seems back in the 32 bit days that was the Erlang's largest small integer [1].
However, that turned out to be too low and it surprised a user when it
truncated their all_docs output skipping some of the data. Fix that by
increasing the limit to a larger "infinity" (highest 64 bit Erlang small
integer [1]).

We did have a "query_limit" config parameter to customize the limit, however
that turned out to be broken and did not take effect when the user tried it for
all_docs, so fix that as well. Fix that and use a test to ensure the limit gets
reduced appropriately. To make the setting more user friendly, allow `infinity`
as the value.

Also, in the case of all_docs, we validated args and applied the limit check
twice: once in the coordinator and another time on each worker, which wasted
CPU resources and made things a bit confusing. To fix that, remove the
validation from the common worker code in couch_mrview and validate once:
either on the coordinator side, or local (port 5986) callback, right in the
http callback.

[1] https://www.erlang.org/doc/system/memory.html

Fix #5176
@nickva nickva force-pushed the fix-query-limit-for-all-docs branch from ef4354d to caa4f9b Compare April 30, 2025 05:08
@nickva nickva merged commit 7aa8a4e into main Apr 30, 2025
24 checks passed
@nickva nickva deleted the fix-query-limit-for-all-docs branch April 30, 2025 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query exporting only 2ˆ28 documents (data have more) even after setting query limit to 2ˆ29
2 participants