-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix query limit #5515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix query limit #5515
Conversation
ff77094
to
80ef735
Compare
Previously, we set a default limit that was an effective infinity (2^28). It seems back in the 32 bit days that was the Erlang's largest small integer [1]. However, that turned out to be too low and it surprised a user when it truncated their all_docs output skipping some of the data. Fix that by increasing the limit to a larger "infinity" (highest 64 bit Erlang small integer [1]). We did have a "query_limit" config parameter to customize the limit, however that turned out to be broken and did not take effect when the user tried it for all_docs, so fix that as well. Fix that and use a test to ensure the limit gets reduced appropriately. To make the setting more user friendly, allow `infinity` as the value. Also, in the case of all_docs, we validated args and applied the limit check twice: once in the coordinator and another time on each worker, which wasted CPU resources and made things a bit confusing. To fix that, remove the validation from the common worker code in couch_mrview and validate once: either on the coordinator side, or local (port 5986) callback, right in the http callback. [1] https://www.erlang.org/doc/system/memory.html Fix #5176
80ef735
to
370c1c0
Compare
@@ -279,20 +279,20 @@ query_all_docs(Db, Args) -> | |||
query_all_docs(Db, Args, fun default_cb/2, []). | |||
|
|||
query_all_docs(Db, Args, Callback, Acc) when is_list(Args) -> | |||
query_all_docs(Db, to_mrargs(Args), Callback, Acc); | |||
Args1 = couch_mrview_util:validate_all_docs_args(Db, to_mrargs(Args)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clause is not called from a clustered endpoint, only from a test.
query_all_docs(Db, Args0, Callback, Acc) -> | ||
Sig = couch_util:with_db(Db, fun(WDb) -> | ||
{ok, Info} = couch_db:get_db_info(WDb), | ||
couch_index_util:hexsig(couch_hash:md5_hash(?term_to_bin(Info))) | ||
end), | ||
Args1 = Args0#mrargs{view_type = map}, | ||
Args2 = couch_mrview_util:validate_all_docs_args(Db, Args1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove validation here and validate once in fabric or in couch*http modules instead
@@ -54,7 +54,8 @@ | |||
view_states=nil | |||
}). | |||
|
|||
-define(MAX_VIEW_LIMIT, 16#10000000). | |||
% Largest 64 bit small int: 2^59-1 | |||
-define(MAX_VIEW_LIMIT, 16#7FFFFFFFFFFFFFF). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use infinity
in .ini files to make it more user-friendly.
@@ -948,14 +948,12 @@ multi_all_docs_view(Req, Db, OP, Queries) -> | |||
|
|||
all_docs_view(Req, Db, Keys, OP) -> | |||
Args0 = couch_mrview_http:parse_body_and_query(Req, Keys), | |||
Args1 = Args0#mrargs{view_type = map}, | |||
Args2 = fabric_util:validate_all_docs_args(Db, Args1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation now happens in fabric:all_docs()
Previously, we set a default limit that was an effective infinity (2^28). It seems back in the 32 bit days that was the Erlang's largest small integer [1]. However, that turned out to be too low and it surprised a user when it truncated their all_docs output skipping some of the data. Fix that by increasing the limit to a larger "infinity" (highest 64 bit Erlang small
integer [1]).
We did have a "query_limit" config parameter to customize the limit, however that turned out to be broken and did not take effect when the user tried it for all_docs, so fix that as well. Fix that and use a test to ensure the limit gets reduced appropriately. To make the setting more user friendly, allow
infinity
as the value.Also, in the case of all_docs, we validated args and applied the limit check twice: once in the coordinator and another time on each worker, which wasted CPU resources and made things a bit confusing. To fix that, remove the validation from the common worker code in couch_mrview and validate once: either on the coordinator side, or local (port 5986) callback, right in the http callback.
[1] https://www.erlang.org/doc/system/memory.html
Fix #5176