Teach nbtree scans of composite indexes to opportunistically skip over
irrelevant sections of the index when the query omits a condition on a
prefix column. When nbtree is passed input scan keys derived from a
query predicate "WHERE b = 5", new nbtree preprocessing steps now output
"WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys. That
is, preprocessing generates a "skip array" (along with an associated
scan key) for the omitted column "a", which makes it safe to mark the
scan key on "b" as required to continue the scan. This is far more
efficient than a traditional full index scan whenever the scan can skip
over many irrelevant leaf pages by repeatedly repositioning
itself using the keys on "a" and "b" together.
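
The repositioning described above can be sketched in a few lines of C.
This is a toy model under stated assumptions, not nbtree code: the index
is a sorted in-memory array, each binary search stands in for one
descent of the tree, and IndexEntry, first_at_or_after and
skip_scan_count are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one tuple of a composite index on (a, b). */
typedef struct
{
    int a;
    int b;
} IndexEntry;

/*
 * Binary search for the first entry >= (a, b) in index order.  Each call
 * stands in for one descent of the tree (one repositioning of the scan).
 */
static size_t
first_at_or_after(const IndexEntry *idx, size_t n, int a, int b)
{
    size_t lo = 0;
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (idx[mid].a < a || (idx[mid].a == a && idx[mid].b < b))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Count entries with b == target_b without reading every entry: for each
 * distinct "a", reposition to (a, target_b), read the matches, then jump
 * to the first entry of the next "a" group.  *descents reports how many
 * times the scan was repositioned.
 */
size_t
skip_scan_count(const IndexEntry *idx, size_t n, int target_b,
                size_t *descents)
{
    size_t matches = 0;
    size_t pos = 0;

    assert(descents != NULL);
    *descents = 0;
    while (pos < n)
    {
        int cur_a = idx[pos].a;

        pos = first_at_or_after(idx, n, cur_a, target_b);
        (*descents)++;
        while (pos < n && idx[pos].a == cur_a && idx[pos].b == target_b)
        {
            matches++;
            pos++;
        }
        if (cur_a == INT_MAX)
            break;              /* no larger "a" value can exist */
        pos = first_at_or_after(idx, n, cur_a + 1, INT_MIN);
        (*descents)++;
    }
    return matches;
}
```

In this model the number of repositionings grows with the number of
distinct "a" values rather than with the number of entries, which is why
skipping only helps when each "a" group spans many entries.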

A skip array has "elements" that are generated procedurally and on
demand, but otherwise works just like a regular ScalarArrayOp array.
Preprocessing can freely add a skip array before or after any input
ScalarArrayOp arrays. Index scans with a skip array decide when and
where to reposition the scan using the same approach as any other scan
with array keys. This design builds on the design for array advancement
and primitive scan scheduling added to Postgres 17 by commit 5bf748b.
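
To illustrate how a procedurally generated array can sit behind the same
interface as a materialized one, here is a minimal sketch. The ArrayKey
type and the array_key_* functions are hypothetical and exist only for
this example; they are not the nbtree data structures.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    KEY_SAOP,                   /* elements stored up front */
    KEY_SKIP                    /* elements generated on demand */
} ArrayKeyKind;

/* Hypothetical array key; "cur" is the current element for both kinds. */
typedef struct
{
    ArrayKeyKind kind;
    const int  *elems;          /* KEY_SAOP only: sorted elements */
    size_t      nelems;         /* KEY_SAOP only */
    size_t      cur_idx;        /* KEY_SAOP only */
    int         cur;
} ArrayKey;

void
array_key_init_saop(ArrayKey *k, const int *elems, size_t nelems)
{
    assert(nelems > 0);
    k->kind = KEY_SAOP;
    k->elems = elems;
    k->nelems = nelems;
    k->cur_idx = 0;
    k->cur = elems[0];
}

void
array_key_init_skip(ArrayKey *k, int start)
{
    k->kind = KEY_SKIP;
    k->elems = NULL;
    k->nelems = 0;
    k->cur_idx = 0;
    k->cur = start;
}

/*
 * Advance to the next element.  Returns false once the array is
 * exhausted.  Callers do not need to know which kind of array they hold.
 */
bool
array_key_advance(ArrayKey *k)
{
    switch (k->kind)
    {
        case KEY_SAOP:
            if (k->cur_idx + 1 >= k->nelems)
                return false;
            k->cur = k->elems[++k->cur_idx];
            return true;
        case KEY_SKIP:
            if (k->cur == INT_MAX)
                return false;
            k->cur++;
            return true;
    }
    return false;
}
```

Because both kinds answer the same "advance" request, code that decides
when to reposition a scan can treat them uniformly, which is the
property the paragraph above relies on.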

The core B-Tree operator classes on most discrete types generate their
array elements with the help of their own custom skip support routine.
This infrastructure gives nbtree a way to generate the next required
array element by incrementing (or decrementing) the current array value.
It can reduce the number of index descents in cases where the next
possible indexable value frequently turns out to be the next value
stored in the index. Opclasses that lack a skip support routine fall
back on having nbtree "increment" (or "decrement") a skip array's
current element by setting the NEXT (or PRIOR) scan key flag, without
directly changing the scan key's sk_argument. The flagged keys act as
sentinel values that behave just like any other value from an array --
though they can never
locate equal index tuples (they can only locate the next group of index
tuples containing the next set of non-sentinel values that the scan's
arrays need to advance to).
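
The two ways of advancing a skip array can be contrasted with a small
sketch. SkipElem, SKIP_NEXT, SKIP_PRIOR and the skip_elem_* functions
are hypothetical names loosely modelled on the description above; the
"value" field merely stands in for sk_argument, and integers stand in
for an arbitrary discrete type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on the NEXT/PRIOR scan key flags. */
enum
{
    SKIP_NEXT = 1 << 0,         /* "just after value" */
    SKIP_PRIOR = 1 << 1         /* "just before value" */
};

typedef struct
{
    int value;                  /* stands in for sk_argument */
    int flags;
} SkipElem;

/*
 * With skip support: produce the real next value.  Returns false when
 * the type has no larger value.
 */
bool
skip_elem_increment(SkipElem *e)
{
    assert(e != NULL);
    if (e->value == INT_MAX)
        return false;
    e->value++;
    e->flags = 0;
    return true;
}

/* Without skip support: leave the value alone and only set a flag. */
void
skip_elem_set_next(SkipElem *e)
{
    assert(e != NULL);
    e->flags = SKIP_NEXT;
}

/*
 * Compare the element with an indexed value: < 0 if the element sorts
 * before it, 0 if equal, > 0 if after.  A flagged (sentinel) element
 * never compares equal to any indexed value.
 */
int
skip_elem_compare(const SkipElem *e, int tupval)
{
    assert(e != NULL);
    if (e->flags & SKIP_NEXT)
        return (tupval > e->value) ? -1 : 1;
    if (e->flags & SKIP_PRIOR)
        return (tupval < e->value) ? 1 : -1;
    return (e->value > tupval) - (e->value < tupval);
}
```

An incremented element can match index tuples directly, whereas a
sentinel can only be used to find the first tuple that sorts after (or
before) it.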

Inequality scan keys can affect how skip arrays generate their values:
the range of a skip array is constrained by any inequalities on its
column. For example, a skip
array on "a" will only use element values 1 and 2 given a qual such as
"WHERE a BETWEEN 1 AND 2 AND b = 66". A scan using such a skip array
has almost identical performance characteristics to one with the qual
"WHERE a = ANY('{1, 2}') AND b = 66". The scan will be much faster when
it can be executed as two selective primitive index scans instead of a
single very large index scan that reads many irrelevant leaf pages.
However, the array transformation process won't always lead to improved
performance at runtime. Much depends on physical index characteristics.
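
The effect of inequalities on a skip array's range can be sketched as
follows. SkipRange and the skip_range_* functions are hypothetical, and
the sketch enumerates every integer in the range, which is an assumption
made for simplicity rather than a description of how nbtree visits
values.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds derived from inequalities on the skipped column. */
typedef struct
{
    bool has_low;
    int  low;                   /* from "a >= low" */
    bool has_high;
    int  high;                  /* from "a <= high" */
} SkipRange;

/* First element: the lower bound if there is one, else the type minimum. */
int
skip_range_first(const SkipRange *r)
{
    return r->has_low ? r->low : INT_MIN;
}

/* Next element after cur, or false once the upper bound is reached. */
bool
skip_range_next(const SkipRange *r, int cur, int *next)
{
    int limit = r->has_high ? r->high : INT_MAX;

    if (cur >= limit)
        return false;
    *next = cur + 1;
    return true;
}

/* Write up to cap elements of the range into out; return how many. */
size_t
skip_range_collect(const SkipRange *r, int *out, size_t cap)
{
    size_t n = 0;
    int    cur;
    int    limit;

    assert(r != NULL && out != NULL);
    cur = skip_range_first(r);
    limit = r->has_high ? r->high : INT_MAX;
    if (cur > limit)
        return 0;               /* contradictory bounds: empty range */
    while (n < cap)
    {
        out[n++] = cur;
        if (!skip_range_next(r, cur, &cur))
            break;
    }
    return n;
}
```

With bounds taken from "a BETWEEN 1 AND 2", the range holds exactly the
elements 1 and 2, matching the two primitive index scans mentioned
above.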

B-Tree preprocessing is optimistic that skipping will work out: it
applies static, generic rules when determining where to generate skip
arrays, on the assumption that the runtime overhead of maintaining skip
arrays will pay for itself -- or lead to only a modest performance loss.
As things stand, these assumptions are much too optimistic: skip array
maintenance will lead to unacceptable regressions with unsympathetic
queries (queries whose scan can't skip over many irrelevant leaf pages).
An upcoming commit will address the problems in this area by enhancing
_bt_readpage's approach to saving cycles on scan key evaluation, making
it work in a way that directly considers the needs of = array keys
(particularly = skip array keys).

Author: Peter Geoghegan <[email protected]>
Reviewed-By: Masahiro Ikeda <[email protected]>
Reviewed-By: Heikki Linnakangas <[email protected]>
Reviewed-By: Tomas Vondra <[email protected]>
Reviewed-By: Matthias van de Meent <[email protected]>
Reviewed-By: Aleksander Alekseev <[email protected]>
Reviewed-By: Alena Rybakina <[email protected]>
Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com