
Commit d819362

author
Commitfest Bot
committed
[CF 5081] v33 - nbtree skip scan
This branch was automatically generated by a robot using patches from an
email thread registered at:

https://commitfest.postgresql.org/patch/5081

The branch will be overwritten each time a new patch version is posted to
the thread, and also periodically to check for bitrot caused by changes
on the master branch.

Patch(es): https://www.postgresql.org/message-id/CAH2-WznK55udrFQm4umLjOmxcBh8k_5Ybxmt1_rXSYW9N8j64A@mail.gmail.com
Author(s): Peter Geoghegan
2 parents abe5622 + 055e60d commit d819362

35 files changed, +3840 -565 lines changed

doc/src/sgml/btree.sgml

Lines changed: 33 additions & 1 deletion
@@ -207,7 +207,7 @@

  <para>
  As shown in <xref linkend="xindex-btree-support-table"/>, btree defines
- one required and four optional support functions. The five
+ one required and five optional support functions. The six
  user-defined methods are:
  </para>
  <variablelist>
@@ -583,6 +583,38 @@ options(<replaceable>relopts</replaceable> <type>local_relopts *</type>) returns
  </para>
  </listitem>
  </varlistentry>
+ <varlistentry>
+ <term><function>skipsupport</function></term>
+ <listitem>
+ <para>
+ Optionally, a btree operator family may provide a <firstterm>skip
+ support</firstterm> function, registered under support function
+ number 6. These functions allow the B-tree code to more efficiently
+ navigate the index structure during an index skip scan. Operator classes
+ that implement skip support provide the core B-Tree code with a way of
+ enumerating and iterating through every possible value from the domain of
+ indexable values. The APIs involved in this are defined in
+ <filename>src/include/utils/skipsupport.h</filename>.
+ </para>
+ <para>
+ Operator classes that do not provide a skip support function are still
+ eligible to use skip scan. The core code can still use a fallback
+ strategy, though it might be somewhat less efficient with discrete types.
+ It usually doesn't make sense (and may not even be feasible) for operator
+ classes on continuous types to provide a skip support function.
+ </para>
+ <para>
+ It is not sensible for an operator family to register a cross-type
+ <function>skipsupport</function> function, and attempting to do so will
+ result in an error. This is because determining the next indexable value
+ from some earlier value does not just depend on sorting/equality
+ semantics, which are more or less defined at the operator family level.
+ Skip scan works by exhaustively considering every possible value that
+ might be stored in an index, so the domain of the particular data type
+ stored within the index (the input opclass type) must also be considered.
+ </para>
+ </listitem>
+ </varlistentry>
  </variablelist>

  </sect2>
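
As a rough illustration of the skipsupport paragraphs in this hunk, the standalone C sketch below models what a skip support routine for a discrete integer type could provide: the lowest and highest values in the domain, plus increment and decrement callbacks that report overflow or underflow. Every name in it (DemoSkipSupport, demo_int_increment, and so on) is hypothetical. It does not reproduce the declarations in src/include/utils/skipsupport.h, and it uses plain int where PostgreSQL would use datums.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-type state that a skip support routine might fill in.
 * Illustrative only; not copied from PostgreSQL's skipsupport.h.
 */
typedef struct DemoSkipSupport
{
    int low_elem;   /* lowest value in the type's domain */
    int high_elem;  /* highest value in the type's domain */
    int (*decrement)(int existing, bool *underflow);
    int (*increment)(int existing, bool *overflow);
} DemoSkipSupport;

/* Return the value just before "existing", or flag underflow at INT_MIN. */
static int
demo_int_decrement(int existing, bool *underflow)
{
    if (existing == INT_MIN)
    {
        *underflow = true;
        return existing;    /* caller must not use this value */
    }
    *underflow = false;
    return existing - 1;
}

/* Return the value just after "existing", or flag overflow at INT_MAX. */
static int
demo_int_increment(int existing, bool *overflow)
{
    if (existing == INT_MAX)
    {
        *overflow = true;
        return existing;    /* caller must not use this value */
    }
    *overflow = false;
    return existing + 1;
}

/* Fill in the hypothetical skip support state for a plain C int. */
static void
demo_int_skipsupport(DemoSkipSupport *sksup)
{
    assert(sksup != NULL);
    sksup->low_elem = INT_MIN;
    sksup->high_elem = INT_MAX;
    sksup->decrement = demo_int_decrement;
    sksup->increment = demo_int_increment;
}
```

In this model a caller treats a set overflow or underflow flag as "no further value exists in the domain" and ignores the returned value. A continuous type has no well-defined "next" value, which is in line with the patch's remark that skip support may not be feasible for such types.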

doc/src/sgml/indexam.sgml

Lines changed: 2 additions & 1 deletion
@@ -835,7 +835,8 @@ amrestrpos (IndexScanDesc scan);
  <para>
  <programlisting>
  Size
- amestimateparallelscan (int nkeys,
+ amestimateparallelscan (Relation indexRelation,
+ int nkeys,
  int norderbys);
  </programlisting>
  Estimate and return the number of bytes of dynamic shared memory which

doc/src/sgml/indices.sgml

Lines changed: 30 additions & 19 deletions
@@ -457,23 +457,26 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
  <para>
  A multicolumn B-tree index can be used with query conditions that
  involve any subset of the index's columns, but the index is most
- efficient when there are constraints on the leading (leftmost) columns.
- The exact rule is that equality constraints on leading columns, plus
- any inequality constraints on the first column that does not have an
- equality constraint, will be used to limit the portion of the index
- that is scanned. Constraints on columns to the right of these columns
- are checked in the index, so they save visits to the table proper, but
- they do not reduce the portion of the index that has to be scanned.
+ efficient when there are equality constraints on the leading (leftmost) columns.
+ B-Tree index scans can use the index skip scan strategy to generate
+ equality constraints on prefix columns that were wholly omitted from the
+ query predicate, as well as prefix columns whose values were constrained by
+ inequality conditions.
  For example, given an index on <literal>(a, b, c)</literal> and a
  query condition <literal>WHERE a = 5 AND b &gt;= 42 AND c &lt; 77</literal>,
  the index would have to be scanned from the first entry with
  <literal>a</literal> = 5 and <literal>b</literal> = 42 up through the last entry with
- <literal>a</literal> = 5. Index entries with <literal>c</literal> &gt;= 77 would be
- skipped, but they'd still have to be scanned through.
+ <literal>a</literal> = 5. Intervening groups of index entries with
+ <literal>c</literal> &gt;= 77 would not need to be returned by the scan,
+ and can be skipped over entirely by applying the skip scan strategy.
  This index could in principle be used for queries that have constraints
  on <literal>b</literal> and/or <literal>c</literal> with no constraint on <literal>a</literal>
- &mdash; but the entire index would have to be scanned, so in most cases
- the planner would prefer a sequential table scan over using the index.
+ &mdash; but that approach is generally only taken when there are so few
+ distinct <literal>a</literal> values that the planner expects the skip scan
+ strategy to allow the scan to skip over most individual index leaf pages.
+ If there are many distinct <literal>a</literal> values, then the entire
+ index will have to be scanned, so in most cases the planner will prefer a
+ sequential table scan over using the index.
  </para>

  <para>
@@ -508,11 +511,15 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
  </para>

  <para>
- Multicolumn indexes should be used sparingly. In most situations,
- an index on a single column is sufficient and saves space and time.
- Indexes with more than three columns are unlikely to be helpful
- unless the usage of the table is extremely stylized. See also
- <xref linkend="indexes-bitmap-scans"/> and
+ Multicolumn indexes should only be used when testing shows that they'll
+ offer a clear advantage over simply using multiple single column indexes.
+ Indexes with more than three columns can make sense, but only when most
+ queries that make use of later columns also make use of earlier prefix
+ columns. It's possible for B-Tree index scans to make use of <quote>skip
+ scan</quote> optimizations with queries that omit a low cardinality
+ leading prefix column, but this is usually much less efficient than a scan
+ of an index without the extra prefix column. See <xref
+ linkend="indexes-bitmap-scans"/> and
  <xref linkend="indexes-index-only-scans"/> for some discussion of the
  merits of different index configurations.
  </para>
@@ -669,9 +676,13 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
  multicolumn index on <literal>(x, y)</literal>. This index would typically be
  more efficient than index combination for queries involving both
  columns, but as discussed in <xref linkend="indexes-multicolumn"/>, it
- would be almost useless for queries involving only <literal>y</literal>, so it
- should not be the only index. A combination of the multicolumn index
- and a separate index on <literal>y</literal> would serve reasonably well. For
+ would be less useful for queries involving only <literal>y</literal>. Just
+ how useful might depend on how effective the B-Tree index skip scan
+ optimization is; if <literal>x</literal> has no more than several hundred
+ distinct values, skip scan will make searches for specific
+ <literal>y</literal> values execute reasonably efficiently. A combination
+ of a multicolumn index on <literal>(x, y)</literal> and a separate index on
+ <literal>y</literal> might also serve reasonably well. For
  queries involving only <literal>x</literal>, the multicolumn index could be
  used, though it would be larger and hence slower than an index on
  <literal>x</literal> alone. The last alternative is to create all three

doc/src/sgml/monitoring.sgml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4263,7 +4263,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
42634263
<replaceable>column_name</replaceable> =
42644264
<replaceable>value2</replaceable> ...</literal> construct, though only
42654265
when the optimizer transforms the construct into an equivalent
4266-
multi-valued array representation.
4266+
multi-valued array representation. Similarly, when B-Tree index scans use
4267+
the skip scan strategy, an index search is performed each time the scan is
4268+
repositioned to the next index leaf page that might have matching tuples.
42674269
</para>
42684270
</note>
42694271
<tip>

doc/src/sgml/perform.sgml

Lines changed: 31 additions & 0 deletions
@@ -860,6 +860,37 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 2, 3, 4);
  <structname>tenk1_thous_tenthous</structname> index leaf page.
  </para>

+ <para>
+ The <quote>Index Searches</quote> line is also useful with B-tree index
+ scans that apply the <firstterm>skip scan</firstterm> optimization to
+ more efficiently traverse through an index:
+ <screen>
+ EXPLAIN ANALYZE SELECT four, unique1 FROM tenk1 WHERE four BETWEEN 1 AND 3 AND unique1 = 42;
+ QUERY PLAN
+ -------------------------------------------------------------------&zwsp;---------------------------------------------------------------
+ Index Only Scan using tenk1_four_unique1_idx on tenk1 (cost=0.29..6.90 rows=1 width=8) (actual time=0.006..0.007 rows=1.00 loops=1)
+ Index Cond: ((four &gt;= 1) AND (four &lt;= 3) AND (unique1 = 42))
+ Heap Fetches: 0
+ Index Searches: 3
+ Buffers: shared hit=7
+ Planning Time: 0.029 ms
+ Execution Time: 0.012 ms
+ </screen>
+
+ Here we see an Index-Only Scan node using
+ <structname>tenk1_four_unique1_idx</structname>, a composite index on the
+ <structname>tenk1</structname> table's <structfield>four</structfield> and
+ <structfield>unique1</structfield> columns. The scan performs 3 searches
+ that each read a single index leaf page:
+ <quote><literal>four = 1 AND unique1 = 42</literal></quote>,
+ <quote><literal>four = 2 AND unique1 = 42</literal></quote>, and
+ <quote><literal>four = 3 AND unique1 = 42</literal></quote>. This index
+ is generally a good target for skip scan, since its leading column (the
+ <structfield>four</structfield> column) contains only 4 distinct values,
+ while its second/final column (the <structfield>unique1</structfield>
+ column) contains many distinct values.
+ </para>
+
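
The way a range on the leading column turns into a handful of separate searches can be sketched with a toy model: a sorted array of (four, unique1) pairs stands in for the index, and each binary search stands in for one index search. This is only a minimal sketch under those assumptions. The names DemoEntry, demo_lower_bound and demo_skip_scan are hypothetical, and the search accounting is a simplification of whatever nbtree actually does.

```c
#include <assert.h>
#include <stddef.h>

/* One toy index entry; the array is kept sorted by (four, unique1). */
typedef struct DemoEntry
{
    int four;
    int unique1;
} DemoEntry;

/* Position of the first entry >= (four, unique1), or n if there is none. */
static size_t
demo_lower_bound(const DemoEntry *idx, size_t n, int four, int unique1)
{
    size_t lo = 0;
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (idx[mid].four < four ||
            (idx[mid].four == four && idx[mid].unique1 < unique1))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Count entries matching:
 *     four BETWEEN four_lo AND four_hi AND unique1 = target
 * by probing one candidate "four" value at a time. *nsearches receives the
 * number of binary searches performed.
 */
static size_t
demo_skip_scan(const DemoEntry *idx, size_t n,
               int four_lo, int four_hi, int target, int *nsearches)
{
    size_t matches = 0;
    int four = four_lo;

    assert(nsearches != NULL);
    *nsearches = 0;

    while (four <= four_hi)
    {
        size_t pos = demo_lower_bound(idx, n, four, target);

        (*nsearches)++;
        if (pos == n)
            break;              /* ran off the end of the index */

        if (idx[pos].four > four)
        {
            /* No entries for this value: skip ahead to the next one stored */
            four = idx[pos].four;
            continue;
        }

        /* Same leading value: consume any exact matches */
        while (pos < n && idx[pos].four == four && idx[pos].unique1 == target)
        {
            matches++;
            pos++;
        }

        if (four == four_hi)
            break;              /* also avoids incrementing past INT_MAX */
        four++;
    }
    return matches;
}
```

With 100 entries where four equals unique1 modulo 4, probing for four between 1 and 3 and unique1 = 42 takes three binary searches in this model and finds one match. The cost grows with the number of distinct leading values rather than with the size of the index, which is why a low-cardinality leading column is described above as a good target.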
  <para>
  Another type of extra information is the number of rows removed by a
  filter condition:

doc/src/sgml/xindex.sgml

Lines changed: 13 additions & 3 deletions
@@ -461,6 +461,13 @@
  </entry>
  <entry>5</entry>
  </row>
+ <row>
+ <entry>
+ Return the addresses of C-callable skip support function(s)
+ (optional)
+ </entry>
+ <entry>6</entry>
+ </row>
  </tbody>
  </tgroup>
  </table>
@@ -1062,7 +1069,8 @@ DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS
  FUNCTION 1 btint8cmp(int8, int8) ,
  FUNCTION 2 btint8sortsupport(internal) ,
  FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) ,
- FUNCTION 4 btequalimage(oid) ;
+ FUNCTION 4 btequalimage(oid) ,
+ FUNCTION 6 btint8skipsupport(internal) ;

  CREATE OPERATOR CLASS int4_ops
  DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
@@ -1075,7 +1083,8 @@ DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
  FUNCTION 1 btint4cmp(int4, int4) ,
  FUNCTION 2 btint4sortsupport(internal) ,
  FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) ,
- FUNCTION 4 btequalimage(oid) ;
+ FUNCTION 4 btequalimage(oid) ,
+ FUNCTION 6 btint4skipsupport(internal) ;

  CREATE OPERATOR CLASS int2_ops
  DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
@@ -1088,7 +1097,8 @@ DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
  FUNCTION 1 btint2cmp(int2, int2) ,
  FUNCTION 2 btint2sortsupport(internal) ,
  FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) ,
- FUNCTION 4 btequalimage(oid) ;
+ FUNCTION 4 btequalimage(oid) ,
+ FUNCTION 6 btint2skipsupport(internal) ;

  ALTER OPERATOR FAMILY integer_ops USING btree ADD
  -- cross-type comparisons int8 vs int2

src/backend/access/index/indexam.c

Lines changed: 2 additions & 1 deletion
@@ -489,7 +489,8 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
  if (parallel_aware &&
  indexRelation->rd_indam->amestimateparallelscan != NULL)
  nbytes = add_size(nbytes,
- indexRelation->rd_indam->amestimateparallelscan(nkeys,
+ indexRelation->rd_indam->amestimateparallelscan(indexRelation,
+ nkeys,
  norderbys));

  return nbytes;
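
The shape of this callback change can be modelled in isolation: the estimate callback gains the index relation as its first argument, so an access method can size its shared state from properties of the index itself. The sketch below is a standalone approximation with hypothetical types (DemoRelation, DemoAmRoutine) and a made-up sizing formula. It does not use PostgreSQL headers, and the lines shown in this hunk do not say why nbtree needs the relation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an index relation. */
typedef struct DemoRelation
{
    int nkeyatts;               /* number of key columns in the index */
} DemoRelation;

/* New-style callback: the relation is passed ahead of nkeys/norderbys. */
typedef size_t (*demo_estimate_fn) (const DemoRelation *rel,
                                    int nkeys, int norderbys);

/* Hypothetical access method routine table with one optional callback. */
typedef struct DemoAmRoutine
{
    demo_estimate_fn amestimateparallelscan;    /* may be NULL */
} DemoAmRoutine;

/*
 * Made-up estimate: a fixed header plus one slot per index key column.
 * The numbers are illustrative, not nbtree's real accounting.
 */
static size_t
demo_bt_estimate(const DemoRelation *rel, int nkeys, int norderbys)
{
    (void) nkeys;
    (void) norderbys;
    return 64 + (size_t) rel->nkeyatts * sizeof(int);
}

/* Mirrors the control flow of the hunk: call the callback only if set. */
static size_t
demo_parallelscan_estimate(const DemoRelation *rel, const DemoAmRoutine *am,
                           int nkeys, int norderbys, int parallel_aware,
                           size_t nbytes)
{
    assert(rel != NULL && am != NULL);

    if (parallel_aware && am->amestimateparallelscan != NULL)
        nbytes += am->amestimateparallelscan(rel, nkeys, norderbys);

    return nbytes;
}
```

Passing the relation keeps the callback stateless: the estimate can depend on the index definition without the caller having to precompute anything on the access method's behalf.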
