Commit a0a9db4

petergeoghegan authored and Commitfest Bot committed
Apply low-order skip key in _bt_first more often.

Convert low_compare and high_compare nbtree skip array inequalities (with opclasses that offer skip support) such that _bt_first is consistently able to use later keys when descending the tree within _bt_first.

For example, an index qual "WHERE a > 5 AND b = 2" is now converted to "WHERE a >= 6 AND b = 2" by a new preprocessing step that takes place after a final low_compare and/or high_compare are chosen by all earlier preprocessing steps. That way the scan's initial call to _bt_first will use "WHERE a >= 6 AND b = 2" to find the initial leaf level position, rather than merely using "WHERE a > 5" -- "b = 2" can always be applied. This has a decent chance of making the scan avoid an extra _bt_first call that would otherwise be needed just to determine the lowest-sorting "a" value in the index (the lowest that still satisfies "WHERE a > 5").

The transformation process can only lower the total number of index pages read when the use of a more restrictive set of initial positioning keys in _bt_first actually allows the scan to land on some later leaf page directly, relative to the unoptimized case (or on an earlier leaf page directly, when scanning backwards).

The savings can be far greater when affected skip arrays come after some higher-order array. For example, a qual "WHERE x IN (1, 2, 3) AND y > 5 AND z = 2" can now save as many as 3 _bt_first calls as a result of these transformations (there can be as many as 1 _bt_first call saved per "x" array element).

Author: Peter Geoghegan <[email protected]>
Reviewed-By: Matthias van de Meent <[email protected]>
Discussion: https://postgr.es/m/CAH2-Wz=FJ78K3WsF3iWNxWnUCY9f=Jdg3QPxaXE=uYUbmuRz5Q@mail.gmail.com

1 parent a62d29e commit a0a9db4
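
To make the claimed saving concrete, here is a minimal sketch in C that models the index as a sorted array of (a, b) pairs and counts each binary search as one "descent". Everything in it (IndexTuple, descend, descend_a_gt, position_unoptimized, position_optimized) is hypothetical and is not nbtree code. It also assumes that every repositioning costs a full descent, which real scans can often avoid by continuing on the same leaf page.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for an index tuple on columns (a, b) */
typedef struct
{
    int a;
    int b;
} IndexTuple;

/* One "descent": binary search for the first tuple >= (a, b) */
size_t
descend(const IndexTuple *idx, size_t n, int a, int b, int *descents)
{
    size_t lo = 0, hi = n;

    (*descents)++;
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        const IndexTuple *t = &idx[mid];

        if (t->a < a || (t->a == a && t->b < b))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* One "descent" that can only use "a > bound": first tuple with a > bound */
size_t
descend_a_gt(const IndexTuple *idx, size_t n, int bound, int *descents)
{
    size_t lo = 0, hi = n;

    (*descents)++;
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (idx[mid].a <= bound)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Unoptimized: find the lowest "a" > bound first, then reposition on (a, b) */
size_t
position_unoptimized(const IndexTuple *idx, size_t n, int bound, int b,
                     int *descents)
{
    size_t pos = descend_a_gt(idx, n, bound, descents);

    if (pos == n || idx[pos].b >= b)
        return pos;             /* nothing qualifies, or already in position */
    return descend(idx, n, idx[pos].a, b, descents);
}

/* Optimized: treat "a > bound" as "a >= bound + 1" and use "b" right away */
size_t
position_optimized(const IndexTuple *idx, size_t n, int bound, int b,
                   int *descents)
{
    size_t pos;

    assert(bound < INT_MAX);    /* overflow would make the qual unsatisfiable */
    pos = descend(idx, n, bound + 1, b, descents);

    /* Done whenever bound + 1 really is the lowest qualifying "a" value */
    if (pos == n || idx[pos].a == bound + 1 || idx[pos].b >= b ||
        (pos > 0 && idx[pos - 1].a == bound + 1))
        return pos;
    return descend(idx, n, idx[pos].a, b, descents);
}
```

With the tuples (1,1) (5,1) (5,2) (6,1) (6,2) (6,3) (7,2) and the qual "a > 5 AND b = 2", the unoptimized function needs two searches and the optimized one needs a single search. When no tuple has a = 6, both need two searches, which is consistent with the commit message's point that the transformation can only help when the more restrictive positioning keys let the scan land on the right page directly.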

3 files changed: +211 -0 lines changed


src/backend/access/nbtree/nbtpreprocesskeys.c

Lines changed: 180 additions & 0 deletions
@@ -50,6 +50,12 @@ static bool _bt_saoparray_shrink(IndexScanDesc scan, ScanKey arraysk,
                                  BTArrayKeyInfo *array, bool *qual_ok);
 static bool _bt_skiparray_shrink(IndexScanDesc scan, ScanKey skey,
                                  BTArrayKeyInfo *array, bool *qual_ok);
+static void _bt_skiparray_strat_adjust(IndexScanDesc scan, ScanKey arraysk,
+                                       BTArrayKeyInfo *array);
+static void _bt_skiparray_strat_decrement(IndexScanDesc scan, ScanKey arraysk,
+                                          BTArrayKeyInfo *array);
+static void _bt_skiparray_strat_increment(IndexScanDesc scan, ScanKey arraysk,
+                                          BTArrayKeyInfo *array);
 static ScanKey _bt_preprocess_array_keys(IndexScanDesc scan, int *new_numberOfKeys);
 static void _bt_preprocess_array_keys_final(IndexScanDesc scan, int *keyDataMap);
 static int _bt_num_array_keys(IndexScanDesc scan, Oid *skip_eq_ops,
@@ -1295,6 +1301,171 @@ _bt_skiparray_shrink(IndexScanDesc scan, ScanKey skey, BTArrayKeyInfo *array,
     return true;
 }
 
+/*
+ * Applies the opfamily's skip support routine to convert the skip array's >
+ * low_compare key (if any) into a >= key, and to convert its < high_compare
+ * key (if any) into a <= key.  Decrements the high_compare key's sk_argument,
+ * and/or increments the low_compare key's sk_argument (also adjusts their
+ * operator strategies, while changing the operator as appropriate).
+ *
+ * This optional optimization reduces the number of descents required within
+ * _bt_first.  Whenever _bt_first is called with a skip array whose current
+ * array element is the sentinel value MINVAL, using a transformed >= key
+ * instead of using the original > key makes it safe to include lower-order
+ * scan keys in the insertion scan key (there must be lower-order scan keys
+ * after the skip array).  We will avoid an extra _bt_first to find the first
+ * value in the index > sk_argument -- at least when the first real matching
+ * value in the index happens to be an exact match for the sk_argument value
+ * that we produced here by incrementing the original input key's sk_argument.
+ * (Backwards scans derive the same benefit when they encounter the sentinel
+ * value MAXVAL, by converting the high_compare key from < to <=.)
+ *
+ * Note: The transformation is only correct when it cannot allow the scan to
+ * overlook matching tuples, but we don't have enough semantic information to
+ * safely make sure that can't happen during scans with cross-type operators.
+ * That's why we'll never apply the transformation in cross-type scenarios.
+ * For example, if we attempted to convert "sales_ts > '2024-01-01'::date"
+ * into "sales_ts >= '2024-01-02'::date" given a "sales_ts" attribute whose
+ * input opclass is timestamp_ops, the scan would overlook _all_ tuples for
+ * sales that fell on '2024-01-01'.
+ *
+ * Note: We can safely modify array->low_compare/array->high_compare in place
+ * because they just point to copies of our scan->keyData[] input scan keys
+ * (namely the copies returned by _bt_preprocess_array_keys to be used as
+ * input into the standard preprocessing steps in _bt_preprocess_keys).
+ * Everything will be reset if there's a rescan.
+ */
+static void
+_bt_skiparray_strat_adjust(IndexScanDesc scan, ScanKey arraysk,
+                           BTArrayKeyInfo *array)
+{
+    BTScanOpaque so = (BTScanOpaque) scan->opaque;
+    MemoryContext oldContext;
+
+    /*
+     * Called last among all preprocessing steps, when the skip array's final
+     * low_compare and high_compare have both been chosen
+     */
+    Assert(arraysk->sk_flags & SK_BT_SKIP);
+    Assert(array->num_elems == -1 && !array->null_elem && array->sksup);
+
+    oldContext = MemoryContextSwitchTo(so->arrayContext);
+
+    if (array->high_compare &&
+        array->high_compare->sk_strategy == BTLessStrategyNumber)
+        _bt_skiparray_strat_decrement(scan, arraysk, array);
+
+    if (array->low_compare &&
+        array->low_compare->sk_strategy == BTGreaterStrategyNumber)
+        _bt_skiparray_strat_increment(scan, arraysk, array);
+
+    MemoryContextSwitchTo(oldContext);
+}
+
+/*
+ * Convert skip array's < high_compare key into a <= key
+ */
+static void
+_bt_skiparray_strat_decrement(IndexScanDesc scan, ScanKey arraysk,
+                              BTArrayKeyInfo *array)
+{
+    Relation    rel = scan->indexRelation;
+    Oid         opfamily = rel->rd_opfamily[arraysk->sk_attno - 1],
+                opcintype = rel->rd_opcintype[arraysk->sk_attno - 1],
+                leop;
+    RegProcedure cmp_proc;
+    ScanKey     high_compare = array->high_compare;
+    Datum       orig_sk_argument = high_compare->sk_argument,
+                new_sk_argument;
+    bool        uflow;
+
+    Assert(high_compare->sk_strategy == BTLessStrategyNumber);
+
+    /*
+     * Only perform the transformation when the operator type matches the
+     * index attribute's input opclass type
+     */
+    if (high_compare->sk_subtype != opcintype &&
+        high_compare->sk_subtype != InvalidOid)
+        return;
+
+    /* Decrement, handling underflow by marking the qual unsatisfiable */
+    new_sk_argument = array->sksup->decrement(rel, orig_sk_argument, &uflow);
+    if (uflow)
+    {
+        BTScanOpaque so = (BTScanOpaque) scan->opaque;
+
+        so->qual_ok = false;
+        return;
+    }
+
+    /* Look up <= operator (might fail) */
+    leop = get_opfamily_member(opfamily, opcintype, opcintype,
+                               BTLessEqualStrategyNumber);
+    if (!OidIsValid(leop))
+        return;
+    cmp_proc = get_opcode(leop);
+    if (RegProcedureIsValid(cmp_proc))
+    {
+        /* Transform < high_compare key into <= key */
+        fmgr_info(cmp_proc, &high_compare->sk_func);
+        high_compare->sk_argument = new_sk_argument;
+        high_compare->sk_strategy = BTLessEqualStrategyNumber;
+    }
+}
+
+/*
+ * Convert skip array's > low_compare key into a >= key
+ */
+static void
+_bt_skiparray_strat_increment(IndexScanDesc scan, ScanKey arraysk,
+                              BTArrayKeyInfo *array)
+{
+    Relation    rel = scan->indexRelation;
+    Oid         opfamily = rel->rd_opfamily[arraysk->sk_attno - 1],
+                opcintype = rel->rd_opcintype[arraysk->sk_attno - 1],
+                geop;
+    RegProcedure cmp_proc;
+    ScanKey     low_compare = array->low_compare;
+    Datum       orig_sk_argument = low_compare->sk_argument,
+                new_sk_argument;
+    bool        oflow;
+
+    Assert(low_compare->sk_strategy == BTGreaterStrategyNumber);
+
+    /*
+     * Only perform the transformation when the operator type matches the
+     * index attribute's input opclass type
+     */
+    if (low_compare->sk_subtype != opcintype &&
+        low_compare->sk_subtype != InvalidOid)
+        return;
+
+    /* Increment, handling overflow by marking the qual unsatisfiable */
+    new_sk_argument = array->sksup->increment(rel, orig_sk_argument, &oflow);
+    if (oflow)
+    {
+        BTScanOpaque so = (BTScanOpaque) scan->opaque;
+
+        so->qual_ok = false;
+        return;
+    }
+
+    /* Look up >= operator (might fail) */
+    geop = get_opfamily_member(opfamily, opcintype, opcintype,
+                               BTGreaterEqualStrategyNumber);
+    if (!OidIsValid(geop))
+        return;
+    cmp_proc = get_opcode(geop);
+    if (RegProcedureIsValid(cmp_proc))
+    {
+        /* Transform > low_compare key into >= key */
+        fmgr_info(cmp_proc, &low_compare->sk_func);
+        low_compare->sk_argument = new_sk_argument;
+        low_compare->sk_strategy = BTGreaterEqualStrategyNumber;
+    }
+}
+
 /*
  * _bt_preprocess_array_keys() -- Preprocess SK_SEARCHARRAY scan keys
  *
@@ -1838,6 +2009,15 @@ _bt_preprocess_array_keys_final(IndexScanDesc scan, int *keyDataMap)
                 }
                 else
                 {
+                    /*
+                     * Any skip array low_compare and high_compare scan keys
+                     * are now final.  Transform the array's > low_compare key
+                     * into a >= key (and < high_compare keys into a <= key).
+                     */
+                    if (array->num_elems == -1 && array->sksup &&
+                        !array->null_elem)
+                        _bt_skiparray_strat_adjust(scan, outkey, array);
+
                     /* Match found, so done with this array */
                     arrayidx++;
                 }
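
The adjustment logic in the hunks above can be approximated outside PostgreSQL with a small, self-contained C sketch. It assumes a single int column and replaces the opfamily's skip support increment/decrement callbacks with plain integer arithmetic. The names SketchKey, Strategy, sketch_strat_adjust, and sketch_key_matches are hypothetical, and the same_type flag is only a stand-in for the sk_subtype check against the opclass input type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    STRAT_LT,
    STRAT_LE,
    STRAT_GE,
    STRAT_GT
} Strategy;

/* Hypothetical, simplified scan key: one inequality on an int column */
typedef struct
{
    Strategy strategy;
    int      argument;
    bool     same_type;     /* operator's right-hand type matches the column */
} SketchKey;

/*
 * Convert a strict inequality into its non-strict equivalent in place.
 * Returns false when stepping the argument would overflow or underflow,
 * which means that no value can satisfy the original key.
 */
bool
sketch_strat_adjust(SketchKey *key)
{
    assert(key != NULL);

    /* Cross-type keys are left alone: stepping the argument may be unsafe */
    if (!key->same_type)
        return true;

    if (key->strategy == STRAT_GT)
    {
        if (key->argument == INT_MAX)
            return false;       /* "x > INT_MAX" is unsatisfiable */
        key->argument++;
        key->strategy = STRAT_GE;
    }
    else if (key->strategy == STRAT_LT)
    {
        if (key->argument == INT_MIN)
            return false;       /* "x < INT_MIN" is unsatisfiable */
        key->argument--;
        key->strategy = STRAT_LE;
    }
    return true;
}

/* Evaluate the key against a column value */
bool
sketch_key_matches(const SketchKey *key, int value)
{
    switch (key->strategy)
    {
        case STRAT_LT:
            return value < key->argument;
        case STRAT_LE:
            return value <= key->argument;
        case STRAT_GE:
            return value >= key->argument;
        case STRAT_GT:
            return value > key->argument;
    }
    return false;
}
```

In this sketch a key such as "x > 5" becomes "x >= 6" and still rejects 5 while accepting 6, a key such as "x > INT_MAX" reports the qual as unsatisfiable, and a key flagged as cross-type is returned unchanged. The real code additionally has to look up the >= or <= operator in the opfamily and gives up quietly if that lookup fails.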

src/test/regress/expected/create_index.out

Lines changed: 21 additions & 0 deletions
@@ -2589,6 +2589,27 @@ ORDER BY thousand;
         1 |     1001
 (1 row)
 
+-- Skip array preprocessing increments "thousand > -1" to "thousand >= 0"
+explain (costs off)
+SELECT thousand, tenthous FROM tenk1
+WHERE thousand > -1 AND tenthous IN (1001,3000)
+ORDER BY thousand limit 2;
+                                           QUERY PLAN
+--------------------------------------------------------------------------------------------------
+ Limit
+   ->  Index Only Scan using tenk1_thous_tenthous on tenk1
+         Index Cond: ((thousand > '-1'::integer) AND (tenthous = ANY ('{1001,3000}'::integer[])))
+(3 rows)
+
+SELECT thousand, tenthous FROM tenk1
+WHERE thousand > -1 AND tenthous IN (1001,3000)
+ORDER BY thousand limit 2;
+ thousand | tenthous
+----------+----------
+        0 |     3000
+        1 |     1001
+(2 rows)
+
 --
 -- Check elimination of constant-NULL subexpressions
 --

src/test/regress/sql/create_index.sql

Lines changed: 10 additions & 0 deletions
@@ -993,6 +993,16 @@ SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 3 and thousand <= 2 AND tenthous = 1001
 ORDER BY thousand;
 
+-- Skip array preprocessing increments "thousand > -1" to "thousand >= 0"
+explain (costs off)
+SELECT thousand, tenthous FROM tenk1
+WHERE thousand > -1 AND tenthous IN (1001,3000)
+ORDER BY thousand limit 2;
+
+SELECT thousand, tenthous FROM tenk1
+WHERE thousand > -1 AND tenthous IN (1001,3000)
+ORDER BY thousand limit 2;
+
 --
 -- Check elimination of constant-NULL subexpressions
 --
