Skip to content

Commit 18d2614

Browse files
author
Richard Guo
committed
Fix pushdown of degenerate HAVING clauses
67a54b9 taught the planner to push down HAVING clauses even when grouping sets are present, as long as the clause does not reference any columns that are nullable by the grouping sets. However, there was an oversight: if any empty grouping sets are present, the aggregation node can produce a row that did not come from the input, and pushing down a HAVING clause in this case may cause us to fail to filter out that row. Currently, non-degenerate HAVING clauses are not pushed down when empty grouping sets are present, since the empty grouping sets would nullify the vars they reference. However, degenerate (variable-free) HAVING clauses are not subject to this restriction and may be incorrectly pushed down. To fix, explicitly check for the presence of empty grouping sets and retain degenerate clauses in HAVING when they are present. This ensures that we don't emit a bogus aggregated row. A copy of each such clause is also put in WHERE so that query_planner() can use it in a gating Result node. To facilitate this check, this patch expands the groupingSets tree of the query to a flat list of grouping sets before applying the HAVING pushdown optimization. This does not add any additional planning overhead, since we need to do this expansion anyway. In passing, make a small tweak to preprocess_grouping_sets() by reordering its initial operations a bit. Backpatch to v18, where this issue was introduced. Reported-by: Yuhang Qiu <[email protected]> Author: Richard Guo <[email protected]> Author: Tom Lane <[email protected]> Discussion: https://postgr.es/m/[email protected] Backpatch-through: 18
1 parent 29b039e commit 18d2614

File tree

3 files changed

+143
-26
lines changed

3 files changed

+143
-26
lines changed

src/backend/optimizer/plan/planner.c

Lines changed: 43 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -1127,15 +1127,28 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
11271127
if (parse->hasTargetSRFs)
11281128
parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList);
11291129

1130+
/*
1131+
* If we have grouping sets, expand the groupingSets tree of this query to
1132+
* a flat list of grouping sets. We need to do this before optimizing
1133+
* HAVING, since we can't easily tell if there's an empty grouping set
1134+
* until we have this representation.
1135+
*/
1136+
if (parse->groupingSets)
1137+
{
1138+
parse->groupingSets =
1139+
expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
1140+
}
1141+
11301142
/*
11311143
* In some cases we may want to transfer a HAVING clause into WHERE. We
11321144
* cannot do so if the HAVING clause contains aggregates (obviously) or
11331145
* volatile functions (since a HAVING clause is supposed to be executed
1134-
* only once per group). We also can't do this if there are any nonempty
1135-
* grouping sets and the clause references any columns that are nullable
1136-
* by the grouping sets; moving such a clause into WHERE would potentially
1137-
* change the results. (If there are only empty grouping sets, then the
1138-
* HAVING clause must be degenerate as discussed below.)
1146+
* only once per group). We also can't do this if there are any grouping
1147+
* sets and the clause references any columns that are nullable by the
1148+
* grouping sets; the nulled values of those columns are not available
1149+
* before the grouping step. (The test on groupClause might seem wrong,
1150+
* but it's okay: it's just an optimization to avoid running pull_varnos
1151+
* when there cannot be any Vars in the HAVING clause.)
11391152
*
11401153
* Also, it may be that the clause is so expensive to execute that we're
11411154
* better off doing it only once per group, despite the loss of
@@ -1145,19 +1158,19 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
11451158
* clause into WHERE, in hopes of eliminating tuples before aggregation
11461159
* instead of after.
11471160
*
1148-
* If the query has explicit grouping then we can simply move such a
1161+
* If the query has no empty grouping set then we can simply move such a
11491162
* clause into WHERE; any group that fails the clause will not be in the
11501163
* output because none of its tuples will reach the grouping or
1151-
* aggregation stage. Otherwise we must have a degenerate (variable-free)
1152-
* HAVING clause, which we put in WHERE so that query_planner() can use it
1153-
* in a gating Result node, but also keep in HAVING to ensure that we
1154-
* don't emit a bogus aggregated row. (This could be done better, but it
1155-
* seems not worth optimizing.)
1164+
* aggregation stage. Otherwise we have to keep the clause in HAVING to
1165+
* ensure that we don't emit a bogus aggregated row. But then the HAVING
1166+
* clause must be degenerate (variable-free), so we can copy it into WHERE
1167+
* so that query_planner() can use it in a gating Result node. (This could
1168+
* be done better, but it seems not worth optimizing.)
11561169
*
11571170
* Note that a HAVING clause may contain expressions that are not fully
11581171
* preprocessed. This can happen if these expressions are part of
11591172
* grouping items. In such cases, they are replaced with GROUP Vars in
1160-
* the parser and then replaced back after we've done with expression
1173+
* the parser and then replaced back after we're done with expression
11611174
* preprocessing on havingQual. This is not an issue if the clause
11621175
* remains in HAVING, because these expressions will be matched to lower
11631176
* target items in setrefs.c. However, if the clause is moved or copied
@@ -1182,8 +1195,11 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
11821195
/* keep it in HAVING */
11831196
newHaving = lappend(newHaving, havingclause);
11841197
}
1185-
else if (parse->groupClause)
1198+
else if (parse->groupClause &&
1199+
(parse->groupingSets == NIL ||
1200+
(List *) linitial(parse->groupingSets) != NIL))
11861201
{
1202+
/* There is GROUP BY, but no empty grouping set */
11871203
Node *whereclause;
11881204

11891205
/* Preprocess the HAVING clause fully */
@@ -1196,6 +1212,7 @@ subquery_planner(PlannerGlobal *glob, Query *parse, char *plan_name,
11961212
}
11971213
else
11981214
{
1215+
/* There is an empty grouping set (perhaps implicitly) */
11991216
Node *whereclause;
12001217

12011218
/* Preprocess the HAVING clause fully */
@@ -2181,10 +2198,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction,
21812198
}
21822199

21832200
/*
2184-
* Do preprocessing for groupingSets clause and related data. This handles the
2185-
* preliminary steps of expanding the grouping sets, organizing them into lists
2186-
* of rollups, and preparing annotations which will later be filled in with
2187-
* size estimates.
2201+
* Do preprocessing for groupingSets clause and related data.
2202+
*
2203+
* We expect that parse->groupingSets has already been expanded into a flat
2204+
* list of grouping sets (that is, just integer Lists of ressortgroupref
2205+
* numbers) by expand_grouping_sets(). This function handles the preliminary
2206+
* steps of organizing the grouping sets into lists of rollups, and preparing
2207+
* annotations which will later be filled in with size estimates.
21882208
*/
21892209
static grouping_sets_data *
21902210
preprocess_grouping_sets(PlannerInfo *root)
@@ -2195,19 +2215,18 @@ preprocess_grouping_sets(PlannerInfo *root)
21952215
ListCell *lc_set;
21962216
grouping_sets_data *gd = palloc0(sizeof(grouping_sets_data));
21972217

2198-
parse->groupingSets = expand_grouping_sets(parse->groupingSets, parse->groupDistinct, -1);
2199-
2200-
gd->any_hashable = false;
2201-
gd->unhashable_refs = NULL;
2202-
gd->unsortable_refs = NULL;
2203-
gd->unsortable_sets = NIL;
2204-
22052218
/*
22062219
* We don't currently make any attempt to optimize the groupClause when
22072220
* there are grouping sets, so just duplicate it in processed_groupClause.
22082221
*/
22092222
root->processed_groupClause = parse->groupClause;
22102223

2224+
/* Detect unhashable and unsortable grouping expressions */
2225+
gd->any_hashable = false;
2226+
gd->unhashable_refs = NULL;
2227+
gd->unsortable_refs = NULL;
2228+
gd->unsortable_sets = NIL;
2229+
22112230
if (parse->groupClause)
22122231
{
22132232
ListCell *lc;

src/test/regress/expected/groupingsets.out

Lines changed: 81 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -890,7 +890,8 @@ explain (costs off)
890890
-> Seq Scan on gstest2
891891
(10 rows)
892892

893-
-- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
893+
-- test pushdown of non-degenerate HAVING clause that does not reference any
894+
-- columns that are nullable by grouping sets
894895
explain (costs off)
895896
select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
896897
QUERY PLAN
@@ -911,6 +912,85 @@ select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a
911912
2 | 2 | 1
912913
(1 row)
913914

915+
explain (costs off)
916+
select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
917+
QUERY PLAN
918+
---------------------------------
919+
GroupAggregate
920+
Group Key: b, a
921+
Group Key: b
922+
-> Sort
923+
Sort Key: b, a
924+
-> Seq Scan on gstest2
925+
Filter: (b > 1)
926+
(7 rows)
927+
928+
select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
929+
a | b | count
930+
---+---+-------
931+
1 | 2 | 1
932+
2 | 2 | 1
933+
| 2 | 2
934+
(3 rows)
935+
936+
-- test pushdown of degenerate HAVING clause
937+
explain (costs off)
938+
select count(*) from gstest2 group by grouping sets (()) having false;
939+
QUERY PLAN
940+
-----------------------------------
941+
Aggregate
942+
Group Key: ()
943+
Filter: false
944+
-> Result
945+
Replaces: Scan on gstest2
946+
One-Time Filter: false
947+
(6 rows)
948+
949+
select count(*) from gstest2 group by grouping sets (()) having false;
950+
count
951+
-------
952+
(0 rows)
953+
954+
explain (costs off)
955+
select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
956+
QUERY PLAN
957+
-----------------------------------------
958+
GroupAggregate
959+
Group Key: a
960+
Group Key: ()
961+
Filter: false
962+
-> Sort
963+
Sort Key: a
964+
-> Result
965+
Replaces: Scan on gstest2
966+
One-Time Filter: false
967+
(9 rows)
968+
969+
select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
970+
a | count
971+
---+-------
972+
(0 rows)
973+
974+
explain (costs off)
975+
select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
976+
QUERY PLAN
977+
-----------------------------------------
978+
GroupAggregate
979+
Group Key: a
980+
Sort Key: b
981+
Group Key: b
982+
-> Sort
983+
Sort Key: a
984+
-> Result
985+
Replaces: Scan on gstest2
986+
One-Time Filter: false
987+
(9 rows)
988+
989+
select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
990+
a | b | count
991+
---+---+-------
992+
(0 rows)
993+
914994
-- HAVING with GROUPING queries
915995
select ten, grouping(ten) from onek
916996
group by grouping sets(ten) having grouping(ten) >= 0

src/test/regress/sql/groupingsets.sql

Lines changed: 19 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -290,11 +290,29 @@ explain (costs off)
290290
select v.c, (select count(*) from gstest2 group by () having v.c)
291291
from (values (false),(true)) v(c) order by v.c;
292292

293-
-- test pushdown of HAVING clause that does not reference any columns that are nullable by grouping sets
293+
-- test pushdown of non-degenerate HAVING clause that does not reference any
294+
-- columns that are nullable by grouping sets
294295
explain (costs off)
295296
select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
296297
select a, b, count(*) from gstest2 group by grouping sets ((a, b), (a)) having a > 1 and b > 1;
297298

299+
explain (costs off)
300+
select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
301+
select a, b, count(*) from gstest2 group by rollup(a), b having b > 1;
302+
303+
-- test pushdown of degenerate HAVING clause
304+
explain (costs off)
305+
select count(*) from gstest2 group by grouping sets (()) having false;
306+
select count(*) from gstest2 group by grouping sets (()) having false;
307+
308+
explain (costs off)
309+
select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
310+
select a, count(*) from gstest2 group by grouping sets ((a), ()) having false;
311+
312+
explain (costs off)
313+
select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
314+
select a, b, count(*) from gstest2 group by grouping sets ((a), (b)) having false;
315+
298316
-- HAVING with GROUPING queries
299317
select ten, grouping(ten) from onek
300318
group by grouping sets(ten) having grouping(ten) >= 0

0 commit comments

Comments
 (0)