Partitioned index can be not dumped

Lists: pgsql-hackers
From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Partitioned index can be not dumped
Date: 2021-06-30 14:26:08
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi.

I've seen the following effect on PostgreSQL 14 stable branch.
Index, created on partitioned table, disappears from pg_dump or psql \d
output.
This seems to begin after analyze. Partitoned relation relhasindex
pg_class field suddenly becomes false.

The issue happens after

commit 0e69f705cc1a3df273b38c9883fb5765991e04fe (HEAD, refs/bisect/bad)
Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Date: Fri Apr 9 11:29:08 2021 -0400

Set pg_class.reltuples for partitioned tables

When commit 0827e8af70f4 added auto-analyze support for partitioned
tables, it included code to obtain reltuples for the partitioned
table
as a number of catalog accesses to read pg_class.reltuples for each
partition. That's not only very inefficient, but also problematic
because autovacuum doesn't hold any locks on any of those tables --
and
doesn't want to. Replace that code with a read of
pg_class.reltuples
for the partitioned table, and make sure ANALYZE and TRUNCATE
properly
maintain that value.

I found no code that would be affected by the change of relpages
from
zero to non-zero for partitioned tables, and no other code that
should
be maintaining it, but if there is, hopefully it'll be an easy fix.

Per buildfarm.

Author: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Reviewed-by: Zhihong Yu <zyu(at)yugabyte(dot)com>

It seems that in this commit we unconditionally overwrite this data with
0.
I've tried to fix it by getting this information when inh is true and
ignoring nindexes when inh is not true.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment Content-Type Size
0001-Set-relhasindex-for-partitioned-tables-correctly.patch text/x-diff 6.5 KB

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 14:33:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Alexander Pyhalov писал 2021-06-30 17:26:
> Hi.
>
>

Sorry, test had an issue.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment Content-Type Size
0001-Set-relhasindex-for-partitioned-tables-correctly-v2.patch text/x-diff 6.5 KB

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 14:44:13
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-Jun-30, Alexander Pyhalov wrote:

> Hi.
>
> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.

Uh, ouch.

I'll look into this shortly.

--
Álvaro Herrera Valdivia, Chile
https://www.EnterpriseDB.com/


From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 18:54:09
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-Jun-30, Alexander Pyhalov wrote:

> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.

Yeah, that seems correct.

I didn't verify your test case, but after looking at the code I thought
there was a bit too much churn and the new conditions looked quite messy
and unexplained. It seems simpler to be explicit at the start about
what we're doing, and keep nindexes=0 for partitioned tables; with that,
the code works unchanged because the "for" loops do nothing without
having to check for anything. My proposal is attached.

I did run the tests and they do pass, but I didn't look very closely at
what the tests are actually doing.

I noticed that part of that comment seems to be a leftover from ... I
don't know when: "We do not analyze index columns if there was an
explicit column list in the ANALYZE command, however." I suppose this
is about some code that was removed, but I didn't dig into it.

--
Álvaro Herrera Valdivia, Chile
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)

Attachment Content-Type Size
v3-0001-Set-relhasindex-for-partitioned-tables-correctly.patch text/x-diff 5.1 KB

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 19:26:20
Message-ID: CALNJ-vS1R3Qoe5t4tbzxrkpBtzRbPq1dDcW4RmA_a+oqweF30w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 30, 2021 at 11:54 AM Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> On 2021-Jun-30, Alexander Pyhalov wrote:
>
> > I've seen the following effect on PostgreSQL 14 stable branch.
> > Index, created on partitioned table, disappears from pg_dump or psql \d
> > output.
> > This seems to begin after analyze. Partitoned relation relhasindex
> pg_class
> > field suddenly becomes false.
>
> Yeah, that seems correct.
>
> I didn't verify your test case, but after looking at the code I thought
> there was a bit too much churn and the new conditions looked quite messy
> and unexplained. It seems simpler to be explicit at the start about
> what we're doing, and keep nindexes=0 for partitioned tables; with that,
> the code works unchanged because the "for" loops do nothing without
> having to check for anything. My proposal is attached.
>
> I did run the tests and they do pass, but I didn't look very closely at
> what the tests are actually doing.
>
> I noticed that part of that comment seems to be a leftover from ... I
> don't know when: "We do not analyze index columns if there was an
> explicit column list in the ANALYZE command, however." I suppose this
> is about some code that was removed, but I didn't dig into it.
>
> --
> Álvaro Herrera Valdivia, Chile
> "How strange it is to find the words "Perl" and "saner" in such close
> proximity, with no apparent sense of irony. I doubt that Larry himself
> could have managed it." (ncm, http://lwn.net/Articles/174769/)

Hi,
nit:
- if (hasindex)
+ if (nindexes > 0)

It seems hasindex is no longer needed since nindexes is checked.

Cheers


From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 19:28:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Álvaro Herrera писал 2021-06-30 21:54:
> On 2021-Jun-30, Alexander Pyhalov wrote:
>
>> I've seen the following effect on PostgreSQL 14 stable branch.
>> Index, created on partitioned table, disappears from pg_dump or psql
>> \d
>> output.
>> This seems to begin after analyze. Partitoned relation relhasindex
>> pg_class
>> field suddenly becomes false.
>
> Yeah, that seems correct.
>
> I didn't verify your test case, but after looking at the code I thought
> there was a bit too much churn and the new conditions looked quite
> messy
> and unexplained. It seems simpler to be explicit at the start about
> what we're doing, and keep nindexes=0 for partitioned tables; with
> that,
> the code works unchanged because the "for" loops do nothing without
> having to check for anything. My proposal is attached.
>
> I did run the tests and they do pass, but I didn't look very closely at
> what the tests are actually doing.
>
> I noticed that part of that comment seems to be a leftover from ... I
> don't know when: "We do not analyze index columns if there was an
> explicit column list in the ANALYZE command, however." I suppose this
> is about some code that was removed, but I didn't dig into it.

Looks good. It seems this comment refers to line 455.

445 if (nindexes > 0)
446 {
447 indexdata = (AnlIndexData *) palloc0(nindexes *
sizeof(AnlIndexData));
448 for (ind = 0; ind < nindexes; ind++)
449 {
450 AnlIndexData *thisdata = &indexdata[ind];
451 IndexInfo *indexInfo;
452
453 thisdata->indexInfo = indexInfo =
BuildIndexInfo(Irel[ind]);
454 thisdata->tupleFract = 1.0; /* fix later if partial */
455 if (indexInfo->ii_Expressions != NIL && va_cols == NIL)
456 {
457 ListCell *indexpr_item =
list_head(indexInfo->ii_Expressions);
458
459 thisdata->vacattrstats = (VacAttrStats **)
460 palloc(indexInfo->ii_NumIndexAttrs *
sizeof(VacAttrStats *));

Also I've added non-necessary new line in test.
Restored comment and removed new line.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment Content-Type Size
v4-0001-Set-relhasindex-for-partitioned-tables-correctly.patch text/x-diff 4.8 KB

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 21:32:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-Jun-30, Zhihong Yu wrote:

> Hi,
> nit:
> - if (hasindex)
> + if (nindexes > 0)
>
> It seems hasindex is no longer needed since nindexes is checked.

It's still used to call vac_update_relstats(). We want nindexes to be 0
for partitioned tables, but still pass true when there are indexes.

Please don't forget to trim the text of the email you're replying to.

--
Álvaro Herrera Valdivia, Chile
https://www.EnterpriseDB.com/


From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 21:56:17
Message-ID: CALNJ-vRZxGfz-3u0rWELJ496A4tPKLV6hw1YJEB=N=0Z+7Q0GA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 30, 2021 at 2:32 PM Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> On 2021-Jun-30, Zhihong Yu wrote:
>
> > Hi,
> > nit:
> > - if (hasindex)
> > + if (nindexes > 0)
> >
> > It seems hasindex is no longer needed since nindexes is checked.
>
> It's still used to call vac_update_relstats(). We want nindexes to be 0
> for partitioned tables, but still pass true when there are indexes.
>
Hi,
In that case, I wonder whether nindexes can be negated following the call
to vac_open_indexes().

vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
+ nindexes = -nindexes;

That way, hasindex can be dropped.
vac_update_relstats() call would become:

vac_update_relstats(onerel, -1, totalrows,
- 0, false, InvalidTransactionId,
+ 0, nindexes != 0, InvalidTransactionId,

My thinking is that without hasindex, the code is easier to maintain.

Thanks

> Please don't forget to trim the text of the email you're replying to.
>
> --
> Álvaro Herrera Valdivia, Chile
> https://www.EnterpriseDB.com/
>


From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned index can be not dumped
Date: 2021-06-30 21:57:36
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-Jun-30, Zhihong Yu wrote:

> Hi,
> In that case, I wonder whether nindexes can be negated following the call
> to vac_open_indexes().
>
> vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
> + nindexes = -nindexes;
>
> That way, hasindex can be dropped.
> vac_update_relstats() call would become:
>
> vac_update_relstats(onerel, -1, totalrows,
> - 0, false, InvalidTransactionId,
> + 0, nindexes != 0, InvalidTransactionId,

Perhaps this works, but I don't think it's a readability improvement.

> My thinking is that without hasindex, the code is easier to maintain.

You have one less variable but one additional concept (negative
nindexes). It doesn't seem easier to me, TBH, rather the opposite.

--
Álvaro Herrera 39°49'30"S 73°17'W
https://www.EnterpriseDB.com/


From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Partitioned index can be not dumped
Date: 2021-07-01 17:01:19
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-Jun-30, Alexander Pyhalov wrote:

> Looks good. It seems this comment refers to line 455.

Ah, so it does.

I realized that we don't need to do vac_open_indexes for partitioned
tables: it is sufficient to know whether there are any indexes at all.
So I replaced that with RelationGetIndexList() and checking if the list
is nonempty, specifically for the partitioned table case. Pushed now.

Thanks for reporting and fixing this, and to Zhihong Yu for reviewing.

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)