Commit 1a80b7d

danielgustafsson, tvondra authored and committed
Online enabling and disabling of data checksums
This allows data checksums to be enabled, or disabled, in a running cluster without restricting access to the cluster during processing.

Previously, data checksums could only be enabled during initdb or when the cluster is offline using the pg_checksums app. This commit introduces functionality to enable, or disable, data checksums while the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a dynamic per-database background worker which will mark all buffers dirty for all relations with storage in order for them to have data checksums calculated on write. Once all relations in all databases have been processed, the data_checksums state will be set to on and the cluster will at that point be identical to one which had data checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations from backends other than the data checksums worker will write the checksums but not verify them on reading. Only when all backends have absorbed the procsignalbarrier for setting data_checksums to on will they also start verifying checksums on reading. The same process is repeated during disabling; all backends write checksums but do not verify them until the barrier for setting the state to off has been absorbed by all. This in-progress state is used to ensure there are no false negatives (or positives) due to reading a checksum which is not in sync with the page.

A new test module, test_checksums, is introduced with an extensive set of tests covering both online and offline data checksum mode changes. The tests for online processing are gated behind the PG_TEST_EXTRA flag to some degree due to being very time consuming to run.

This work is based on an earlier version of this patch which was reviewed by, among others, Heikki Linnakangas, Robert Haas, Andres Freund, Tomas Vondra, Michael Banck and Andrey Borodin. During the work on this new version, Tomas Vondra has given invaluable assistance with not only coding and reviewing but also very in-depth testing.

Author: Daniel Gustafsson <[email protected]>
Author: Magnus Hagander <[email protected]>
Co-authored-by: Tomas Vondra <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
1 parent a4fd971 commit 1a80b7d
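
As a reading aid, the write/verify rules described in the commit message can be modelled in a few lines of Python. This is only a sketch of the behaviour stated above, not code from the patch: the four state names come from the documentation changes in this commit, while the enum, the helper functions and the two path lists are hypothetical names introduced here for illustration.

```python
from enum import Enum


class ChecksumState(Enum):
    """Data checksum modes named in the documentation changes of this commit."""
    OFF = "off"
    INPROGRESS_ON = "inprogress-on"
    ON = "on"
    INPROGRESS_OFF = "inprogress-off"


def writes_checksums(state: ChecksumState) -> bool:
    """Checksums are calculated on write in every state except 'off'."""
    return state is not ChecksumState.OFF


def verifies_checksums(state: ChecksumState) -> bool:
    """Checksums are verified on read only once the state is fully 'on'."""
    return state is ChecksumState.ON


# The two sequences described in the commit message. In the real system a
# backend only moves to the next state after absorbing the matching barrier;
# that synchronisation is not modelled here.
ENABLE_PATH = [ChecksumState.OFF, ChecksumState.INPROGRESS_ON, ChecksumState.ON]
DISABLE_PATH = [ChecksumState.ON, ChecksumState.INPROGRESS_OFF, ChecksumState.OFF]

if __name__ == "__main__":
    for state in ENABLE_PATH + DISABLE_PATH[1:]:
        print(f"{state.value:15} write={writes_checksums(state)} "
              f"verify={verifies_checksums(state)}")
```

The point of the two in-progress states is visible in the model: there is never a moment at which one backend verifies checksums while another has stopped writing them.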

70 files changed, +4815 -47 lines changed

doc/src/sgml/func/func-admin.sgml

Lines changed: 71 additions & 0 deletions
@@ -2979,4 +2979,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');

 </sect2>

+<sect2 id="functions-admin-checksum">
+<title>Data Checksum Functions</title>
+
+<para>
+The functions shown in <xref linkend="functions-checksums-table" /> can
+be used to enable or disable data checksums in a running cluster.
+See <xref linkend="checksums" /> for details.
+</para>
+
+<table id="functions-checksums-table">
+<title>Data Checksum Functions</title>
+<tgroup cols="1">
+<thead>
+<row>
+<entry role="func_table_entry"><para role="func_signature">
+Function
+</para>
+<para>
+Description
+</para></entry>
+</row>
+</thead>
+
+<tbody>
+<row>
+<entry role="func_table_entry"><para role="func_signature">
+<indexterm>
+<primary>pg_enable_data_checksums</primary>
+</indexterm>
+<function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+<returnvalue>void</returnvalue>
+</para>
+<para>
+Initiates data checksums for the cluster. This will switch the data
+checksums mode to <literal>inprogress-on</literal> as well as start a
+background worker that will process all pages in the database and
+enable checksums on them. When all data pages have had checksums
+enabled, the cluster will automatically switch data checksums mode to
+<literal>on</literal>.
+</para>
+<para>
+If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+specified, the speed of the process is throttled using the same principles as
+<link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+</para></entry>
+</row>
+
+<row>
+<entry role="func_table_entry"><para role="func_signature">
+<indexterm>
+<primary>pg_disable_data_checksums</primary>
+</indexterm>
+<function>pg_disable_data_checksums</function> ()
+<returnvalue>void</returnvalue>
+</para>
+<para>
+Disables data checksum validation and calculation for the cluster. This
+will switch the data checksum mode to <literal>inprogress-off</literal>
+while data checksums are being disabled. When all active backends have
+stopped validating data checksums, the data checksum mode will be
+changed to <literal>off</literal>. At this point the data pages will
+still have checksums recorded but they are not updated when pages are
+modified.
+</para></entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+</sect2>
+
 </sect1>
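
The documentation above says that `cost_delay` and `cost_limit` throttle the worker "using the same principles as Cost-based Vacuum Delay". A minimal Python sketch of that principle follows: work accumulates a cost balance, and whenever the balance reaches the limit the process naps for the delay and starts counting again. This is an illustration under stated assumptions, not the worker's actual accounting: the function name, the `cost_per_block` value and the injectable `sleep` parameter are all hypothetical, and the visible part of the diff does not show how the real worker charges cost per page.

```python
import time
from typing import Callable


def process_with_throttle(
    n_blocks: int,
    cost_limit: int,
    cost_delay_ms: float,
    cost_per_block: int = 1,  # hypothetical cost charged for each block
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process n_blocks, napping whenever the accumulated cost reaches cost_limit.

    Returns the number of naps taken. A cost_delay_ms of 0 disables throttling.
    """
    balance = 0
    naps = 0
    for _ in range(n_blocks):
        # ... the real worker would mark one buffer dirty here ...
        balance += cost_per_block
        if cost_delay_ms > 0 and balance >= cost_limit:
            sleep(cost_delay_ms / 1000.0)  # nap for the configured delay
            naps += 1
            balance = 0                    # start accumulating cost again
    return naps
```

Calling `process_with_throttle(100, cost_limit=20, cost_delay_ms=5)` naps after every 20th block, so a lower limit or a longer delay slows processing and reduces its I/O impact on concurrent sessions.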

doc/src/sgml/glossary.sgml

Lines changed: 23 additions & 0 deletions
@@ -184,6 +184,8 @@
 (but not the autovacuum workers),
 the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
 the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
 the <glossterm linkend="glossary-logger">logger</glossterm>,
 the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
 the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
 </glossdef>
 </glossentry>

+<glossentry id="glossary-data-checksums-worker">
+<glossterm>Data Checksums Worker</glossterm>
+<glossdef>
+<para>
+An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+which enables or disables data checksums in a specific database.
+</para>
+</glossdef>
+</glossentry>
+
+<glossentry id="glossary-data-checksums-worker-launcher">
+<glossterm>Data Checksums Worker Launcher</glossterm>
+<glossdef>
+<para>
+An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+which starts <glossterm linkend="glossary-data-checksums-worker"> processes</glossterm>
+for each database.
+</para>
+</glossdef>
+</glossentry>
+
 <glossentry id="glossary-db-cluster">
 <glossterm>Database cluster</glossterm>
 <glossdef>

doc/src/sgml/monitoring.sgml

Lines changed: 204 additions & 4 deletions
@@ -3551,8 +3551,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 </para>
 <para>
 Number of data page checksum failures detected in this
-database (or on a shared object), or NULL if data checksums are
-disabled.
+database (or on a shared object).
+Detected failures are reported regardless of the
+<xref linkend="guc-data-checksums"/> setting.
 </para></entry>
 </row>

@@ -3562,8 +3563,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 </para>
 <para>
 Time at which the last data page checksum failure was detected in
-this database (or on a shared object), or NULL if data checksums are
-disabled.
+this database (or on a shared object). Last failure is reported
+regardless of the <xref linkend="guc-data-checksums"/> setting.
 </para></entry>
 </row>

@@ -6946,6 +6947,205 @@ FROM pg_stat_get_backend_idset() AS backendid;

 </sect2>

+<sect2 id="data-checksum-progress-reporting">
+<title>Data Checksum Progress Reporting</title>
+
+<indexterm>
+<primary>pg_stat_progress_data_checksums</primary>
+</indexterm>
+
+<para>
+When data checksums are being enabled on a running cluster, the
+<structname>pg_stat_progress_data_checksums</structname> view will contain
+a row for the launcher process, and one row for each worker process which
+is currently calculating checksums for the data pages in one database.
+</para>
+
+<table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+<title><structname>pg_stat_progress_data_checksums</structname> View</title>
+<tgroup cols="1">
+<thead>
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+Column Type
+</para>
+<para>
+Description
+</para>
+</entry>
+</row>
+</thead>
+
+<tbody>
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>pid</structfield> <type>integer</type>
+</para>
+<para>
+Process ID of a datachecksumworker process.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry"><para role="column_definition">
+<structfield>datid</structfield> <type>oid</type>
+</para>
+<para>
+OID of this database, or 0 for the launcher
+process.
+</para></entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry"><para role="column_definition">
+<structfield>datname</structfield> <type>name</type>
+</para>
+<para>
+Name of this database, or <literal>NULL</literal> for the
+launcher process.
+</para></entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>phase</structfield> <type>text</type>
+</para>
+<para>
+Current processing phase, see <xref linkend="datachecksum-phases"/>
+for description of the phases.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>databases_total</structfield> <type>integer</type>
+</para>
+<para>
+The total number of databases which will be processed. Only the
+launcher worker has this value set, the other worker processes
+have this set to <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>databases_done</structfield> <type>integer</type>
+</para>
+<para>
+The number of databases which have been processed. Only the
+launcher worker has this value set, the other worker processes
+have this set to <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>relations_total</structfield> <type>integer</type>
+</para>
+<para>
+The total number of relations which will be processed, or
+<literal>NULL</literal> if the data checksums worker process hasn't
+calculated the number of relations yet. The launcher process has
+this <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>relations_done</structfield> <type>integer</type>
+</para>
+<para>
+The number of relations which have been processed. The launcher
+process has this <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>blocks_total</structfield> <type>integer</type>
+</para>
+<para>
+The number of blocks in the current relation which will be processed,
+or <literal>NULL</literal> if the data checksums worker process hasn't
+calculated the number of blocks yet. The launcher process has
+this <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+<row>
+<entry role="catalog_table_entry">
+<para role="column_definition">
+<structfield>blocks_done</structfield> <type>integer</type>
+</para>
+<para>
+The number of blocks in the current relation which have been processed.
+The launcher process has this <literal>NULL</literal>.
+</para>
+</entry>
+</row>
+
+</tbody>
+</tgroup>
+</table>
+
+<table id="datachecksum-phases">
+<title>Data Checksum Phases</title>
+<tgroup cols="2">
+<colspec colname="col1" colwidth="1*"/>
+<colspec colname="col2" colwidth="2*"/>
+<thead>
+<row>
+<entry>Phase</entry>
+<entry>Description</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry><literal>enabling</literal></entry>
+<entry>
+The command is currently enabling data checksums on the cluster.
+</entry>
+</row>
+<row>
+<entry><literal>disabling</literal></entry>
+<entry>
+The command is currently disabling data checksums on the cluster.
+</entry>
+</row>
+<row>
+<entry><literal>waiting on temporary tables</literal></entry>
+<entry>
+The command is currently waiting for all temporary tables which existed
+at the time the command was started to be removed.
+</entry>
+</row>
+<row>
+<entry><literal>waiting on checkpoint</literal></entry>
+<entry>
+The command is currently waiting for a checkpoint to update the checksum
+state before finishing.
+</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+</sect2>
+
 </sect1>

 <sect1 id="dynamic-trace">
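
The column descriptions above imply that a monitoring client has to treat launcher and worker rows differently, because each kind of row leaves the other kind's counters `NULL`. A small Python sketch of that handling is shown below. The column names are the ones documented in this diff; the helper name and the sample row values are made up for illustration, and rows are represented as plain dictionaries rather than fetched from a live server.

```python
from typing import Mapping, Optional


def fraction_done(row: Mapping[str, Optional[int]], unit: str) -> Optional[float]:
    """Return <unit>_done / <unit>_total for one view row.

    `unit` is "databases", "relations" or "blocks". The result is None when
    either column is NULL (None) or the total is zero, which is the case for
    counters that do not apply to this kind of row or are not yet calculated.
    """
    done = row.get(f"{unit}_done")
    total = row.get(f"{unit}_total")
    if done is None or total is None or total == 0:
        return None
    return done / total


# Hypothetical rows shaped like pg_stat_progress_data_checksums.
launcher_row = {
    "datid": 0, "datname": None, "phase": "enabling",
    "databases_total": 4, "databases_done": 1,
    "relations_total": None, "relations_done": None,
    "blocks_total": None, "blocks_done": None,
}
worker_row = {
    "datid": 16384, "datname": "appdb", "phase": "enabling",
    "databases_total": None, "databases_done": None,
    "relations_total": 200, "relations_done": 50,
    "blocks_total": 1000, "blocks_done": 500,
}

if __name__ == "__main__":
    print("cluster-wide:", fraction_done(launcher_row, "databases"))
    print("current database:", fraction_done(worker_row, "relations"))
    print("current relation:", fraction_done(worker_row, "blocks"))
```

Note that `blocks_total` and `blocks_done` refer to the current relation only, so they restart for every relation and are not a measure of overall progress.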

doc/src/sgml/ref/pg_checksums.sgml

Lines changed: 6 additions & 0 deletions
@@ -45,6 +45,12 @@ PostgreSQL documentation
 exit status is nonzero if the operation failed.
 </para>

+<para>
+When enabling checksums, if checksums were in the process of being enabled
+when the cluster was shut down, <application>pg_checksums</application>
+will still process all relations regardless of the online processing.
+</para>
+
 <para>
 When verifying checksums, every file in the cluster is scanned. When
 enabling checksums, each relation file block with a changed checksum is

doc/src/sgml/regress.sgml

Lines changed: 12 additions & 0 deletions
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
 The following values are currently supported:
 <variablelist>
+<varlistentry>
+<term><literal>checksum_extended</literal></term>
+<listitem>
+<para>
+Runs additional tests for enabling data checksums which inject delays
+and re-tries in the processing, as well as tests that run pgbench
+concurrently and randomly restart the cluster. Some of these test
+suites require injection points enabled in the installation.
+</para>
+</listitem>
+</varlistentry>
+
 <varlistentry>
 <term><literal>kerberos</literal></term>
 <listitem>
