From 99414cf206606ea3d7fa55231830ff194d146c50 Mon Sep 17 00:00:00 2001 From: Dementii Priadko Date: Thu, 24 Apr 2025 13:57:21 +0000 Subject: [PATCH 1/6] General features of pg_repack --- ...o_use_pg_repack_to_mitigate_table_bloat.md | 129 ++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md new file mode 100644 index 0000000..d3c2f10 --- /dev/null +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -0,0 +1,129 @@ +# General features of pg_repack + +This chapter focuses on features that are important to know before launching pg_repack + +1. pg_repack doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption + In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: + +2. Trigger on the processed table (in case there was a processing of a table) + +3. Indexes in indisready=false or indisvalid=false states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) + +4. Before the table’s processing, autovacuum turns off + +5. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML + `-T` won’t help to cancel DML queries block – this should be done manually instead + See:  [https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137](https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137) + +6. Don’t use more than one pg_repack instance. When two pg_repack commands are executed simultaneously, there could be a deadlock. + +7. pg_repack can’t reorganize temporary tables. + +8. pg_repack can’t reorganize tables with GiST indexes. + +9. While `pg_repack` is running, `DDL` commands cannot be executed on the target tables, except `VACUUM` and `ANALYZE`. To enforce this restriction, `pg_repack` places an `ACCESS SHARE` lock on the target table while the table is being reorganized. + + +# Preparation to launch + +This chapter describes the actions that should be taken to lower the risks of failure while pg_repack is working. + +1. For the user under which `pg_repack` is launched (usually postgres), the limit on opening files should be increased, the optimal value is 65535. This will prevent pg_repack from crashing when this limit is reached. + +2. Long-running transactions with an isolation level higher than `READ COMMITTED` will block pg_repack. Therefore, it is necessary to provide a way to stop queries that will prevent pg_repack from working. + +3. Before starting a repack of large tables, verify that you have enough free disk space, as each processed table will be cloned during the process. + +4. It is advised to disable `idle_in_transaction_session_timeout=0` for the user under which `pg_repack` is run. The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. + `pg_repack` uses multiple connections, and some connections may be in the `idle in transaction` state (this is normal behavior). + + +# Diagnostics and troubleshooting + +1. Finding corrupted indexes + ```sql + SELECT c_r.relname as tbl, c_i.relname as idx, + i.indisvalid, -- If true, the index can be used in queries. 
A value of false means the index might be incomplete: it will still be updated by INSERT/UPDATE commands, but it's not safe to use it in queries. + i.indisready  -- If true, the index is ready to accept data. A value of false means the index is ignored by INSERT/UPDATE operations. + from pg_index i + join pg_class c_r on i.indrelid = c_r.oid and c_r.relkind = 'r' + join pg_class c_i on i.indexrelid = c_i.oid and c_i.relkind = 'i' + where i.indisvalid = false OR i.indisready = false + ``` + + +2. Blockings diagnostics + ```sql + SELECT + blocking_locks.pid AS blocker_pid, + blocking_activity.usename AS blocker_user, + substring(blocking_activity.query FROM 0 FOR 150) AS blocker_statement, + blocked_locks.pid AS blocked_pid, + blocked_activity.usename AS blocked_user, + substring(blocked_activity.query FROM 0 FOR 150) AS blocked_statement + FROM pg_catalog.pg_locks blocked_locks + JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid + JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype + AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE + AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation + AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page + AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple + AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid + AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid + AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid + AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid + AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid + AND blocking_locks.pid != blocked_locks.pid + JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid + WHERE NOT blocked_locks.GRANTED; + ``` + + +3. Cleanup in case of pg_repack failure + ```sql + drop extension pg_repack cascade; + create extension pg_repack; + -- check if there are any indexes in INVALID state and drop those: + do $$ + declare + index_name name; + begin + for index_name in + select c.relname + from pg_index as i + join pg_class c on + c.oid = i.indexrelid + and not indisvalid + and relname ~ '^index_\d+$' + loop + execute format( + 'drop index %I', + index_name + ); + end loop; + end; + $$ language plpgsql; + ``` + +4. 
Search for tables with autovacuum turned off +```sql +select * +from pg_class +where '{autovacuum_enabled=false}'::text[] @> reloptions and relkind = 'r' +``` + +# Useful links + +pg_repack links: + +[https://reorg.github.io/pg_repack/](https://reorg.github.io/pg_repack/) + +[https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6181#note_141114178](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6181#note_141114178) + +References on database internals in the context of concurrent transaction processing: + +[https://momjian.us/main/presentations/internals.html](https://momjian.us/main/presentations/internals.html) + +[http://www.interdb.jp/pg/pgsql05.html](http://www.interdb.jp/pg/pgsql05.html) + +[http://www.interdb.jp/pg/pgsql06.html](http://www.interdb.jp/pg/pgsql06.html) \ No newline at end of file -- GitLab From 6310024f63f864722a044904e3072f3dcddede5d Mon Sep 17 00:00:00 2001 From: Dementii Priadko Date: Thu, 24 Apr 2025 14:31:16 +0000 Subject: [PATCH 2/6] Edit 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md --- ...o_use_pg_repack_to_mitigate_table_bloat.md | 36 +++++++++---------- 1 file changed, 16 insertions(+), 20 deletions(-) diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md index d3c2f10..fa4c130 100644 --- a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -1,36 +1,32 @@ -# General features of pg_repack +# General features of `pg_repack` -This chapter focuses on features that are important to know before launching pg_repack +This chapter focuses on features that are important to know before launching `pg_repack` -1. pg_repack doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption - In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: +1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: + a) Trigger on the processed table (in case there was a processing of a table) + b) Indexes in indisready=false or indisvalid=false states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) + c) Before the table’s processing, autovacuum for this table turns off -2. Trigger on the processed table (in case there was a processing of a table) - -3. Indexes in indisready=false or indisvalid=false states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) - -4. Before the table’s processing, autovacuum turns off - -5. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML +2. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML `-T` won’t help to cancel DML queries block – this should be done manually instead See:  [https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137](https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137) -6. Don’t use more than one pg_repack instance. When two pg_repack commands are executed simultaneously, there could be a deadlock. +3. 
Don’t use more than one `pg_repack` instance. When two `pg_repack` commands are executed simultaneously, there could be a deadlock. -7. pg_repack can’t reorganize temporary tables. +4. `pg_repack` can’t reorganize temporary tables. -8. pg_repack can’t reorganize tables with GiST indexes. +5. `pg_repack` can’t reorganize tables with GiST indexes. -9. While `pg_repack` is running, `DDL` commands cannot be executed on the target tables, except `VACUUM` and `ANALYZE`. To enforce this restriction, `pg_repack` places an `ACCESS SHARE` lock on the target table while the table is being reorganized. +6. While `pg_repack` is running, `DDL` commands cannot be executed on the target tables, except `VACUUM` and `ANALYZE`. To enforce this restriction, `pg_repack` places an `ACCESS SHARE` lock on the target table while the table is being reorganized. # Preparation to launch -This chapter describes the actions that should be taken to lower the risks of failure while pg_repack is working. +This chapter describes the actions that should be taken to lower the risks of failure while `pg_repack` is working. -1. For the user under which `pg_repack` is launched (usually postgres), the limit on opening files should be increased, the optimal value is 65535. This will prevent pg_repack from crashing when this limit is reached. +1. For the user under which `pg_repack` is launched (usually postgres), the limit on opening files should be increased, the optimal value is 65535. This will prevent `pg_repack` from crashing when this limit is reached. -2. Long-running transactions with an isolation level higher than `READ COMMITTED` will block pg_repack. Therefore, it is necessary to provide a way to stop queries that will prevent pg_repack from working. +2. Long-running transactions with an isolation level higher than `READ COMMITTED` will block `pg_repack`. Therefore, it is necessary to provide a way to stop queries that will prevent `pg_repack` from working. 3. Before starting a repack of large tables, verify that you have enough free disk space, as each processed table will be cloned during the process. @@ -79,7 +75,7 @@ This chapter describes the actions that should be taken to lower the risks of fa ``` -3. Cleanup in case of pg_repack failure +3. Cleanup in case of `pg_repack` failure ```sql drop extension pg_repack cascade; create extension pg_repack; @@ -114,7 +110,7 @@ where '{autovacuum_enabled=false}'::text[] @> reloptions and relkind = 'r' # Useful links -pg_repack links: +`pg_repack` links: [https://reorg.github.io/pg_repack/](https://reorg.github.io/pg_repack/) -- GitLab From 7d66fcc1fe4e1677ef4c76d1d8d31cee83794d61 Mon Sep 17 00:00:00 2001 From: Dementii Priadko Date: Thu, 24 Apr 2025 14:32:25 +0000 Subject: [PATCH 3/6] Edit 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md --- 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md index fa4c130..7fe8f52 100644 --- a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -3,10 +3,14 @@ This chapter focuses on features that are important to know before launching `pg_repack` 1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. 
In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: - a) Trigger on the processed table (in case there was a processing of a table) + + a) Trigger on the processed table (in case there was a processing of a table) + b) Indexes in indisready=false or indisvalid=false states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) - c) Before the table’s processing, autovacuum for this table turns off + + c) Before the table’s processing, autovacuum for this table turns off + 2. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML `-T` won’t help to cancel DML queries block – this should be done manually instead See:  [https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137](https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137) -- GitLab From efa54fbca79f1f9f2449f3949471ee693f639db6 Mon Sep 17 00:00:00 2001 From: Dementii Priadko Date: Thu, 24 Apr 2025 14:37:05 +0000 Subject: [PATCH 4/6] Edit 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md --- ...how_to_use_pg_repack_to_mitigate_table_bloat.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md index 7fe8f52..6192b17 100644 --- a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -3,13 +3,13 @@ This chapter focuses on features that are important to know before launching `pg_repack` 1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: - - a) Trigger on the processed table (in case there was a processing of a table) - b) Indexes in indisready=false or indisvalid=false states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) - - c) Before the table’s processing, autovacuum for this table turns off - + a) Trigger on the processed table (in case there was a processing of a table) + + b) Indexes in `indisready=false` or `indisvalid=false` states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) + + c) Before the table’s processing, autovacuum for this table turns off + 2. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML `-T` won’t help to cancel DML queries block – this should be done manually instead @@ -34,7 +34,7 @@ This chapter describes the actions that should be taken to lower the risks of fa 3. Before starting a repack of large tables, verify that you have enough free disk space, as each processed table will be cloned during the process. -4. It is advised to disable `idle_in_transaction_session_timeout=0` for the user under which `pg_repack` is run. The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. +4. 
It is advised to set idle_in_transaction_session_timeout=0 for the user under which pg_repack is run The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. `pg_repack` uses multiple connections, and some connections may be in the `idle in transaction` state (this is normal behavior). -- GitLab From f7122bcd4b390a480aab49a16dd4f42b2e5051ab Mon Sep 17 00:00:00 2001 From: Dementii Priadko Date: Thu, 24 Apr 2025 14:43:17 +0000 Subject: [PATCH 5/6] Edit 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md --- 0096_how_to_use_pg_repack_to_mitigate_table_bloat.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md index 6192b17..e04481f 100644 --- a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -2,7 +2,7 @@ This chapter focuses on features that are important to know before launching `pg_repack` -1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. In case of interruption, pg_repack leaves behind three things that need to be cleaned up by hand: +1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. In case of interruption, `pg_repack` leaves behind three things that need to be cleaned up by hand: a) Trigger on the processed table (in case there was a processing of a table) @@ -34,7 +34,7 @@ This chapter describes the actions that should be taken to lower the risks of fa 3. Before starting a repack of large tables, verify that you have enough free disk space, as each processed table will be cloned during the process. -4. It is advised to set idle_in_transaction_session_timeout=0 for the user under which pg_repack is run The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. +4. It is advised to set `idle_in_transaction_session_timeout=0` for the user under which `pg_repack` is run. The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. `pg_repack` uses multiple connections, and some connections may be in the `idle in transaction` state (this is normal behavior). -- GitLab From 82fd780c2c50f034e12755ab91f09ac0ed2c8247 Mon Sep 17 00:00:00 2001 From: Nikolay Samokhvalov Date: Thu, 26 Jun 2025 06:36:17 -0700 Subject: [PATCH 6/6] Polish wording + misc fixes --- ...o_use_pg_repack_to_mitigate_table_bloat.md | 230 ++++++++---------- 1 file changed, 101 insertions(+), 129 deletions(-) diff --git a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md index e04481f..040ad50 100644 --- a/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md +++ b/0096_how_to_use_pg_repack_to_mitigate_table_bloat.md @@ -1,129 +1,101 @@ -# General features of `pg_repack` - -This chapter focuses on features that are important to know before launching `pg_repack` - -1. `pg_repack` doesn’t process signals (such as SIGINT, which Ctrl+C causes) and doesn’t do any cleanup in case of interruption. 
In case of interruption, `pg_repack` leaves behind three things that need to be cleaned up by hand: - - a) Trigger on the processed table (in case there was a processing of a table) - - b) Indexes in `indisready=false` or `indisvalid=false` states (optional; may not be the case), see [https://www.postgresql.org/docs/current/catalog-pg-index.html](https://www.postgresql.org/docs/current/catalog-pg-index.html) - - c) Before the table’s processing, autovacuum for this table turns off - - -2. `-T` (`--wait-timeout=SECS`, default value: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries and don’t work for DML - `-T` won’t help to cancel DML queries block – this should be done manually instead - See:  [https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137](https://github.com/reorg/pg_repack/blob/master/bin/pg_repack.c#L137) - -3. Don’t use more than one `pg_repack` instance. When two `pg_repack` commands are executed simultaneously, there could be a deadlock. - -4. `pg_repack` can’t reorganize temporary tables. - -5. `pg_repack` can’t reorganize tables with GiST indexes. - -6. While `pg_repack` is running, `DDL` commands cannot be executed on the target tables, except `VACUUM` and `ANALYZE`. To enforce this restriction, `pg_repack` places an `ACCESS SHARE` lock on the target table while the table is being reorganized. - - -# Preparation to launch - -This chapter describes the actions that should be taken to lower the risks of failure while `pg_repack` is working. - -1. For the user under which `pg_repack` is launched (usually postgres), the limit on opening files should be increased, the optimal value is 65535. This will prevent `pg_repack` from crashing when this limit is reached. - -2. Long-running transactions with an isolation level higher than `READ COMMITTED` will block `pg_repack`. Therefore, it is necessary to provide a way to stop queries that will prevent `pg_repack` from working. - -3. Before starting a repack of large tables, verify that you have enough free disk space, as each processed table will be cloned during the process. - -4. It is advised to set `idle_in_transaction_session_timeout=0` for the user under which `pg_repack` is run. The point is that if the parameter is, for example, 10 minutes, then when processing a table with a size of >100 GB, the connection may be reset. - `pg_repack` uses multiple connections, and some connections may be in the `idle in transaction` state (this is normal behavior). - - -# Diagnostics and troubleshooting - -1. Finding corrupted indexes - ```sql - SELECT c_r.relname as tbl, c_i.relname as idx, - i.indisvalid, -- If true, the index can be used in queries. A value of false means the index might be incomplete: it will still be updated by INSERT/UPDATE commands, but it's not safe to use it in queries. - i.indisready  -- If true, the index is ready to accept data. A value of false means the index is ignored by INSERT/UPDATE operations. - from pg_index i - join pg_class c_r on i.indrelid = c_r.oid and c_r.relkind = 'r' - join pg_class c_i on i.indexrelid = c_i.oid and c_i.relkind = 'i' - where i.indisvalid = false OR i.indisready = false - ``` - - -2. 
Blockings diagnostics - ```sql - SELECT - blocking_locks.pid AS blocker_pid, - blocking_activity.usename AS blocker_user, - substring(blocking_activity.query FROM 0 FOR 150) AS blocker_statement, - blocked_locks.pid AS blocked_pid, - blocked_activity.usename AS blocked_user, - substring(blocked_activity.query FROM 0 FOR 150) AS blocked_statement - FROM pg_catalog.pg_locks blocked_locks - JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid - JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype - AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE - AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation - AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page - AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple - AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid - AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid - AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid - AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid - AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid - AND blocking_locks.pid != blocked_locks.pid - JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid - WHERE NOT blocked_locks.GRANTED; - ``` - - -3. Cleanup in case of `pg_repack` failure - ```sql - drop extension pg_repack cascade; - create extension pg_repack; - -- check if there are any indexes in INVALID state and drop those: - do $$ - declare - index_name name; - begin - for index_name in - select c.relname - from pg_index as i - join pg_class c on - c.oid = i.indexrelid - and not indisvalid - and relname ~ '^index_\d+$' - loop - execute format( - 'drop index %I', - index_name - ); - end loop; - end; - $$ language plpgsql; - ``` - -4. Search for tables with autovacuum turned off -```sql -select * -from pg_class -where '{autovacuum_enabled=false}'::text[] @> reloptions and relkind = 'r' -``` - -# Useful links - -`pg_repack` links: - -[https://reorg.github.io/pg_repack/](https://reorg.github.io/pg_repack/) - -[https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6181#note_141114178](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6181#note_141114178) - -References on database internals in the context of concurrent transaction processing: - -[https://momjian.us/main/presentations/internals.html](https://momjian.us/main/presentations/internals.html) - -[http://www.interdb.jp/pg/pgsql05.html](http://www.interdb.jp/pg/pgsql05.html) - -[http://www.interdb.jp/pg/pgsql06.html](http://www.interdb.jp/pg/pgsql06.html) \ No newline at end of file +## How to Avoid Failures +Here we describe the actions worth taking to reduce the risk of failures when using `pg_repack`. +1. `pg_repack` doesn't have comprehensive signal processing (such as `SIGINT`, which is issued by `Ctrl+C`) and doesn't perform cleanup when interrupted. In such cases, `pg_repack` may leave behind three types of objects that need manual cleanup: + a. Triggers on the processed table (if a table was being processed) + b. Indexes in `indisready=false` or `indisvalid=false` states (this may or may not occur), see https://www.postgresql.org/docs/current/catalog-pg-index.html + c. Before processing a table, autovacuum is disabled at the table level – this setting needs to be restored +2. 
The `-T` (`--wait-timeout=SECS`, default: 60) and `-D` (`--no-kill-backend`) options work only for DDL queries, not for DML. `-T` won't help cancel blocking DML queries; these must be cancelled manually using `pg_cancel_backend()`. + - See [source code](https://github.com/reorg/pg_repack/blob/693bce67ba19f8e1fe4d18c266e6c34a3092c777/bin/pg_repack.c#L137) + - Also see this issue: https://github.com/reorg/pg_repack/issues/456 +3. Don't run multiple `pg_repack` processes simultaneously. When two `pg_repack` commands execute at the same time, they might cause a deadlock. +4. `pg_repack` cannot reorganize temporary tables. +5. `pg_repack` cannot reorganize tables with GiST indexes. +6. While `pg_repack` is running, DDL commands cannot be executed on the target tables, except for `VACUUM` and `ANALYZE`. To enforce this restriction, `pg_repack` places an `ACCESS SHARE` lock on the target table during repacking. +7. For the user running `pg_repack` (usually `postgres`), increase the open files limit (`ulimit -n`) to 65535. This prevents `pg_repack` from crashing when reaching this limit. +8. Long-running transactions with an isolation level higher than `READ COMMITTED` will block `pg_repack`. Therefore, you need a way to stop queries that would prevent `pg_repack` from working. +9. Before running `pg_repack` on large tables, verify you have sufficient free disk space, as each processed table will be cloned during the process. +10. We recommend setting `idle_in_transaction_session_timeout = 0` for the user running `pg_repack`. If this parameter is set to, for example, 10 minutes, then when processing a table larger than 100 GB, the connection may be reset. `pg_repack` uses multiple connections, and some connections may be in the `idle in transaction` state (this is expected behavior). + +# Diagnostics and Troubleshooting +1. How to find corrupted indexes (left behind by `pg_repack` after an unsuccessful attempt): + ```sql + select + c_r.relname as tbl, + c_i.relname as idx, + i.indisvalid, -- If true, the index can be used in queries. "false" means the index might be incomplete: it will still be updated by INSERT/UPDATE commands, but it's not safe to use in queries. + i.indisready -- If true, the index is ready to accept data. "false" means the index is ignored by INSERT/UPDATE operations. + from pg_index i + join pg_class c_r on i.indrelid = c_r.oid and c_r.relkind = 'r' + join pg_class c_i on i.indexrelid = c_i.oid and c_i.relkind = 'i' + where + not i.indisvalid + or not i.indisready; + ``` +2. 
Blocking diagnostics: + ```sql + select + blocking_locks.pid as blocker_pid, + blocking_activity.usename as blocker_user, + substring(blocking_activity.query from 0 for 150) as blocker_statement, + blocked_locks.pid as blocked_pid, + blocked_activity.usename as blocked_user, + substring(blocked_activity.query from 0 for 150) as blocked_statement + from pg_catalog.pg_locks blocked_locks + join pg_catalog.pg_stat_activity blocked_activity on blocked_activity.pid = blocked_locks.pid + join pg_catalog.pg_locks blocking_locks on + blocking_locks.locktype = blocked_locks.locktype + and blocking_locks.database is not distinct from blocked_locks.database + and blocking_locks.relation is not distinct from blocked_locks.relation + and blocking_locks.page is not distinct from blocked_locks.page + and blocking_locks.tuple is not distinct from blocked_locks.tuple + and blocking_locks.virtualxid is not distinct from blocked_locks.virtualxid + and blocking_locks.transactionid is not distinct from blocked_locks.transactionid + and blocking_locks.classid is not distinct from blocked_locks.classid + and blocking_locks.objid is not distinct from blocked_locks.objid + and blocking_locks.objsubid is not distinct from blocked_locks.objsubid + and blocking_locks.pid <> blocked_locks.pid + join pg_catalog.pg_stat_activity blocking_activity on blocking_activity.pid = blocking_locks.pid + where not blocked_locks.granted; + ``` +3. Cleanup after `pg_repack` failure: + ```sql + drop extension pg_repack cascade; + create extension pg_repack; + -- check if there are any indexes in INVALID state and drop them: + do $$ + declare + index_name name; + begin + for index_name in + select c.relname + from pg_index as i + join pg_class c on + c.oid = i.indexrelid + and not indisvalid + and relname ~ '^index_\d+$' + loop + execute format( + 'drop index %I', + index_name + ); + end loop; + end; + $$ language plpgsql; + ``` +4. Search for tables with autovacuum disabled: + ```sql + select * + from pg_class + where + '{autovacuum_enabled=false}'::text[] @> reloptions + and relkind = 'r' + ``` +# Useful Links +- https://reorg.github.io/pg_repack/ +- https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6181#note_141114178 +- https://github.com/reorg/pg_repack/issues/456 + +Additional materials on database internals in the context of concurrent transaction processing: +- https://momjian.us/main/presentations/internals.html +- http://www.interdb.jp/pg/pgsql05.html +- http://www.interdb.jp/pg/pgsql06.html \ No newline at end of file -- GitLab
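
The cleanup and diagnostics queries in the final version above cover leftover invalid indexes and tables with autovacuum disabled, but not the third artifact of an interrupted run that the text mentions: the trigger left on the processed table. Below is a minimal sketch for finding such triggers; the name pattern `repack_trigger` is an assumption (different `pg_repack` releases have used slightly different trigger names, e.g. `z_repack_trigger` in older versions), so adjust the pattern to match the installed version.

```sql
-- Find user-defined triggers that look like leftovers from an interrupted pg_repack run.
-- The name pattern below is an assumption; verify it against your pg_repack version.
select
  tgrelid::regclass as table_name,
  tgname as trigger_name
from pg_trigger
where
  not tgisinternal           -- skip FK/constraint triggers managed by PostgreSQL itself
  and tgname ~ 'repack_trigger';
```

If such a trigger is found after a failed run, the `drop extension pg_repack cascade; create extension pg_repack;` cleanup shown above should remove it together with the extension's other objects, since the trigger depends on the extension's trigger function.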