Skip to content

Commit 2e5fb44

Browse files
ashutosh-bapatCommitfest Bot
authored andcommitted
Report output plugin statistics in pg_stat_replication_slots
As of now pg_stat_replication_slots reports statistics about the reorder buffer, but it does not report output plugin statistics like the amount of data filtered by the output plugin, amount of data sent downstream or the number of transactions sent downstream. This statistics is useful when investigating issues related to a slow downstream. This commit adds following fields to pg_stat_replication_slots - plugin_filtered_bytes is the amount of changes filtered out by the output plugin - plugin_sent_txns is the amount of transactions sent downstream by the output plugin - plugin_sent_bytes is the amount of data sent downstream by the output plugin. The prefix "plugin_" indicates that these counters are related to and maintained by the output plugin. An output plugin may choose not to initialize LogicalDecodingContext::stats, which holds these counters, in which case the above columns will be reported as NULL. When the stats are disabled after being enabled for a while, the plugin stats are reset to 0, rather than carrying over the stale stats from the time when the plugin was supporting the stats. This does not matter if the plugin continues not to support statistics forever. But in case it was supporting the stats once, discontinued doing so at some point in time and then starts supporting the stats later, accumulating the new stats based on the earlier accumulated stats could be misleading. Filtered bytes are reported next to total_bytes to keep these two closely related fields together. Additionally report name of the output plugin in the view for an easy reference. total_bytes and total_txns are the only fields remaining unqualified - they do not convey what those bytes and txns are. Hence rename them total_wal_bytes and total_wal_txns respectively to indicate that those counts come from WAL stream. Author: Ashutosh Bapat <[email protected]> Reviewed-by: Shveta Malik <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Ashutosh Sharma <[email protected]> Discussion: https://www.postgresql.org/message-id/CAExHW5s6KntzUyUoMbKR5dgwRmdV2Ay_2+AnTgYGAzo=Qv61wA@mail.gmail.com
1 parent f242dbc commit 2e5fb44

File tree

26 files changed

+375
-90
lines changed

26 files changed

+375
-90
lines changed

contrib/test_decoding/expected/stats.out

Lines changed: 41 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -37,12 +37,17 @@ SELECT pg_stat_force_next_flush();
3737

3838
(1 row)
3939

40-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
41-
slot_name | spill_txns | spill_count | total_txns | total_bytes | mem_exceeded_count
42-
------------------------+------------+-------------+------------+-------------+--------------------
43-
regression_slot_stats1 | t | t | t | t | t
44-
regression_slot_stats2 | t | t | t | t | t
45-
regression_slot_stats3 | t | t | t | t | t
40+
-- total_wal_txns may vary based on the background activity but plugin_sent_txns
41+
-- should always be 1 since the background transactions are always skipped.
42+
-- Filtered bytes would be set only when there's a change that was passed to the
43+
-- plugin but was filtered out. Depending upon the background transactions,
44+
-- filtered bytes may or may not be zero.
45+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes > 0 AS sent_bytes, plugin_filtered_bytes >= 0 AS filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
46+
slot_name | spill_txns | spill_count | total_wal_txns | total_wal_bytes | plugin_sent_txns | sent_bytes | filtered_bytes | mem_exceeded_count
47+
------------------------+------------+-------------+----------------+-----------------+------------------+------------+----------------+--------------------
48+
regression_slot_stats1 | t | t | t | t | 1 | t | t | t
49+
regression_slot_stats2 | t | t | t | t | 1 | t | t | t
50+
regression_slot_stats3 | t | t | t | t | 1 | t | t | t
4651
(3 rows)
4752

4853
RESET logical_decoding_work_mem;
@@ -53,12 +58,12 @@ SELECT pg_stat_reset_replication_slot('regression_slot_stats1');
5358

5459
(1 row)
5560

56-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
57-
slot_name | spill_txns | spill_count | total_txns | total_bytes | mem_exceeded_count
58-
------------------------+------------+-------------+------------+-------------+--------------------
59-
regression_slot_stats1 | t | t | f | f | t
60-
regression_slot_stats2 | t | t | t | t | t
61-
regression_slot_stats3 | t | t | t | t | t
61+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes > 0 AS sent_bytes, plugin_filtered_bytes >= 0 AS filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
62+
slot_name | spill_txns | spill_count | total_wal_txns | total_wal_bytes | plugin_sent_txns | sent_bytes | filtered_bytes | mem_exceeded_count
63+
------------------------+------------+-------------+----------------+-----------------+------------------+------------+----------------+--------------------
64+
regression_slot_stats1 | t | t | f | f | | | | t
65+
regression_slot_stats2 | t | t | t | t | 1 | t | t | t
66+
regression_slot_stats3 | t | t | t | t | 1 | t | t | t
6267
(3 rows)
6368

6469
-- reset stats for all slots
@@ -68,27 +73,27 @@ SELECT pg_stat_reset_replication_slot(NULL);
6873

6974
(1 row)
7075

71-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
72-
slot_name | spill_txns | spill_count | total_txns | total_bytes | mem_exceeded_count
73-
------------------------+------------+-------------+------------+-------------+--------------------
74-
regression_slot_stats1 | t | t | f | f | t
75-
regression_slot_stats2 | t | t | f | f | t
76-
regression_slot_stats3 | t | t | f | f | t
76+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes, plugin_filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
77+
slot_name | spill_txns | spill_count | total_wal_txns | total_wal_bytes | plugin_sent_txns | plugin_sent_bytes | plugin_filtered_bytes | mem_exceeded_count
78+
------------------------+------------+-------------+----------------+-----------------+------------------+-------------------+-----------------------+--------------------
79+
regression_slot_stats1 | t | t | f | f | | | | t
80+
regression_slot_stats2 | t | t | f | f | | | | t
81+
regression_slot_stats3 | t | t | f | f | | | | t
7782
(3 rows)
7883

7984
-- verify accessing/resetting stats for non-existent slot does something reasonable
8085
SELECT * FROM pg_stat_get_replication_slot('do-not-exist');
81-
slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | mem_exceeded_count | total_txns | total_bytes | stats_reset
82-
--------------+------------+-------------+-------------+-------------+--------------+--------------+--------------------+------------+-------------+-------------
83-
do-not-exist | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
86+
slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | mem_exceeded_count | total_wal_txns | total_wal_bytes | plugin_filtered_bytes | plugin_sent_txns | plugin_sent_bytes | stats_reset
87+
--------------+------------+-------------+-------------+-------------+--------------+--------------+--------------------+----------------+-----------------+-----------------------+------------------+-------------------+-------------
88+
do-not-exist | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | | |
8489
(1 row)
8590

8691
SELECT pg_stat_reset_replication_slot('do-not-exist');
8792
ERROR: replication slot "do-not-exist" does not exist
8893
SELECT * FROM pg_stat_get_replication_slot('do-not-exist');
89-
slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | mem_exceeded_count | total_txns | total_bytes | stats_reset
90-
--------------+------------+-------------+-------------+-------------+--------------+--------------+--------------------+------------+-------------+-------------
91-
do-not-exist | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
94+
slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | mem_exceeded_count | total_wal_txns | total_wal_bytes | plugin_filtered_bytes | plugin_sent_txns | plugin_sent_bytes | stats_reset
95+
--------------+------------+-------------+-------------+-------------+--------------+--------------+--------------------+----------------+-----------------+-----------------------+------------------+-------------------+-------------
96+
do-not-exist | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | | |
9297
(1 row)
9398

9499
-- spilling the xact
@@ -121,20 +126,20 @@ SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count,
121126
-- Ensure stats can be repeatedly accessed using the same stats snapshot. See
122127
-- https://postgr.es/m/20210317230447.c7uc4g3vbs4wi32i%40alap3.anarazel.de
123128
BEGIN;
124-
SELECT slot_name FROM pg_stat_replication_slots;
125-
slot_name
126-
------------------------
127-
regression_slot_stats1
128-
regression_slot_stats2
129-
regression_slot_stats3
129+
SELECT slot_name, plugin FROM pg_stat_replication_slots;
130+
slot_name | plugin
131+
------------------------+---------------
132+
regression_slot_stats1 | test_decoding
133+
regression_slot_stats2 | test_decoding
134+
regression_slot_stats3 | test_decoding
130135
(3 rows)
131136

132-
SELECT slot_name FROM pg_stat_replication_slots;
133-
slot_name
134-
------------------------
135-
regression_slot_stats1
136-
regression_slot_stats2
137-
regression_slot_stats3
137+
SELECT slot_name, plugin FROM pg_stat_replication_slots;
138+
slot_name | plugin
139+
------------------------+---------------
140+
regression_slot_stats1 | test_decoding
141+
regression_slot_stats2 | test_decoding
142+
regression_slot_stats3 | test_decoding
138143
(3 rows)
139144

140145
COMMIT;

contrib/test_decoding/sql/stats.sql

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -15,16 +15,22 @@ SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot_stats1', NULL,
1515
SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot_stats2', NULL, NULL, 'skip-empty-xacts', '1');
1616
SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot_stats3', NULL, NULL, 'skip-empty-xacts', '1');
1717
SELECT pg_stat_force_next_flush();
18-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
18+
19+
-- total_wal_txns may vary based on the background activity but plugin_sent_txns
20+
-- should always be 1 since the background transactions are always skipped.
21+
-- Filtered bytes would be set only when there's a change that was passed to the
22+
-- plugin but was filtered out. Depending upon the background transactions,
23+
-- filtered bytes may or may not be zero.
24+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes > 0 AS sent_bytes, plugin_filtered_bytes >= 0 AS filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
1925
RESET logical_decoding_work_mem;
2026

2127
-- reset stats for one slot, others should be unaffected
2228
SELECT pg_stat_reset_replication_slot('regression_slot_stats1');
23-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
29+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes > 0 AS sent_bytes, plugin_filtered_bytes >= 0 AS filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
2430

2531
-- reset stats for all slots
2632
SELECT pg_stat_reset_replication_slot(NULL);
27-
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
33+
SELECT slot_name, spill_txns = 0 AS spill_txns, spill_count = 0 AS spill_count, total_wal_txns > 0 AS total_wal_txns, total_wal_bytes > 0 AS total_wal_bytes, plugin_sent_txns, plugin_sent_bytes, plugin_filtered_bytes, mem_exceeded_count = 0 AS mem_exceeded_count FROM pg_stat_replication_slots ORDER BY slot_name;
2834

2935
-- verify accessing/resetting stats for non-existent slot does something reasonable
3036
SELECT * FROM pg_stat_get_replication_slot('do-not-exist');
@@ -46,8 +52,8 @@ SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count,
4652
-- Ensure stats can be repeatedly accessed using the same stats snapshot. See
4753
-- https://postgr.es/m/20210317230447.c7uc4g3vbs4wi32i%40alap3.anarazel.de
4854
BEGIN;
49-
SELECT slot_name FROM pg_stat_replication_slots;
50-
SELECT slot_name FROM pg_stat_replication_slots;
55+
SELECT slot_name, plugin FROM pg_stat_replication_slots;
56+
SELECT slot_name, plugin FROM pg_stat_replication_slots;
5157
COMMIT;
5258

5359

contrib/test_decoding/t/001_repl_stats.pl

Lines changed: 14 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -23,10 +23,16 @@ sub test_slot_stats
2323

2424
my ($node, $expected, $msg) = @_;
2525

26+
# If there are background transactions which are filtered out by the output
27+
# plugin, plugin_filtered_bytes may be greater than 0. But it's not
28+
# guaranteed that such transactions would be present.
2629
my $result = $node->safe_psql(
2730
'postgres', qq[
28-
SELECT slot_name, total_txns > 0 AS total_txn,
29-
total_bytes > 0 AS total_bytes
31+
SELECT slot_name, total_wal_txns > 0 AS total_txn,
32+
total_wal_bytes > 0 AS total_bytes,
33+
plugin_sent_txns > 0 AS sent_txn,
34+
plugin_sent_bytes > 0 AS sent_bytes,
35+
plugin_filtered_bytes >= 0 AS filtered_bytes
3036
FROM pg_stat_replication_slots
3137
ORDER BY slot_name]);
3238
is($result, $expected, $msg);
@@ -65,7 +71,7 @@ sub test_slot_stats
6571
'postgres', qq[
6672
SELECT count(slot_name) >= 4 FROM pg_stat_replication_slots
6773
WHERE slot_name ~ 'regression_slot'
68-
AND total_txns > 0 AND total_bytes > 0;
74+
AND total_wal_txns > 0 AND total_wal_bytes > 0;
6975
]) or die "Timed out while waiting for statistics to be updated";
7076

7177
# Test to drop one of the replication slot and verify replication statistics data is
@@ -80,9 +86,9 @@ sub test_slot_stats
8086
# restart.
8187
test_slot_stats(
8288
$node,
83-
qq(regression_slot1|t|t
84-
regression_slot2|t|t
85-
regression_slot3|t|t),
89+
qq(regression_slot1|t|t|t|t|t
90+
regression_slot2|t|t|t|t|t
91+
regression_slot3|t|t|t|t|t),
8692
'check replication statistics are updated');
8793

8894
# Test to remove one of the replication slots and adjust
@@ -104,8 +110,8 @@ sub test_slot_stats
104110
# restart.
105111
test_slot_stats(
106112
$node,
107-
qq(regression_slot1|t|t
108-
regression_slot2|t|t),
113+
qq(regression_slot1|t|t|t|t|t
114+
regression_slot2|t|t|t|t|t),
109115
'check replication statistics after removing the slot file');
110116

111117
# cleanup

contrib/test_decoding/test_decoding.c

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -173,6 +173,7 @@ pg_decode_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
173173
data->only_local = false;
174174

175175
ctx->output_plugin_private = data;
176+
ctx->stats = palloc0(sizeof(OutputPluginStats));
176177

177178
opt->output_type = OUTPUT_PLUGIN_TEXTUAL_OUTPUT;
178179
opt->receive_rewrites = false;
@@ -310,6 +311,7 @@ static void
310311
pg_output_begin(LogicalDecodingContext *ctx, TestDecodingData *data, ReorderBufferTXN *txn, bool last_write)
311312
{
312313
OutputPluginPrepareWrite(ctx, last_write);
314+
ctx->stats->sentTxns++;
313315
if (data->include_xids)
314316
appendStringInfo(ctx->out, "BEGIN %u", txn->xid);
315317
else

doc/src/sgml/logicaldecoding.sgml

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -938,6 +938,42 @@ typedef struct OutputPluginOptions
938938
needs to have a state, it can
939939
use <literal>ctx-&gt;output_plugin_private</literal> to store it.
940940
</para>
941+
942+
<para>
943+
The startup callback may initialize <literal>ctx-&gt;stats</literal>,
944+
typically as follows, if it chooses to maintain and report statistics
945+
about its activity in <structname>pg_stat_replication_slots</structname>.
946+
<programlisting>
947+
ctx->stats = palloc0(sizeof(OutputPluginStats));
948+
</programlisting>
949+
where <literal>OutputPluginStats</literal> is defined as follows:
950+
<programlisting>
951+
typedef struct OutputPluginStats
952+
{
953+
int64 sentTxns;
954+
int64 sentBytes;
955+
int64 filteredBytes;
956+
} OutputPluginStats;
957+
</programlisting>
958+
<literal>sentTxns</literal> is the number of transactions sent downstream
959+
by the output plugin. <literal>sentBytes</literal> is the amount of data,
960+
in bytes, sent downstream by the output plugin.
961+
<literal>filteredBytes</literal> is the size of changes, in bytes, that
962+
are filtered out by the output plugin.
963+
<function>OutputPluginWrite</function> will update
964+
<literal>sentBytes</literal> if <literal>ctx-&gt;stats</literal> is
965+
initialized by the output plugin. Function
966+
<literal>ReorderBufferChangeSize</literal> may be used to find the size of
967+
filtered <literal>ReorderBufferChange</literal>.
968+
</para>
969+
970+
<note>
971+
<para>
972+
Once a plugin starts reporting and maintaining these statistics, it is
973+
not expected that they will discontinue doing so. If they do, the result
974+
may be misleading because of the cumulative nature of these statistics.
975+
</para>
976+
</note>
941977
</sect3>
942978

943979
<sect3 id="logicaldecoding-output-plugin-shutdown">

0 commit comments

Comments
 (0)