incremental chain restore optimizations #169

Closed
gsmolk opened this issue Jan 21, 2020 · 7 comments

@gsmolk
Contributor

gsmolk commented Jan 21, 2020

The current algorithm used for restore and merge of an incremental backup chain is suboptimal.

@gsmolk gsmolk added this to the 2.3.0 milestone Jan 21, 2020
gsmolk added a commit that referenced this issue Jan 21, 2020
@alexign

alexign commented Jan 23, 2020

Hello!

Here are some benchmarks comparing a binary compiled from the issue_169 branch with the vanilla v2.2.7 binary from the pgpro Ubuntu repository.
All backup files were cached in the OS file cache, because the server has 384GB of RAM.

# free
              total        used        free      shared  buff/cache   available
Mem:      396137056     3270828    10228528        3340   382637700   390224304
Swap:             0           0           0
# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              56
On-line CPU(s) list: 0-55
Thread(s) per core:  2
Core(s) per socket:  14

Here are the benchmarks:

 instance  10       Q4J2U0  2020-01-22 21:49:35+00  DELTA  STREAM    7/7   10m:3s  5648MB    32MB    1.82  140/F40000D0  140/F4FF0D48  OK
 instance  10       Q4H82I  2020-01-21 21:47:14+00  DELTA  STREAM    7/7   9m:53s  5376MB    48MB    1.82  13E/50000D0   13E/62AF4A0   OK
 instance  10       Q4FDER  2020-01-20 21:45:09+00  DELTA  STREAM    7/7   7m:33s  3745MB    32MB    1.79  13B/3B0000D0  13B/3B9CD8E0  OK
 instance  10       Q4DR2N  2020-01-20 00:48:59+00  DELTA  STREAM    7/7  11m:26s  7195MB   912MB    1.79  139/DF062588  13A/16ADA550  OK
 instance  10       Q4BQMV  2020-01-18 23:38:26+00  FULL   STREAM    7/0    1h:5m    48GB   128MB    1.85  135/A60000D0  135/ACDBA918  OK

# vanilla binary from the pgpro Ubuntu repo
~/scripts/ansible# pg_probackup-10 --version
pg_probackup-10 2.2.7 (PostgreSQL 10.10)


# binary compiled from the issue_169 branch
~/scripts/ansible# pg_probackup-devel-10 --version
pg_probackup-devel-10 2.2.7 (PostgreSQL 10.11)

(.venv) root@server:~/scripts/ansible# time pg_probackup-devel-10 restore --skip-block-validation -j 8 -B /backup/postgresql/backup --instance instance -i Q4J2U0 --remote-proto=none -D /backup/postgresql/tmp/instance
INFO: Validating parents for backup Q4J2U0
INFO: Validating backup Q4BQMV
INFO: Backup Q4BQMV data files are valid
INFO: Validating backup Q4DR2N
INFO: Backup Q4DR2N data files are valid
INFO: Validating backup Q4FDER
INFO: Backup Q4FDER data files are valid
INFO: Validating backup Q4H82I
INFO: Backup Q4H82I data files are valid
INFO: Validating backup Q4J2U0
INFO: Backup Q4J2U0 data files are valid
INFO: Backup Q4J2U0 WAL segments are valid
INFO: Backup Q4J2U0 is valid.
INFO: Restore of backup Q4J2U0 completed.

          issue_169      vanilla
Run 1
real      4m50.818s      9m10.289
user      16m36.799s     16m41.06
sys       3m12.250s      3m10.250

Run 2
real      4m12.526s      9m8.740s
user      16m38.221s     16m38.63
sys       2m56.392s      3m8.451s

Run 3
real      4m11.876       9m3.121s
user      16m40.398      16m38.88
sys       3m1.692s       3m6.698s
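
For readers who want a single number out of the three runs above, here is a minimal Python sketch that converts the `real` values to seconds and compares the means. The helper names (`parse_time`, `mean_seconds`) are made up for this illustration, and the parser treats the trailing `s` as optional only because some values in the table lack it.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '4m50.818s' to seconds.

    The trailing 's' is optional because some values in the table lack it.
    """
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def mean_seconds(samples):
    """Average a list of `time`-style durations, in seconds."""
    values = [parse_time(s) for s in samples]
    return sum(values) / len(values)


# Wall-clock ("real") times copied verbatim from the three runs above.
issue_169 = ["4m50.818s", "4m12.526s", "4m11.876"]
vanilla = ["9m10.289", "9m8.740s", "9m3.121s"]

speedup = mean_seconds(vanilla) / mean_seconds(issue_169)
print(f"issue_169: {mean_seconds(issue_169):.1f}s, "
      f"vanilla: {mean_seconds(vanilla):.1f}s, "
      f"speedup: {speedup:.2f}x")
```

On these numbers the branch restores in roughly half the wall-clock time, while `user` and `sys` stay about the same.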

gsmolk added a commit that referenced this issue Jan 26, 2020
gsmolk added a commit that referenced this issue Jan 26, 2020
@gsmolk
Contributor Author

gsmolk commented Jan 26, 2020

@alexign, thank you for the feedback!
Can you run your benchmarks again using the latest code from the issue_169 branch?

A new flag, --no-sync, has been added, so you can measure the cost of copying files separately from the cost of syncing them to disk.
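
One way to make that comparison is to time the same restore twice, once with `--no-sync` and once without, and attribute the difference to syncing. The sketch below is only an illustration: `restore_command`, `timed_run`, and `compare` are hypothetical helper names, the paths are the ones used elsewhere in this thread, and it assumes the target directory is emptied between runs.

```python
import subprocess
import time


def restore_command(binary, backup_dir, instance, pgdata,
                    threads=8, no_sync=False):
    """Build a restore command line using only flags seen in this thread."""
    cmd = [
        binary, "restore", "--skip-block-validation",
        "-j", str(threads),
        "-B", backup_dir,
        "--instance", instance,
        "--remote-proto=none",
        "-D", pgdata,
    ]
    if no_sync:
        cmd.append("--no-sync")  # skip syncing restored files to disk
    return cmd


def timed_run(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare(binary, backup_dir, instance, pgdata, clean):
    """Time restore with and without --no-sync.

    `clean` is a caller-supplied function that empties `pgdata`
    before each run (an assumption of this sketch).
    """
    clean()
    copy_only = timed_run(
        restore_command(binary, backup_dir, instance, pgdata, no_sync=True))
    clean()
    copy_and_sync = timed_run(
        restore_command(binary, backup_dir, instance, pgdata))
    return copy_only, copy_and_sync - copy_only
```

`compare` returns the copy-only time and an estimate of the sync time. With backup files sitting in the OS file cache, as in the benchmarks here, the estimate mostly reflects write-back to the target disk.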

@alexign

alexign commented Feb 1, 2020

Here are some new benchmarks based on the latest commit from issue_169. The hardware is the same as in my first message in this thread.

=================================================================
 Instance  Version  ID      Recovery Time           Mode   WAL Mode  TLI     Time    Data     WAL  Zratio  Start LSN    Stop LSN     Status
============================================================================================================================================
 instance  10       Q4ZR3G  2020-01-31 21:54:26+00  DELTA  STREAM    6/6   9m:11s  1185MB    32MB    3.05  F9/910023C0  F9/91C60950  OK
 instance  10       Q4XWIJ  2020-01-30 21:56:11+00  DELTA  STREAM    6/6    9m:5s  1308MB    32MB    2.96  F8/C3008578  F8/C3BFF348  OK
 instance  10       Q4W1Q4  2020-01-29 21:54:51+00  DELTA  STREAM    6/6  10m:25s  2138MB    32MB    2.82  F7/E3000028  F7/E3DEF228  OK
 instance  10       Q4U6X9  2020-01-28 21:50:48+00  DELTA  STREAM    6/6   9m:16s  1272MB    32MB    3.24  F6/930000D0  F6/93BF0380  OK
 instance  10       Q4SC7B  2020-01-27 22:03:01+00  DELTA  STREAM    6/6  22m:39s  9786MB    48MB    3.13  F5/B4000028  F5/B5E62FE8  OK
 instance  10       Q4QHLU  2020-01-26 22:14:57+00  DELTA  STREAM    6/6   33m:5s    15GB  1648MB    2.93  F2/421AA018  F2/A717F060  OK
 instance  10       Q4OPSP  2020-01-26 00:43:24+00  FULL   STREAM    6/0   1h:59m    70GB   112MB    2.91  ED/38000060  ED/3DA59120  OK

exec string:

time pg_probackup-devel restore --skip-block-validation -j 8 -B /backup/postgresql/backup --instance instance --remote-proto=none -D /backup/postgresql/tmp/instance

patched version:

real         user           sys
5m6.581s     31m14.408s     5m44.039s
5m6.936s     31m14.829s     5m44.954s
5m7.186s     31m16.250s     5m48.475s

vanilla version:

real           user           sys
52m50.003s     31m18.916s     6m40.304s
52m27.249s     31m12.383s     6m38.730s
52m32.369s     31m28.308s     6m44.966s 
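
The two tables above show nearly identical CPU time but very different wall-clock time. A rough way to quantify that is to divide `user + sys` by `real`, which approximates how many cores were busy on average. The following Python sketch does this for both tables; the function names are made up for the example.

```python
import re


def to_seconds(text: str) -> float:
    """Convert a `time`-style duration such as '5m6.581s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(rows):
    """Return (mean real seconds, mean (user + sys) / real) for the runs."""
    reals, ratios = [], []
    for real, user, sys_time in rows:
        real_s = to_seconds(real)
        cpu_s = to_seconds(user) + to_seconds(sys_time)
        reals.append(real_s)
        ratios.append(cpu_s / real_s)
    return sum(reals) / len(reals), sum(ratios) / len(ratios)


# (real, user, sys) rows copied from the tables above.
patched = [
    ("5m6.581s", "31m14.408s", "5m44.039s"),
    ("5m6.936s", "31m14.829s", "5m44.954s"),
    ("5m7.186s", "31m16.250s", "5m48.475s"),
]
vanilla = [
    ("52m50.003s", "31m18.916s", "6m40.304s"),
    ("52m27.249s", "31m12.383s", "6m38.730s"),
    ("52m32.369s", "31m28.308s", "6m44.966s"),
]

patched_real, patched_busy = summarize(patched)
vanilla_real, vanilla_busy = summarize(vanilla)
speedup = vanilla_real / patched_real
print(f"patched: {patched_real:.0f}s real, ~{patched_busy:.1f} cores busy")
print(f"vanilla: {vanilla_real:.0f}s real, ~{vanilla_busy:.1f} cores busy")
print(f"speedup: {speedup:.1f}x")
```

By this measure the patched binary keeps about 7 of the 8 requested threads busy, the vanilla binary less than one, and the wall-clock speedup is around 10x.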

@gsmolk
Contributor Author

gsmolk commented Feb 2, 2020

Splendid!
What was the restore efficiency ratio on the patched version? It's reported as an elog message. Example:

INFO: Restoring the database from backup at 2020-01-21 23:21:15+03
INFO: Start restoring backup files. PGDATA size: 1536MB
INFO: Backup files are restored. Transfered bytes: 1618MB, time elapsed: 2s
INFO: Approximate restore efficiency ratio: 95% (1536MB/1618MB)
WARNING: Restored files are not synced to disk
INFO: Restore of backup Q4H4JF completed.
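
Judging by the numbers in the example log, the ratio looks like PGDATA size divided by transferred bytes. That reading is inferred from the printed values, not taken from the pg_probackup source, and `restore_efficiency` is a made-up name for this Python sketch:

```python
def restore_efficiency(pgdata_mb: float, transferred_mb: float) -> float:
    """Return PGDATA size as a percentage of the bytes transferred.

    Assumed formula, inferred from the example log line
    "95% (1536MB/1618MB)".
    """
    if transferred_mb <= 0:
        raise ValueError("transferred size must be positive")
    return 100.0 * pgdata_mb / transferred_mb


# Values taken from the example log above.
ratio = restore_efficiency(1536, 1618)
print(f"Approximate restore efficiency ratio: {ratio:.0f}% (1536MB/1618MB)")
```

A value near 100% would mean almost every transferred byte ended up in the restored PGDATA. Lower values would mean the same data was written more than once while walking the chain.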

@gsmolk
Contributor Author

gsmolk commented Feb 3, 2020

TODO: We should also decompress data on the agent side.

@gsmolk
Contributor Author

gsmolk commented Feb 6, 2020

Decompression on the remote agent has been implemented by @knizhnik.
Merged into the issue_169 branch.

gsmolk added a commit that referenced this issue Feb 20, 2020
gsmolk added a commit that referenced this issue Feb 20, 2020
gsmolk added a commit that referenced this issue Feb 20, 2020
gsmolk added a commit that referenced this issue Feb 21, 2020
gsmolk added a commit that referenced this issue Mar 2, 2020
gsmolk added a commit that referenced this issue Mar 2, 2020
gsmolk added a commit that referenced this issue Mar 5, 2020
@gsmolk
Contributor Author

gsmolk commented Mar 5, 2020

Merged

@gsmolk gsmolk closed this as completed Mar 5, 2020
gsmolk added a commit that referenced this issue Mar 6, 2020
…the target incremental backup. It should be possible to rerun merge by using the backup ID of deleted backup as an argument