
remote_ssh : ERROR: src/utils/file.c:529: proceeds 0 bytes instead of 8 #44


Closed
achix opened this issue Jan 21, 2019 · 56 comments

@achix

achix commented Jan 21, 2019

Hello,
I have just started to test the remote capabilities and I get:
/pg_probackup/pg_probackup add-instance -B /usr/local/var/lib/pgsql/mypgprobacks --remote-proto=ssh --remote-host=192.168.1.109 --remote-port=5432 --remote-path=/var/lib/postgresql/pg_probackup -D /var/lib/postgresql/datatest --instance testdblin
ERROR: src/utils/file.c:529: proceeds 0 bytes instead of 8

The catalog host is FreeBSD and the remote is Linux. Could that pose a problem?

@gsmolk
Contributor

gsmolk commented Jan 22, 2019

Hello!
Can you pull the recent changes from remote_ssh, rebuild the binaries on the host and remote machines, and try one more time?

The catalog host is FreeBSD and the remote is Linux. Could that pose a problem?

Probably, but I'm not sure.

@gsmolk
Contributor

gsmolk commented Jan 22, 2019

Oh, I forgot: there is now a --remote option, which must be added to the command line.
--remote-proto is now optional, with the default value 'ssh'.

@knizhnik
Contributor

You are also using the --remote-path option incorrectly.
I am not sure if it is the reason for the reported error, but if --remote-path is specified, it is concatenated with the name of the program. In your case that is "/pg_probackup/pg_probackup".
So the result will be something like "/var/lib/postgresql/pg_probackup//pg_probackup/pg_probackup".
Not sure that is what you intended.
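The concatenation described above can be sketched in plain shell (this is an illustration of the described behavior, not the actual pg_probackup source):

```shell
# --remote-path value and the program name as invoked locally:
remote_path=/var/lib/postgresql/pg_probackup
program=/pg_probackup/pg_probackup

# Joining them reproduces the doubled-slash path from the comment above:
echo "${remote_path}/${program}"
# → /var/lib/postgresql/pg_probackup//pg_probackup/pg_probackup
```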

@achix
Author

achix commented Jan 22, 2019

Hello, thanks. It would have to be just: --remote-path=/var/lib/postgresql. I'll test again.

@gsmolk
Contributor

gsmolk commented Jan 22, 2019

I am not sure if it is the reason for the reported error, but if the --remote-path option is specified, it is concatenated with the name of the program

I don't think this is the right way. The remote binary can have a different name, so --remote-path should point to the binary, not the directory.

@achix
Author

achix commented Jan 22, 2019

OK, I am testing against a large testdb.
I successfully:

  • built pg_probackup (remote_ssh) on both the catalog host and the database host, both running the same Linux version
  • on the catalog host:

./pg_probackup/pg_probackup init -B /var/lib/pgbackup/pgprobackups
INFO: Backup catalog '/var/lib/pgbackup/pgprobackups' successfully inited
./pg_probackup/pg_probackup add-instance -B /var/lib/pgbackup/pgprobackups --remote --remote-host="DBHOST" -D /var/lib/pgsql/data --instance testdb
INFO: Instance 'testdb' successfully inited
But when I try to run the first backup I get:
./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL --stream
INFO: Validating backup 0
ERROR: cannot open "/var/lib/pgbackup/pgprobackups/backups/testdb/0/backup_content.control": No such file or directory

@achix
Author

achix commented Jan 23, 2019

Hello,
I git fetched and rebased on both hosts; now it won't even let me add the instance:
postgres@smadb2cs:~$ ./pg_probackup/pg_probackup add-instance -B /var/lib/pgbackup/pgprobackups -D /var/lib/pgsql/data --remote-proto=ssh --remote-host=10.9.0.77 --remote-path=/var/lib/pgsql/pg_probackup/pg_probackup --instance testdb
ERROR: src/utils/file.c:537: proceeds 0 bytes instead of 8: No such file or directory

postgres@smadb2cs:~$ ./pg_probackup/pg_probackup -V
pg_probackup 2.0.26 (PostgreSQL 10.6)

@gsmolk
Contributor

gsmolk commented Jan 23, 2019

fixed

@gsmolk
Contributor

gsmolk commented Jan 23, 2019

The --remote option is now obsolete.

@achix
Author

achix commented Jan 23, 2019

./pg_probackup/pg_probackup add-instance -B /var/lib/pgbackup/pgprobackups -D /var/lib/pgsql/data --remote-proto=ssh --remote-host=10.9.0.77 --remote-path=/var/lib/pgsql/pg_probackup/. --instance testdb
INFO: Instance 'testdb' successfully inited

This worked.

Also, a backup with --stream seems to start.

Please tell me what the archive command is supposed to look like for remote? I cannot figure it out.

It always asks for a local -B. Should I create an instance on the PostgreSQL host as well?

@achix
Author

achix commented Jan 23, 2019

I tried backups like:

./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL --stream
INFO: Backup start, pg_probackup version: 2.0.26, backup ID: PLSQ78, backup mode: full, instance: testdb, stream: true, remote true
INFO: Start transfering data files
^CERROR: src/utils/file.c:863: proceeds 2741 bytes instead of 8200: File exists

^^^ This seems to work; I just interrupted it.

postgres@smadb2cs:~$ ./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL
INFO: Backup start, pg_probackup version: 2.0.26, backup ID: PLSQ87, backup mode: full, instance: testdb, stream: false, remote true
INFO: Wait for WAL segment /var/lib/pgbackup/pgprobackups/wal/testdb/00000001000009880000003B to be archived

^^^ For this I cannot figure out how to set up archive_command for remote.

Currently testing with:
./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL -j 8 --stream

This will take hours...

But please tell me about archive-push from the pgsql host to the catalog host.

@achix
Author

achix commented Jan 24, 2019

Still running. I don't see -j 8 being respected; the current speed is below what this line can do: 8.2 MB/s.
However, it seems to work. Thanks! It will take more than a day for 1.3 TB.

@achix
Author

achix commented Jan 24, 2019

delete does not delete the backup; it seems to succeed, but the backup still shows up in the listing.

@gsmolk
Contributor

gsmolk commented Jan 24, 2019

Yes, we will fix that.

@gsmolk
Contributor

gsmolk commented Jan 24, 2019

Still running. I don't see -j 8 being respected; the current speed is below what this line can do: 8.2 MB/s.
However, it seems to work. Thanks! It will take more than a day for 1.3 TB.

What capacity does this line have?
Did you try using a weaker encryption algorithm?
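A weaker (cheaper) cipher can be selected through the ssh options. A hedged sketch follows: the cipher choice is an assumption to verify against your OpenSSH build, and --ssh-options is the pg_probackup flag mentioned later in this thread:

```shell
# List the ciphers your OpenSSH build supports:
ssh -Q cipher

# Then pick a fast one (aes128-ctr is usually among the cheapest) and
# pass it through pg_probackup's ssh options (sketch, not verified):
./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ \
    --instance testdb -b FULL --stream \
    --ssh-options="-c aes128-ctr"
```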

@knizhnik
Contributor

Can you also please monitor CPU usage on your systems (both the database and backup nodes)?
If you have the "perf" utility installed, it would also be very useful to get profiles (perf top) from both nodes.

@achix
Author

achix commented Jan 24, 2019

Still running. I don't see -j 8 being respected; the current speed is below what this line can do: 8.2 MB/s.
However, it seems to work. Thanks! It will take more than a day for 1.3 TB.

What capacity does this line have?
Did you try using a weaker encryption algorithm?

I believe this line can go up to 15 MB/s; at least I think I got those numbers with pgbackrest. This line is pretty messed up, though, and I don't trust what the provider says. I want to get this FULL backup finished so as to be able to test PAGE and DELTA.

No special encryption, just:
./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL --stream

BTW, any news on archive-push from the remote PgSql server?

@achix
Author

achix commented Jan 24, 2019

Can you also please monitor CPU usage on your systems (both the database and backup nodes)?
If you have the "perf" utility installed, it would also be very useful to get profiles (perf top) from both nodes.

CPU usage is low: less than 10% on a 4-core VM.

@gsmolk
Contributor

gsmolk commented Jan 24, 2019

What about disk usage on the database and backup nodes?

@achix
Author

achix commented Jan 24, 2019

Yes, we will fix that.

Also, there is a typo:
$ ./pg_probackup/pg_probackup delete --force -B /var/lib/pgbackup/pgprobackups/ --instance testdb
ERROR: You must specify at least one of the delete options: --expired |--wal |--backup_id

This --backup_id should be -i backup-id.

@achix
Author

achix commented Jan 24, 2019

What about disk usage on the database and backup nodes?

As expected, the full backup should consume about 1.3 TB; I will let you know about that.

For the time being, the show-stopper is the archive_command from the remote host and how to configure archive-push (which params to use).

Thanks again.

@knizhnik
Contributor

Is it possible to ask you to attach a debugger to the running instance of pg_probackup (on the database host, if you are executing the "backup" command) and run the following command:

thread apply all bt

@gsmolk
Contributor

gsmolk commented Jan 24, 2019

As expected, the full backup should consume about 1.3 TB; I will let you know about that.

I mean the I/O utilization on the database and backup nodes; can you look that up?
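For reference, a couple of stock Linux commands show per-device and per-process I/O utilization (it is an assumption that sysstat's iostat and iotop are installed on the nodes):

```shell
# Per-device throughput and %util, refreshed every 5 seconds (sysstat package):
iostat -dxm 5

# Per-process I/O, accumulated, showing only active processes (needs root):
iotop -ao
```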

@achix
Author

achix commented Jan 24, 2019

As expected, the full backup should consume about 1.3 TB; I will let you know about that.

I mean the I/O utilization on the database and backup nodes; can you look that up?

I just checked with iotop on the database host; it is about 8 MB/s.

@achix
Author

achix commented Jan 24, 2019

Is it possible to ask you to attach a debugger to the running instance of pg_probackup (on the database host, if you are executing the "backup" command) and run the following command:

thread apply all bt

(gdb) thread apply all bt

Thread 2 (Thread 0x7f9c47e4d700 (LWP 2204)):
#0 0x00007f9c4a75ea9d in read () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000409474 in error_reader_proc ()
#2 0x00007f9c4a758064 in start_thread (arg=0x7f9c47e4d700) at pthread_create.c:309
#3 0x00007f9c4a27262d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f9c4adbe700 (LWP 2202)):
#0 0x00007f9c4a75ea9d in read () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000409eba in fio_read_all ()
#2 0x000000000040c404 in fio_communicate ()
#3 0x0000000000409b35 in remote_execute ()
#4 0x00000000004045c0 in main ()
(gdb)

@achix
Author

achix commented Jan 24, 2019

For the time being, -j is not such a major issue. However, archive_command is a must.

Also, I got a huge rush of other tasks; talk to you later, ppl!

Thanks!

@gsmolk
Contributor

gsmolk commented Jan 24, 2019

The archive-push issue will be resolved soon.
Thank you very much for your efforts!

@knizhnik
Contributor

Sorry, but it looks like you have attached the debugger to the pg_probackup instance on the backup host (the host from which you launched the remote backup).
When executing the "backup" command, most of the activity happens on the database host (where the database files are located). There should also be a process named pg_probackup with an "--agent" option in its command line.

Can you please attach the debugger to this process and get stack traces of its threads?
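The steps above can be sketched as follows (the pgrep pattern is an assumption; adjust it to the actual binary path and arguments shown by ps):

```shell
# On the database host: find the pg_probackup agent process and dump
# all thread stacks non-interactively.
pid=$(pgrep -f 'pg_probackup.*--agent')
gdb -p "$pid" -batch -ex 'thread apply all bt'
```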

@achix
Author

achix commented Jan 24, 2019

Sorry, but it looks like you have attached the debugger to the pg_probackup instance on the backup host (the host from which you launched the remote backup).
When executing the "backup" command, most of the activity happens on the database host (where the database files are located). There should also be a process named pg_probackup with an "--agent" option in its command line.

Can you please attach the debugger to this process and get stack traces of its threads?

(gdb) thread apply all bt

Thread 3 (Thread 0x7f930adb9700 (LWP 127313)):
#0 0x00007f930d1d7893 in select () at ../sysdeps/unix/syscall-template.S:81
#1 0x00000000004216f2 in CopyStreamPoll (stop_socket=-1, timeout_ms=, conn=0x2126670) at src/receivelog.c:937
#2 CopyStreamReceive (conn=0x2126670, timeout=, stop_socket=-1, buffer=0x7f930adb89a8) at src/receivelog.c:987
#3 0x0000000000422122 in HandleCopyStream (stoppos=, stream=, conn=)
at src/receivelog.c:838
#4 ReceiveXlogStream (conn=0x6, stream=0x7f930adb8ef0) at src/receivelog.c:611
#5 0x000000000040e835 in StreamLog (arg=0x63fda0 <stream_thread_arg>) at src/backup.c:2788
#6 0x00007f930d6c4064 in start_thread (arg=0x7f930adb9700) at pthread_create.c:309
#7 0x00007f930d1de62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f930a5b8700 (LWP 127314)):
#0 0x00007f930d6caa3d in write () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000409e4a in fio_write_all (fd=1, buf=buf@entry=0x7f930a5af930, size=size@entry=8) at src/utils/file.c:81
#2 0x000000000040ad75 in fio_write (fd=, buf=0x7f930a5af970, size=8200) at src/utils/file.c:458
#3 0x0000000000414145 in compress_and_backup_page (file=0x1, file@entry=0x285fb40, blknum=173734192, in=0x8,
in@entry=0x7f92fc0008c0, out=0xffffffffffffffff, out@entry=0x1, crc=0x641180 <fio_write_mutex>, crc@entry=0x285fb68,
page_state=page_state@entry=0, page=0x7f930a5b5a30 "\301\001", calg=NOT_DEFINED_COMPRESS, clevel=1) at src/data.c:495
#4 0x00000000004144b3 in backup_data_file (arguments=0x2125430,
to_path=0x7f930a5b7b20 "/var/lib/pgbackup/pgprobackups/backups/testdb/PLTUFH/database/pg_tblspc/17747/PG_10_201707211/17748/185229012", file=0x285fb40, prev_backup_start_lsn=0, backup_mode=BACKUP_MODE_FULL, calg=NOT_DEFINED_COMPRESS, clevel=1)
at src/data.c:612
#5 0x000000000040e688 in backup_files (arg=0x2125430) at src/backup.c:2301
#6 0x00007f930d6c4064 in start_thread (arg=0x7f930a5b8700) at pthread_create.c:309
#7 0x00007f930d1de62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f930dd2a700 (LWP 127308)):
#0 0x00007f930d6c549b in pthread_join (threadid=140269510690560, thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1 0x000000000040f423 in do_backup_instance () at src/backup.c:762
#2 0x0000000000410e3e in do_backup (start_time=1548318797) at src/backup.c:983
#3 0x00000000004043ef in main (argc=173771216, argv=0x20fbb80) at src/pg_probackup.c:525
(gdb)

@knizhnik
Contributor

Did you specify the -j 8 option?
It looks like just one backup thread is working...

@achix
Author

achix commented Jan 24, 2019

Did you specify the -j 8 option?
It looks like just one backup thread is working...

In this run I did not specify -j. But when I ran with -j 8, I saw no difference in the number of procs running.

@knizhnik
Contributor

The number of processes will be the same, but there should be several backup_files threads in one process (unlike Postgres itself, pg_probackup is multithreaded).

@achix
Author

achix commented Jan 24, 2019

Wow, thanks! Still, I see 3 threads although I did not specify -j.

@knizhnik
Contributor

Except for the main thread, there is one thread for streaming WAL and one or more threads for copying data files. It would be nice if you could measure the speed with larger numbers of threads and produce a pg_probackup profile, or at least stack traces of all threads.

@achix
Author

achix commented Jan 24, 2019

Thank you. I will let this first full backup run, and then try some DELTA/PAGE with -j 4.

@achix
Author

achix commented Jan 25, 2019

Hello,
after almost 14 hrs 26 mins and 378 GB, the backup aborted with:
ERROR: src/utils/file.c:863: proceeds 1366 bytes instead of 8200: No such file or directory

As I told you, this is a very unreliable line. It is supposed to be between two Swiss cloud providers in Switzerland, but still it sucks.

So, we can't live without resume.

Thank you for all the effort you have put into this, and I hope to come back to using pg_probackup!

@gsmolk
Contributor

gsmolk commented Jan 25, 2019

Well, resume is not hard to implement, so I think it will be in 2.0.27.

@achix
Author

achix commented Jan 26, 2019

Thanks.

@achix
Author

achix commented Feb 7, 2019

Hello,
I am just writing a blog article and have included pg_probackup as well. Is there any news on the archive-push configuration/params to the remote server for the archive_command?
Also, any news on resuming a failed backup?

I hope I get some news on that, and I also hope you'll enjoy the article.

@gsmolk
Contributor

gsmolk commented Feb 8, 2019

Hello!
Please try the remote_pull branch.
For an archive-push setup, just add --remote-proto=ssh --remote-host=host --ssh-options="-l remote_username" to the archive-push command line.
Resume is still at the WIP stage.
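Putting the advice above together, an archive_command on the database host might look like the following sketch. The host name and paths are placeholders, and the exact flag set should be verified against your pg_probackup version; only the three --remote/--ssh options come from the comment above:

```ini
# postgresql.conf on the database host (sketch with placeholder paths):
archive_mode = on
archive_command = '/var/lib/pgsql/pg_probackup/pg_probackup archive-push -B /var/lib/pgbackup/pgprobackups --instance testdb --wal-file-path=%p --wal-file-name=%f --remote-proto=ssh --remote-host=backup-host --ssh-options="-l postgres"'
```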

@achix
Author

achix commented Feb 8, 2019 via email

@gsmolk
Contributor

gsmolk commented Feb 8, 2019

remote_ssh is obsolete; use only remote_pull.

@achix
Author

achix commented Feb 8, 2019 via email

@achix
Author

achix commented Feb 8, 2019

OK, checked out remote_pull:
./pg_probackup/pg_probackup add-instance -B /var/lib/pgbackup/pgprobackups/ -D /var/lib/pgsql/data --remote-proto=ssh --remote-host=xx.xx.xx.xx --remote-path=/var/lib/pgsql/pg_probackup/. --instance testdb

Worked.

postgres@smadb2cs:~$ cat /var/lib/pgbackup/pgprobackups/backups/testdb/pg_probackup.conf

# Backup instance information

pgdata = /var/lib/pgsql/data
system-identifier = 6583971847287101381

# Remote access parameters

remote-proto = ssh
remote-host = xx.xx.xx.xx
remote-path = /var/lib/pgsql/pg_probackup/.

but

./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ -b FULL --instance testdb --stream
INFO: Backup start, pg_probackup version: 2.0.26, backup ID: PMM0W4, backup mode: full, instance: testdb, stream: true, remote true
ERROR: could not connect to database postgres: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

???

In the pgsql logs I see no access to anything...

@gsmolk
Contributor

gsmolk commented Feb 9, 2019

You need to provide parameters for the libpq connection:
Connection options:
-U, --username=USERNAME user name to connect as (default: current local user)
-d, --dbname=DBNAME database to connect (default: username)
-h, --host=HOSTNAME database server host or socket directory (default: 'local socket')
-p, --port=PORT database server port (default: 5432)
-w, --no-password never prompt for password
-W, --password force password prompt
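Combined with the options above, a full invocation might look like this sketch (the host address, user, and database are placeholders):

```shell
./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ \
    --instance testdb -b FULL --stream \
    -h db-host -p 5432 -U postgres -d postgres
```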

@achix
Author

achix commented Feb 9, 2019 via email

@achix
Author

achix commented Feb 9, 2019

I tried with all the libpq params (password via .pgpass, verified that it works):

./pg_probackup/pg_probackup backup -B /var/lib/pgbackup/pgprobackups/ --instance testdb -b FULL -h xx.xx.xx.xx -d postgres -U postgres -p 5432
INFO: Backup start, pg_probackup version: 2.0.26, backup ID: PMO484, backup mode: full, instance: testdb, stream: false, remote true
ERROR: could not open file "/var/lib/pgsql/data/global/pg_control" for reading: No such file or directory

Same with --stream.

@achix
Author

achix commented Feb 16, 2019

Bump, any news on this? Any way to do remote backups? This used to semi-work in the old remote_ssh branch.

@gsmolk
Contributor

gsmolk commented Feb 16, 2019

It is in development and very unstable.

@achix
Author

achix commented Feb 16, 2019 via email

@gsmolk
Contributor

gsmolk commented Feb 16, 2019

We appreciate your efforts, thank you!
You can check out the remote_pull branch, if you want.
Archiving now works; compressed archiving does not yet, but we will get there.

@gsmolk
Contributor

gsmolk commented Apr 16, 2019

I think you will find it interesting that remote backup via ssh has been merged into master and will be released this week.

@achix
Author

achix commented Apr 16, 2019 via email

@Rakshitha-BR

Are remote capabilities enabled now?

These are the rpms I have:

pg_probackup-10-2.0.26-1.d8553c06afff82a3.x86_64
pg_probackup-repo-2.0.26-1.noarch

@gsmolk
Contributor

gsmolk commented May 2, 2019

Are remote capabilities enabled now?

These are the rpms I have:

pg_probackup-10-2.0.26-1.d8553c06afff82a3.x86_64
pg_probackup-repo-2.0.26-1.noarch

Yes, they are available in 2.1.1.
You can update the repo package and then update the installed binary package. Example:

yum install http://repo.postgrespro.ru/pg_probackup/keys/pg_probackup-repo-centos.noarch.rpm
yum update pg_probackup-10

@Burus
Contributor

Burus commented Jun 2, 2022

The problem has been solved by the package updates. This task is too old.

@Burus closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 2, 2022