Question: Effects of additional Sync postgres messages vs jdbc #1213

Open
manswami23 opened this issue Feb 10, 2025 · 1 comment

@manswami23

Hi all,

I'm new to Rust, coming from a Java background. I've been comparing tokio-postgres with the PostgreSQL JDBC driver and was curious about some of the differences I saw. I'm splitting this off from #1212 since the questions aren't really related.

One difference I noticed is that when doing a "batch"-y operation (like pipelining requests), tokio-postgres sends a Sync message at the end of each query in the pipeline, whereas JDBC sends a single Sync at the end of the batch. As a result, Postgres flushes each response message back to the Rust client separately.

Here's an example packet capture for the tokio-postgres pipelined query, where the queries' Binds and Executes are bundled but the responses are split across packets:

[Two images: packet captures of the tokio-postgres pipelined request and its responses]

Here's an example packet capture for the JDBC batch query and response, both bundled:

[Two images: packet captures of the JDBC batch request and its response]

I'm wondering: is it possible to send only one Sync message for a sequence of pipelined queries? If not, what exactly does Sync do, and are there any performance concerns I should keep in mind with the additional Syncs (besides the extra network overhead)?

Thanks!

@sfackler
Owner

sfackler commented Feb 10, 2025

The full details of Postgres's network protocol are described here: https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY.

At completion of each series of extended-query messages, the frontend should issue a Sync message. This parameterless message causes the backend to close the current transaction if it's not inside a BEGIN/COMMIT transaction block (“close” meaning to commit if no error, or roll back if error). Then a ReadyForQuery response is issued. The purpose of Sync is to provide a resynchronization point for error recovery. When an error is detected while processing any extended-query message, the backend issues ErrorResponse, then reads and discards messages until a Sync is reached, then issues ReadyForQuery and returns to normal message processing. (But note that no skipping occurs if an error is detected while processing Sync — this ensures that there is one and only one ReadyForQuery sent for each Sync.)

I'm not 100% sure what the perf difference would be, but the big behavior difference here is that these requests are pipelined, not batched - they all succeed or fail independently. In the batch workflow, the first error would cause the remainder of the commands to be skipped.
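To make the behavioral difference concrete, here is a minimal sketch of the skip-until-Sync rule from the quoted documentation. It is a toy model, not tokio-postgres or Postgres code: `Msg`, `Outcome`, `run_backend`, `pipelined`, and `batched` are hypothetical names invented for illustration, each `Exec` stands in for a Bind + Execute pair, and `Skipped` is only a bookkeeping marker (a real backend discards those messages without sending anything). Transactions are not modeled.

```rust
/// Simplified frontend messages (hypothetical model, not a real driver type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Msg {
    /// Stands in for one Bind + Execute pair; `fails` marks a statement that errors.
    Exec { id: u32, fails: bool },
    Sync,
}

/// What the modeled backend does with each message.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Ok(u32),
    Error(u32),
    /// Bookkeeping only: the real backend silently discards skipped messages.
    Skipped(u32),
    ReadyForQuery,
}

/// Apply the documented rule: after an error, discard messages until a Sync,
/// then emit exactly one ReadyForQuery for that Sync.
fn run_backend(msgs: &[Msg]) -> Vec<Outcome> {
    let mut out = Vec::new();
    let mut skipping = false;
    for m in msgs {
        match *m {
            Msg::Exec { id, .. } if skipping => out.push(Outcome::Skipped(id)),
            Msg::Exec { id, fails: true } => {
                out.push(Outcome::Error(id));
                skipping = true;
            }
            Msg::Exec { id, fails: false } => out.push(Outcome::Ok(id)),
            Msg::Sync => {
                skipping = false;
                out.push(Outcome::ReadyForQuery);
            }
        }
    }
    out
}

/// Pipelined style: a Sync after every statement.
fn pipelined(stmts: &[(u32, bool)]) -> Vec<Msg> {
    let mut msgs = Vec::new();
    for &(id, fails) in stmts {
        msgs.push(Msg::Exec { id, fails });
        msgs.push(Msg::Sync);
    }
    msgs
}

/// Batched style: a single Sync after the last statement.
fn batched(stmts: &[(u32, bool)]) -> Vec<Msg> {
    let mut msgs: Vec<Msg> = stmts
        .iter()
        .map(|&(id, fails)| Msg::Exec { id, fails })
        .collect();
    msgs.push(Msg::Sync);
    msgs
}
```

Feeding three statements where the second one fails, `run_backend(&pipelined(..))` still runs statement 3, while `run_backend(&batched(..))` marks it as skipped and produces a single ReadyForQuery at the end.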

A Sync message is 5 bytes on the wire, so you're not realistically going to notice any overhead on the raw networking side of things.
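As a rough illustration of that 5-byte figure, the sketch below encodes a Sync message as described in the protocol's message formats: a type byte `'S'` followed by a big-endian Int32 length of 4 (the length counts itself but not the type byte). The helper names `encode_sync` and `extra_sync_bytes` are made up for this example and are not part of any driver API.

```rust
/// Encode a frontend Sync message: type byte 'S' plus a big-endian Int32
/// length field. The length counts its own 4 bytes and no payload.
fn encode_sync() -> [u8; 5] {
    let mut buf = [0u8; 5];
    buf[0] = b'S';
    buf[1..].copy_from_slice(&4i32.to_be_bytes());
    buf
}

/// Extra request bytes from sending one Sync per query instead of a single
/// Sync for the whole sequence (hypothetical helper for back-of-envelope math).
fn extra_sync_bytes(queries: usize) -> usize {
    queries.saturating_sub(1) * encode_sync().len()
}
```

Under this model, pipelining 100 queries with a Sync each costs `extra_sync_bytes(100)` = 495 more request bytes than a single trailing Sync. This counts only the frontend's Sync messages; each Sync also triggers one ReadyForQuery response from the backend.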
