Waiting for asyncio.StreamWriter.drain() twice in parallel raises an AssertionError when the transport stopped writing #74116
While trying to break some code in a project using asyncio, I found that under certain circumstances, asyncio.StreamWriter.drain raises an AssertionError.
Task exception was never retrieved
future: <Task finished coro=<flooding() done, defined at client.py:10> exception=AssertionError()>
Traceback (most recent call last):
File "client.py", line 12, in flooding
await writer.drain()
File "/usr/local/lib/python3.6/asyncio/streams.py", line 333, in drain
yield from self._protocol._drain_helper()
File "/usr/local/lib/python3.6/asyncio/streams.py", line 208, in _drain_helper
assert waiter is None or waiter.cancelled()
AssertionError

I don't know much about how the drain function works or how networking is handled by the OS, but I'm assuming that I've reached some OS limitation which triggers this AssertionError. I'm not sure how I'm supposed to handle that. Am I supposed to add some throttling because I should not send too much data concurrently? Is this considered a bug? Any explanations are welcome. Here are some minimal client and server examples if you want to try to reproduce it:
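A minimal sketch of a client along these lines (hypothetical, not the original attachment; it assumes a server on 127.0.0.1:8888 that accepts connections but never reads, so the transport's write buffer fills up and writing pauses):

import asyncio

async def flooding(writer):
    # Write faster than the peer reads: once the transport's buffer
    # passes the high water mark, writing is paused and drain() waits.
    while True:
        writer.write(b'*' * 65536)
        await writer.drain()

async def main():
    reader, writer = await asyncio.open_connection('127.0.0.1', 8888)
    # Several tasks awaiting drain() on the same writer in parallel
    # is what trips the assertion.
    await asyncio.gather(*(flooding(writer) for _ in range(10)))

loop = asyncio.get_event_loop()
loop.run_until_complete(main())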
Also, I don't think this is limited to Python 3.6; I've found this old issue on aaugustin's websockets repo which looks the same: python-websockets/websockets#16 |
Modified client and server to be able to reproduce the issue on a LAN. |
The bug occurs when the transport pauses writing and the client code calls drain() multiple times in parallel. |
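For reference, the assertion lives in FlowControlMixin._drain_helper in Lib/asyncio/streams.py; in 3.6 it looks roughly like this (abridged). It only tracks a single waiter, so a second concurrent drain() call finds _drain_waiter already set:

@coroutine
def _drain_helper(self):
    if self._connection_lost:
        raise ConnectionResetError('Connection lost')
    if not self._paused:
        return
    waiter = self._drain_waiter
    # A second concurrent caller reaches this line while a live waiter
    # is still in place, so the assert fails:
    assert waiter is None or waiter.cancelled()
    waiter = self._loop.create_future()
    self._drain_waiter = waiter
    yield from waiter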
Notes:
- Metathink told me that he got the bug on much more complex code using websockets. Thank you Metathink for isolating the bug to a few lines of Python code with simpler asyncio functions! |
Proof-of-concept of a patched drain() that supports multiple waiters. I don't see any strong reason not to allow two coroutines to wait on drain() in parallel. I'm too lazy to write a full patch with unit tests, doc changes, etc., so I started with a PoC to discuss the solution (see the sketch below). |
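A minimal sketch of the multi-waiter idea (not the actual attached patch): keep a list of futures instead of the single _drain_waiter, and wake them all in resume_writing():

import asyncio

class MultiWaiterFlowControl:
    """Sketch: like asyncio's FlowControlMixin, but with a list of waiters."""

    def __init__(self, loop):
        self._loop = loop
        self._paused = False
        self._connection_lost = False
        self._drain_waiters = []  # replaces the single _drain_waiter

    def pause_writing(self):
        self._paused = True

    def resume_writing(self):
        self._paused = False
        # Wake every coroutine currently blocked in drain().
        for waiter in self._drain_waiters:
            if not waiter.done():
                waiter.set_result(None)
        self._drain_waiters.clear()

    async def _drain_helper(self):
        if self._connection_lost:
            raise ConnectionResetError('Connection lost')
        if not self._paused:
            return
        waiter = self._loop.create_future()
        self._drain_waiters.append(waiter)
        try:
            await waiter
        finally:
            # Drop the waiter if it is still registered (e.g. cancelled).
            if waiter in self._drain_waiters:
                self._drain_waiters.remove(waiter)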
For context, websockets calls drain() after each write. If the output buffer fills up, calling API coroutines that write to the websocket connection becomes slow, and hopefully the backpressure will propagate (in a correctly implemented application). This is a straightforward application of the only use case described in the documentation.

I would find it annoying to have to serialize calls to drain() myself. It doesn't feel like something the "application" should care about. (websockets is the application from asyncio's perspective.)

I'm wondering if it could be a problem if a bunch of coroutines were waiting on drain() and got released simultaneously. I don't think it would be a problem for websockets. Since my use case seems typical, there's a good chance this also applies to other apps. So I'm in favor of simply allowing an arbitrary number of coroutines to wait on drain() in parallel, if that's feasible. |
Got hit by this too, maaaaany times, as I'm working with 3G devices (slow write speeds). As for drain(), I'd say it should work like a fence/barrier: to let you know that the bytes in the buffer up to when the call was issued have been successfully written to the socket. I'll see what I can cook up. Cheers |
drain() returns when the write buffer reaches the low water mark, not when it's empty, so you don't have a guarantee that your bytes were written to the socket (see cpython/Lib/asyncio/protocols.py, lines 36 to 40 at 6f0eb93).
The high water mark defaults to 64 kB and the low water mark to a quarter of that, 16 kB (see cpython/Lib/asyncio/transports.py, line 290 at 6f0eb93).
With websockets, the recommended way to ensure your message was received is to follow yield from ws.send(...) with a ping and wait for its response. Given that TCP guarantees ordering, the ping response can only be received after the previous message was fully sent and received. Of course the ping can fail even though the message was received; that's the classical at-most-once vs. at-least-once question.

The technique you suggest requires setting the low and high water marks to 0. I'm not sure this is the best way to achieve your goals, since you still don't control the OS buffers. |
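For illustration, a sketch of that zero-water-mark technique (hypothetical host/port and function name): with both limits at 0, pause_writing() fires as soon as any byte is buffered and resume_writing() only once the buffer is empty, so drain() returns only when asyncio's buffer has been flushed to the OS.

import asyncio

async def send_all(host, port, payload):
    reader, writer = await asyncio.open_connection(host, port)
    # Make drain() wait for an empty buffer instead of the low water mark.
    writer.transport.set_write_buffer_limits(high=0, low=0)
    writer.write(payload)
    await writer.drain()  # the kernel's own buffers are still out of our hands
    writer.close()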
Triggered almost the same error. The documentation did not say that .drain() can't be called simultaneously. Minimal proof:

===================================
import asyncio

async def do_nothing(client_reader, client_writer):
    await asyncio.sleep(10000)

mb = b'*' * (4096 * 4)

async def write_cmd(writer):
    writer.write(mb)
    await writer.drain()

async def amain():
    srv = await asyncio.start_unix_server(do_nothing, b'\x00qwe')
    (reader, writer) = await asyncio.open_unix_connection(b'\x00qwe')
    await asyncio.gather(*(write_cmd(writer) for i in range(200)))

loop = asyncio.get_event_loop()
loop.run_until_complete(amain())
=================================== |
I worked around this bug in websockets by serializing access to drain(): python-websockets/websockets@198b715. I suspect this is inefficient, but it beats crashing. |
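The idea, roughly (a sketch of the workaround, not the actual commit; the wrapper class and its names are hypothetical):

import asyncio

class SerializedWriter:
    """Sketch of the workaround: only one coroutine may wait in drain()."""

    def __init__(self, writer):
        self._writer = writer
        self._drain_lock = asyncio.Lock()

    def write(self, data):
        self._writer.write(data)

    async def drain(self):
        # Serialize waiters so FlowControlMixin only ever sees one.
        async with self._drain_lock:
            await self._writer.drain()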
Hi, I'm a new contributor: is there any consensus on whether something needs to be done, and what? If so, I can try to take the patch forward. |
Fixed by #94705 and backported to 3.11 and 3.10. |