Potential SegFault with multithreading garbage collection. #101975
Comments
I believe that on the main branch GC now runs only at the eval breaker? If so, one possible approach here would be to update stacktop at the eval breaker, though I'm not sure how expensive that would be.
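One way to picture this suggestion is the toy Python model below. It is only a sketch, not CPython code: `ToyFrame`, `run`, and the `"check"` op (standing in for the eval breaker) are hypothetical names invented for illustration, and it assumes collection can start only at that check.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame; not CPython's actual structure."""

    def __init__(self, nslots):
        self.slots = [None] * nslots  # evaluation stack storage
        self.stacktop = 0             # saved copy of the stack pointer


def run(frame, ops, collect):
    """Execute toy ops, keeping the live stack pointer in a local variable.

    The saved frame.stacktop is refreshed only at a "check" op, which stands
    in for the eval breaker, immediately before collect(frame) is called.
    """
    stack_pointer = 0
    for op, *args in ops:
        if op == "push":
            frame.slots[stack_pointer] = args[0]
            stack_pointer += 1
        elif op == "pop":
            stack_pointer -= 1
            frame.slots[stack_pointer] = None
        elif op == "check":
            frame.stacktop = stack_pointer  # write back before collecting
            collect(frame)


# Record what a collector would see when it traverses slots [0, stacktop).
seen = []
run(
    ToyFrame(4),
    [("push", "a"), ("push", "b"), ("pop",), ("check",)],
    lambda f: seen.append(f.slots[:f.stacktop]),
)
```

In this sketch the write-back is a single assignment per check, so the collector only ever sees live slots. What it would cost in the real interpreter is the open question raised above.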
Some extra context here. For my specific case, adding Adding
Okay, I think I finally got this. This problem only happens when the profile/trace function is enabled, and is related to garbage collection. I believe it's because in This is done in the very similar code in @markshannon could you take a look at this? You wrote the original code, and I think it would be pretty obvious to you whether this fix is correct.
Thanks for taking the time to find and fix this.
…H-102803) (GH-102807) Authored-by: gaogaotiantian <[email protected]>
Is there more work left here, or can we close this?
Pretty sure I just stumbled across this as well. Python 3.10.8. Fedora 35. What version was this fixed in? Edited 2024-01-14.
For now, I can only occasionally observe the segfault on GitHub Actions. This issue is not easy to reproduce, but I tried to understand its cause.

The direct cause would be in `deduce_unreachable` in `gcmodule.c`. In that function, `gc` tries to find cycles by traversing objects, including frames, which use `_PyFrame_Traverse` for all their objects. `_PyFrame_Traverse` uses `frame->stacktop` as the index range for all the locals and temporary data on the stack (not sure if that's on purpose). However, `frame->stacktop` is not updated in real time, which means the objects it traverses might not be valid.

For example, in the `FOR_ITER` dispatch code, there's a `Py_DECREF(iter); STACK_SHRINK(1);` when the iterator is exhausted. However, `STACK_SHRINK` only decrements `stack_pointer`, not `frame->stacktop`. At this point, the `iter` that was just freed will be traversed during garbage collection.

There might be something I missed because it's not trivial to reproduce this, but I got a demo that could reproduce this issue occasionally.
It might have something to do with the profile function; I think I can only reproduce this with it. You need to enable `--with-address-sanitizer` to find an error of `ERROR: AddressSanitizer: heap-use-after-free on address`. Normally it is in `Py_TYPE Include/object.h:135`, where the code dereferences `ob`, which could already be freed.

The memory it points to is often benign, so I'm not able to reliably generate segfaults, but in theory this is a memory violation.
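To illustrate the mechanism described above, here is a toy Python model. It is not CPython code and not the reproducer mentioned in this report; `ToyFrame`, `FREED`, `traverse`, and `exhaust_iterator` are hypothetical names made up for the sketch, which assumes only what the report states: traversal is bounded by a saved `stacktop`, while the interpreter loop shrinks a local stack pointer.

```python
FREED = object()  # sentinel marking a slot whose object was already released


class ToyFrame:
    """Hypothetical stand-in for an interpreter frame, not CPython's struct."""

    def __init__(self, nslots):
        self.slots = [None] * nslots  # locals plus evaluation stack
        self.stacktop = 0             # saved top of stack, not updated live


def traverse(frame):
    """Visit slots [0, stacktop), as a traversal bounded by stacktop would."""
    return frame.slots[:frame.stacktop]


def exhaust_iterator(frame):
    # Push an iterator and save the stack pointer into the frame.
    stack_pointer = 0
    frame.slots[stack_pointer] = "iterator"
    stack_pointer += 1
    frame.stacktop = stack_pointer

    # Iterator exhausted: release it, then shrink only the local pointer.
    frame.slots[stack_pointer - 1] = FREED
    stack_pointer -= 1  # frame.stacktop still holds the old value

    # A collection starting now walks the frame using the stale stacktop.
    return traverse(frame)


visited = exhaust_iterator(ToyFrame(4))
```

Here `visited` contains the `FREED` sentinel, which stands for the use-after-free: the traversal reaches a slot the loop has already released. In C the freed memory often still looks like a valid object, which would be consistent with the crash being intermittent.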
Python Version: cpython/main
OS Version: Ubuntu 20 on WSL