GH-133136: Revise QSBR to reduce excess memory held #135473
Draft
This is a refinement of GH-135107. Additional changes:

- `_Py_qsbr_advance_with_size()` to reduce duplicated code

With these changes, the memory held by QSBR is typically freed a bit more quickly and the process RSS stays a bit smaller.
Regarding the changes to advance and processing, GH-135107 has the following minor issues: if the memory threshold is exceeded when a new item is added by `free_delayed()`, we immediately set `memory_deferred = 0` and process. It is very unlikely that the goal has been reached for the newly added item. If that's a big chunk of memory, we would have to wait until the next process in order to actually free it. This PR tries to avoid that by storing the `seq` (local read sequence) as it was at last process time. If that hasn't changed (this thread hasn't entered a quiescent state) then we wait before processing. This at least gives a chance that other readers will catch up and the process can actually free things.

This PR also changes how often we can defer the advance of the global write sequence. Previously, we deferred it up to 10 times. However, I think there is not much benefit to advancing it unless we are nearly ready to process. So `should_advance_qsbr()` now checks whether it seems time to process, and `_Py_qsbr_should_process()` checks whether the local read sequence has been updated. That means the write sequence has advanced (it's time to process) and the read sequence for this thread has also advanced. This doesn't tell us that the other threads have advanced their read sequence but we don't want to pay the cost of checking that (would require "poll").

pyperformance memory usage results
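To make the advance and process checks described above easier to follow, here is a minimal, single-threaded sketch. It is not the code in this PR: every `toy_` name, the struct layouts, the 1 MiB threshold, and the choice to reset the counter when the write sequence advances are assumptions made for illustration. It only models the two decisions: advance the write sequence when enough memory is deferred, and process only once this thread's local read sequence has changed since the last process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model, not the CPython QSBR structures. */
#define TOY_MEMORY_LIMIT ((size_t)1024 * 1024)  /* assumed threshold */
#define TOY_SEQ_INCR 2                          /* assumed step size */

typedef struct {
    uint64_t wr_seq;            /* global write sequence */
} toy_shared;

typedef struct {
    uint64_t seq;               /* local read sequence of this thread */
    uint64_t seq_last_process;  /* value of seq at the last process */
    size_t memory_deferred;     /* bytes waiting to be freed */
} toy_thread;

/* Advance only when it seems nearly time to process. */
bool toy_should_advance(const toy_thread *t)
{
    return t->memory_deferred >= TOY_MEMORY_LIMIT;
}

/* Process only if this thread has passed a quiescent state
   (its local read sequence changed) since the last process. */
bool toy_should_process(const toy_thread *t)
{
    return t->seq != t->seq_last_process;
}

/* Record a deferred free; advance the write sequence if warranted. */
void toy_defer_free(toy_shared *g, toy_thread *t, size_t size)
{
    t->memory_deferred += size;
    if (toy_should_advance(t)) {
        g->wr_seq += TOY_SEQ_INCR;
        t->memory_deferred = 0;  /* assumption: reset on advance */
    }
}

/* Quiescent state: the thread observes the current write sequence. */
void toy_quiescent_state(const toy_shared *g, toy_thread *t)
{
    t->seq = g->wr_seq;
}

/* Remember the read sequence seen at process time. */
void toy_process(toy_thread *t)
{
    assert(toy_should_process(t));
    t->seq_last_process = t->seq;
    /* A real implementation would free the items whose goal is reached. */
}
```

In this model, a large deferred free advances the write sequence right away, but `toy_should_process()` stays false until the thread calls `toy_quiescent_state()`. As in the description, the check says nothing about whether other threads have caught up.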