Skip to content

PYTHON-5369 - Re-raise socket.timeout errors if the deadline has alre… #2326

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

NoahStapp
Copy link
Contributor

…ady been execeeded

@ShaneHarvey
Copy link
Member

ShaneHarvey commented May 1, 2025

On second thought, I intentionally added the non-blocking read to avoid spurious reconnects on SDAM connections when using the driver in FaaS services that pause/resume the process, see https://jira.mongodb.org/browse/PYTHON-3186.

I'm not sure we can accept this change. Although, if the test_sigstop_sigcont regression test still passes, perhaps we can?

@sleepyStick
Copy link
Contributor

On second thought, I intentionally added the non-blocking read to avoid spurious reconnects on SDAM connections when using the driver in FaaS services that pause/resume the process, see jira.mongodb.org/browse/PYTHON-3186.

I'm not sure we can accept this change. Although, if the test_sigstop_sigcont regression test still passes, perhaps we can?

Looks like that regression test isn't passing...
https://spruce.mongodb.com/task/mongo_python_driver_test_macos_arm64_test_standard_v6.0_python3.13_sync_auth_ssl_sharded_cluster_patch_0ec57781d14160531887f61e4edf083acbef7686_6813d280e5d2f8000770ece5_25_05_01_19_58_57?execution=0&sortBy=STATUS&sortDir=ASC

@blink1073
Copy link
Member

Can we not use the FaaS detection logic and preserve the current behavior in that case?

@NoahStapp
Copy link
Contributor Author

Since test_sigstop_sigcont was added specifically for the FaaS fix, can we limit it to only run on FaaS test variants?

@blink1073
Copy link
Member

Since test_sigstop_sigcont was added specifically for the FaaS fix, can we limit it to only run on FaaS test variants?

My vote is yes

sleepyStick
sleepyStick previously approved these changes May 5, 2025
@@ -1967,6 +1967,7 @@ def test_srv_max_hosts_kwarg(self):
"loadBalanced clients do not run SDAM",
)
@unittest.skipIf(sys.platform == "win32", "Windows does not support SIGSTOP")
@unittest.skipUnless(_is_faas(), "Non-FaaS environments raise timeouts faster")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't skip this test like this because we don't run the test suite on faas platforms. We want this test to run locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "run locally", what exactly do you mean? As in we can't do this change because we need this test to run in our regular test suite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps instead we can force the metadata to be faas through monkeypatching

@@ -117,7 +117,7 @@
WriteConcernError,
)
from pymongo.monitoring import ServerHeartbeatListener, ServerHeartbeatStartedEvent
from pymongo.pool_options import _MAX_METADATA_SIZE, _METADATA, ENV_VAR_K8S, PoolOptions
from pymongo.pool_options import _MAX_METADATA_SIZE, _METADATA, ENV_VAR_K8S, PoolOptions, _is_faas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove this import

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we determine that this change actually fixes the test flakes?

@NoahStapp
Copy link
Contributor Author

Did we determine that this change actually fixes the test flakes?

I'm running some patches right now to confirm, but so far it appears to fix the flakes on non-PyPy variants. PyPy still has this issue, so I propose we skip test.test_transactions_unified.TestUnifiedErrorLabelsBlockConnection.test_add_RetryableWriteError_and_UnknownTransactionCommitResult_labels_to_connection_errors on it.

@NoahStapp
Copy link
Contributor Author

Did we determine that this change actually fixes the test flakes?

I have not seen any failures of these tests flaking on sync + CPython with this fix.

or not _is_faas()
and deadline is not None
and deadline - time.monotonic() < 0
):
Copy link
Member

@ShaneHarvey ShaneHarvey May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want sigstop/sigcont to work the same regardless of what env variables are defined. What if we take an alternative approach here and replace not _is_faas() with not conn.is_sdam? IE only do the extra non-blocking read on SDAM connections.

Will that still fix the flaky tests?

@NoahStapp NoahStapp requested a review from ShaneHarvey May 8, 2025 19:22
Copy link
Member

@blink1073 blink1073 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants