-
Notifications
You must be signed in to change notification settings - Fork 443
chore(profiling): per upload config #13535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 250 ± 4 ms. The average import time from base is: 249 ± 4 ms. The import time difference between this PR and base is: 0.5 ± 0.2 ms. The difference is not statistically significant (z = 2.92). Import time breakdownThe following import paths have shrunk:
|
BenchmarksBenchmark execution time: 2025-06-03 22:48:45 Comparing candidate commit 176c61d in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 505 metrics, 3 unstable metrics. |
…-trace-py into taegyunkim/pass-upload-config
**tl;dr** 1. Use `uwsgi>=2.0.30` which has unbit/uwsgi#2726 2. When using `uwsgi<2.0.30` with `--lazy-apps`, also set `--skip-atexit`, if you see crashes when workers exit. ----------------------------------------------------- The following explanation happens on `uwsgi<2.0.30` When `uwsgi` is configured with `--lazy-apps`, master doesn't load the application, and simply `fork()` to create workers. Then, the workers load the application. This is different from when it's not set. Without `--lazy-apps`, master process first load the application then `fork()`. dd-trace-py profiler uses Python stdlib `atexit` module to export the profile when the process exits. The code run to export references global static `std::string` variables. However, when `--lazy-apps` option is set, they are destructed before when the profiler needs them to export the profile, and this leads to an undefined behavior. When `--lazy-apps` is not set the following sequence of events happens 1. uwsgi master process, initialize Python plugin, loads the application. 3. The application calls `import ddtrace.profiling.auto` and this leads all native extensions imported to the Python app. During the process, the constructors for global static variables are called and their destructors would be registered to be called by `atexit()` when the program exits. 4. The application also registers `Profiler.stop()` function to `uwsgi` python module's `atexit` attribute. 5. Then `uwsgi` calls `atexit(uwsgi_plugins_atexit)` to register each plugins atexit callbacks, for Python `uwsgi_python_atexit()` function. And the function will call `Py_Finalize()` 6. Master `fork()`s workers 7. When SIGTERM is received at the master, it sends SIGINT to its workers 8. In each worker, `uwsgi_python_atexit()` is called, invoking `Profiler.stop()` and `Py_Finalize()` is called, then destructors for global static variables are called. When `--lazy-apps` is set 1. uwsgi matser process still intializes Python interpreter, but skips loading the application 3. Then `uwsgi` calls `atexit(uwsgi_plugins_atexit)` to register each plugins atexit callbacks, for Python `uwsgi_python_atexit()` function. And the function will call `Py_Finalize()` 4. Master `fork()`s workers 5. Workers load the application. The application calls `import ddtrace.profiling.auto` and this leads all native extensions imported to the Python app. During the process, the constructors for global static variables are called and their destructors would be registered to be called by C `atexit()` when the program exits. 6. When `--lazy-apps` is set, ddtrace simply uses Python's `atexit` module to call `Profiler.stop()` when the program exits. 7. When SIGTERM is received at the master, it sends SIGINT to its workers 8. In each worker, since global destructors are registered later than `uwsgi_python_atexit()` , they're called first. 9. Then `Py_Finalize()` is called in part of `uwsgi_python_atexit()` calling `Profiler.stop()`. Then it will access those destructed global static strings leading to undefined behavior. To workaround this I have tried three different approaches. 1. #11125: Simply pass the needed `std::string` from Python side to native when exporting pprof file for test 2. #13520: Leak the strings needed for exporting profile to prevent them from destructed 3. #13535: Bundle all needed strings needed for exporting profile, and create them whenever we try to upload. The first approach only addresses issues from tests. The second approach mimics what's suggested in pytorch/pytorch#42125, but since there are a few other global static variables we have throughout the profiler, this still showed crashes from test. And we got similar results from the third approach, as we did from the second. Another way to completely work around this problem is setting `--skip-atexit` as suggested in ansible/ansible-ai-connect-service#248. When `--skip-atexit` is set, `uwsgi` uses `_exit()` instead of `exit()`, and the former doesn't call any functions registered with `atexit()` or `on_exit()`. So, the static variable destructors are never called, and the app no longer crashes. The profiler also doesn't flush the profile via `atexit()` when `--skip-atexit` is set, but I'd argue losing a single profile at the end is better than crashing the application even if the application is exiting. So we'd need to require our customers to set `--skip-atexit` when `--lazy-apps` is set. ## Checklist - [x] PR author has checked that all the criteria below are met - The PR description includes an overview of the change - The PR description articulates the motivation for the change - The change includes tests OR the PR description describes a testing strategy - The PR description notes risks associated with the change, if any - Newly-added code is easy to change - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) - The change includes or references documentation updates if necessary - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Reviewer has checked that all the criteria below are met - Title is accurate - All changes are related to the pull request's stated goal - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - Testing strategy adequately addresses listed risks - Newly-added code is easy to change - Release note makes sense to a user of the library - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
Checklist
Reviewer Checklist