fix(sampling): ensure agent based sampling is not reset after forking and on tracer.configure #13560

Merged
merged 1 commit into from
Jun 23, 2025

Conversation

mabdinur
Contributor

@mabdinur mabdinur commented Jun 2, 2025

Builds on 7fbdc9f

  • Fix: Avoids reinitializing the SpanAggregator on tracer.configure(...) and when an application forks. Instead, SpanAggregator.reset() is called. This operation re-applies global configurations, can reset the trace buffer, and recreates the trace writer, so agent-based sampling rules are not reset.

  • Clean up

    • Removes the writer parameter from SpanAggregator.__init__(...). With this change, the initialization of the global writer is an implementation detail of the SpanAggregator; there is no longer a need to supply the SpanAggregator with a writer when the global tracer is initialized.
    • Moves all implementation details of resetting the SpanAggregator from Tracer.configure(...) and Tracer._recreate(...) to SpanAggregator.reset(...).
    • Removes the initialization of the SpanAggregator from _default_span_processors_factory. With this change, the global tracer's SpanAggregator is never re-created; it is only modified when tracer.configure(...) is used.
    • Renames the DatadogSampler._service_based_samplers property to DatadogSampler._agent_based_sampler to improve clarity. These sampling rules are no longer supplied via environment variables or a programmatic API; they can only be set by the Datadog Agent.
    • Splits SpanAggregator.trace_processors into two properties, SpanAggregator.dd_processors and SpanAggregator.user_processors. SpanAggregator.user_processors is set after application startup via Tracer.configure(...), while SpanAggregator.dd_processors is internal to the ddtrace library and should only be set by ddtrace components. This separation avoids recreating all trace processors when tracer.configure() is called.
  • Adds a more descriptive release note to an unreleased fix.
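
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of "reset in place instead of re-create". It is not the ddtrace implementation: ToyAggregator, ToyAgentSampler, ToyWriter, and the reset(...) signature are hypothetical names invented for illustration. The only point it shows is that state owned by a long-lived object (here, agent-supplied sample rates) survives a reconfiguration, while the writer and buffer are replaced.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Trace = List[dict]
Processor = Callable[[Trace], Trace]


@dataclass
class ToyAgentSampler:
    """Hypothetical sampler whose rates are only ever set by the agent."""

    rates: Dict[str, float] = field(default_factory=dict)

    def update_from_agent(self, rates: Dict[str, float]) -> None:
        self.rates = dict(rates)


class ToyWriter:
    """Hypothetical stand-in for a trace writer; recreated on every reset."""


class ToyAggregator:
    """Hypothetical long-lived aggregator that is reset, never re-created."""

    def __init__(self, dd_processors: Optional[List[Processor]] = None) -> None:
        self.sampler = ToyAgentSampler()
        # Library-owned processors: set once, left alone by reset().
        self.dd_processors: List[Processor] = list(dd_processors or [])
        # User-owned processors: replaced through reset().
        self.user_processors: List[Processor] = []
        self.buffer: List[Trace] = []
        self.writer = ToyWriter()

    def reset(
        self,
        user_processors: Optional[List[Processor]] = None,
        reset_buffer: bool = True,
    ) -> None:
        if user_processors is not None:
            self.user_processors = list(user_processors)
        if reset_buffer:
            self.buffer.clear()
        # The writer is recreated (for example, after a fork).
        self.writer = ToyWriter()
        # self.sampler is deliberately untouched, so agent rates persist.


def drop_healthchecks(trace: Trace) -> Trace:
    """Example user processor: filter out spans named 'healthcheck'."""
    return [span for span in trace if span.get("name") != "healthcheck"]


aggregator = ToyAggregator()
aggregator.sampler.update_from_agent({"service:web,env:prod": 0.25})
aggregator.buffer.append([{"name": "unfinished"}])
old_writer = aggregator.writer

# Analogous to a reconfiguration or a post-fork hook in the child process.
aggregator.reset(user_processors=[drop_healthchecks])
```

Had the sketch built a new ToyAggregator() at the last step instead of calling reset(...), the rates received from the agent would have been lost until the next agent response, which is the kind of behavior this PR describes fixing.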

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

Contributor

github-actions bot commented Jun 2, 2025

CODEOWNERS have been resolved as:

releasenotes/notes/fix-agent-based-sampling-8877694a37053e51.yaml       @DataDog/apm-python
ddtrace/_trace/processor/__init__.py                                    @DataDog/apm-sdk-api-python
ddtrace/_trace/sampler.py                                               @DataDog/apm-sdk-api-python
ddtrace/_trace/tracer.py                                                @DataDog/apm-sdk-api-python
ddtrace/internal/ci_visibility/recorder.py                              @DataDog/ci-app-libraries
ddtrace/internal/writer/writer.py                                       @DataDog/apm-core-python
ddtrace/opentracer/tracer.py                                            @DataDog/apm-sdk-api-python
tests/ci_visibility/test_ci_visibility.py                               @DataDog/ci-app-libraries
tests/integration/test_priority_sampling.py                             @DataDog/apm-sdk-api-python
tests/tracer/test_processors.py                                         @DataDog/apm-sdk-api-python
tests/tracer/test_sampler.py                                            @DataDog/apm-sdk-api-python
tests/tracer/test_writer.py                                             @DataDog/apm-sdk-api-python
tests/utils.py                                                          @DataDog/python-guild

Contributor

github-actions bot commented Jun 2, 2025

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 279 ± 3 ms.

The average import time from base is: 283 ± 3 ms.

The import time difference between this PR and base is: -3.6 ± 0.1 ms.

Import time breakdown

The following import paths have grown:

ddtrace.auto 0.224 ms (0.08%)
ddtrace 0.162 ms (0.06%)
ddtrace.trace 0.162 ms (0.06%)
ddtrace._trace.tracer 0.104 ms (0.04%)
ddtrace.internal.schema.processor 0.104 ms (0.04%)
ddtrace._trace.filters 0.059 ms (0.02%)
ddtrace._trace.processor 0.059 ms (0.02%)
ddtrace._trace.sampler 0.023 ms (0.01%)
ddtrace.bootstrap.sitecustomize 0.062 ms (0.02%)
ddtrace.bootstrap.preload 0.062 ms (0.02%)
ddtrace.internal.products 0.062 ms (0.02%)
importlib.metadata 0.062 ms (0.02%)
importlib.metadata._collections 0.062 ms (0.02%)

The following import paths have shrunk:

ddtrace.auto 2.296 ms (0.82%)
ddtrace.bootstrap.sitecustomize 1.476 ms (0.53%)
ddtrace.bootstrap.preload 1.476 ms (0.53%)
ddtrace.internal.remoteconfig.client 0.692 ms (0.25%)
ddtrace.internal.products 0.062 ms (0.02%)
importlib.metadata 0.062 ms (0.02%)
importlib.metadata._itertools 0.062 ms (0.02%)
ddtrace 0.820 ms (0.29%)
ddtrace.trace 0.140 ms (0.05%)
ddtrace._trace.tracer 0.140 ms (0.05%)
ddtrace.settings.peer_service 0.106 ms (0.04%)
ddtrace.internal._unpatched 0.032 ms (0.01%)
json 0.032 ms (0.01%)
json.decoder 0.032 ms (0.01%)
re 0.032 ms (0.01%)
enum 0.032 ms (0.01%)
types 0.032 ms (0.01%)

@pr-commenter

pr-commenter bot commented Jun 2, 2025

Benchmarks

Benchmark execution time: 2025-06-23 19:30:16

Comparing candidate commit 7f6185d in PR branch munir/cleanup-agent-based-sampling-fix with baseline commit d49390c in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 560 metrics, 3 unstable metrics.

scenario:iastaspectsospath-ospathsplit_aspect

  • 🟥 execution_time [+827.082ns; +899.715ns] or [+17.114%; +18.617%]

Member

@brettlangdon brettlangdon left a comment

would be nice to have a regression test for this, otherwise lgtm

@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from 4d50f44 to 7e14fee Compare June 15, 2025 01:56
@mabdinur mabdinur changed the title chore(sampling): clean up agent based sampling fix fix(sampling): clean up agent based sampling fix Jun 15, 2025
@mabdinur mabdinur changed the title fix(sampling): clean up agent based sampling fix fix(sampling): ensure agent service based sampling is not reset after forks and on tracer.configure Jun 15, 2025
@mabdinur mabdinur changed the title fix(sampling): ensure agent service based sampling is not reset after forks and on tracer.configure fix(sampling): ensure agent service based sampling is not reset after forking and on tracer.configure Jun 15, 2025
@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from 6afec51 to cbfe917 Compare June 15, 2025 17:38
@mabdinur mabdinur changed the title fix(sampling): ensure agent service based sampling is not reset after forking and on tracer.configure fix(sampling): ensure agent based sampling is not reset after forking and on tracer.configure Jun 15, 2025
@mabdinur mabdinur marked this pull request as ready for review June 15, 2025 22:49
@mabdinur mabdinur requested review from a team as code owners June 15, 2025 22:49
Collaborator

@emmettbutler emmettbutler left a comment

Nice work. Is there a test for how this works in subprocesses that I'm missing?

@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from ef76a21 to 4d311a8 Compare June 17, 2025 00:17
@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from 376430f to 7101cbd Compare June 17, 2025 03:15
@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from 28a4ac2 to 8a5ae1b Compare June 18, 2025 16:51
@mabdinur mabdinur enabled auto-merge (squash) June 23, 2025 18:31
@mabdinur mabdinur force-pushed the munir/cleanup-agent-based-sampling-fix branch from 65cfc53 to 7f6185d Compare June 23, 2025 18:34
@mabdinur mabdinur merged commit ca79351 into main Jun 23, 2025
756 checks passed
@mabdinur mabdinur deleted the munir/cleanup-agent-based-sampling-fix branch June 23, 2025 19:31
github-actions bot pushed a commit that referenced this pull request Jun 23, 2025
… and on tracer.configure (#13560)

Co-authored-by: Brett Langdon <[email protected]>
(cherry picked from commit ca79351)
sydney-tung pushed a commit that referenced this pull request Jun 24, 2025
… and on tracer.configure (#13560)

Co-authored-by: Brett Langdon <[email protected]>
mabdinur added a commit that referenced this pull request Jun 27, 2025
… and on tracer.configure [backport 3.9] (#13744)

Backport ca79351 from #13560 to 3.9.

Co-authored-by: Munir Abdinur <[email protected]>
Co-authored-by: Brett Langdon <[email protected]>