@zhengyu123
Contributor

@zhengyu123 zhengyu123 commented Dec 9, 2025

What does this PR do?:
Instruments and profiles native threads (threads that are created or started outside of the JVM) on HotSpot-based JVMs running on non-musl Linux.
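
For readers unfamiliar with the idea, one common way to observe threads started outside the JVM is to wrap the start routine handed to `pthread_create`, so that a hook runs when the thread begins and when it returns. The following is a minimal sketch of that general technique, not the code in this PR: `instrumentedPthreadCreate`, `onThreadStart`, `onThreadEnd`, and the two counters are hypothetical names chosen for illustration.

```cpp
#include <pthread.h>

#include <atomic>
#include <memory>

// Hypothetical hooks: a real profiler would register or unregister the thread
// with its sampling engines here. The counters only make the effect visible.
static std::atomic<int> g_started{0};
static std::atomic<int> g_finished{0};
static void onThreadStart() { g_started.fetch_add(1); }
static void onThreadEnd() { g_finished.fetch_add(1); }

// Carries the caller's original start routine and argument to the new thread.
struct StartArgs {
  void* (*routine)(void*);
  void* arg;
};

// Runs on the new thread: call the hooks around the original routine.
static void* trampoline(void* p) {
  std::unique_ptr<StartArgs> args(static_cast<StartArgs*>(p));
  onThreadStart();
  void* result = args->routine(args->arg);
  onThreadEnd();  // Not reached if the routine calls pthread_exit or is cancelled.
  return result;
}

// Same signature as pthread_create, so a patched call site could be pointed here.
int instrumentedPthreadCreate(pthread_t* thread, const pthread_attr_t* attr,
                              void* (*routine)(void*), void* arg) {
  StartArgs* args = new StartArgs{routine, arg};
  int rc = pthread_create(thread, attr, trampoline, args);
  if (rc != 0) {
    delete args;  // The thread never started, so the trampoline will not free it.
  }
  return rc;
}
```

A caller uses `instrumentedPthreadCreate` exactly like `pthread_create`. A production version would likely also need to cover threads that exit through `pthread_exit` or cancellation, for example with a cleanup handler.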

Motivation:
Enhance the Java profiler so that it can profile native threads.

Additional Notes:
The feature is currently enabled only for HotSpot-based JVMs running on non-musl Linux platforms, for two reasons:

  • A crash was seen on aarch64/musl with JDK 11. It might not be related to this change, but that is hard to confirm, so the feature is disabled on musl-based Linux for now.
  • J9 has problems walking the stack of a native-only thread in a release build: it shows only
    .no_java_frame
    while a debug build shows the correct stack.
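
The gating described in these notes amounts to a single predicate. A minimal sketch, assuming hypothetical `JvmKind` and `LibcKind` enums and a hypothetical function name; none of these identifiers come from the PR, and the actual check may be structured differently:

```cpp
// Hypothetical classification of the runtime environment.
enum class JvmKind { HotSpot, OpenJ9, Other };
enum class LibcKind { Glibc, Musl };

// Native-thread profiling is enabled only on HotSpot with a non-musl libc:
// musl is excluded because of the unconfirmed aarch64/JDK 11 crash, and J9
// because release builds show only .no_java_frame for native-only threads.
bool nativeThreadProfilingEnabled(JvmKind jvm, LibcKind libc) {
  return jvm == JvmKind::HotSpot && libc != LibcKind::Musl;
}
```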

How to test the change?:

  • Regular tests
  • JDK tier1 tests run with the profiler agent. There are failures, but they match the failures on the java-profiler main line.
    The expected failures are:
    • HeapMonitor uses an agent, which conflicts with the profiler agent.
    • Compiler frame is not compatible with the agent.

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles
    credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.
  • JIRA: PROF-11577

Unsure? Have a question? Request a review!

@zhengyu123 zhengyu123 marked this pull request as ready for review December 11, 2025 16:21
Collaborator

@jbachorik jbachorik left a comment

Having the library patching tied to ctimer (CPU profiling) does not sound right.
We have several other engines that can be used for either CPU or wallclock profiling, and their behavior would be very inconsistent.

I think the library patching should live in a more generic place, and also be called from a more generic place, e.g. profiler.cpp.

@zhengyu123
Contributor Author

zhengyu123 commented Dec 12, 2025

> Having the libraries patching tied to ctimer (CPU profiling) does not sound right. We have a bunch of other engines that can be used either for CPU or wallclock profiling and their functionality would be very inconsistent.
>
> I think the library patching should go to a more generic place - and also be called from a more generic place, like eg. profiler.cpp

Agreed. But I want to limit the changes in this PR and address this issue in a separate PR.

@jbachorik
Collaborator

>> Having the libraries patching tied to ctimer (CPU profiling) does not sound right. We have a bunch of other engines that can be used either for CPU or wallclock profiling and their functionality would be very inconsistent.
>>
>> I think the library patching should go to a more generic place - and also be called from a more generic place, like eg. profiler.cpp
>
> Agree. But I want to limit the changes in this PR and address this issue in separate PR.

OK, but please create a follow-up ticket for that. And let's move to the proper placement ASAP: we really don't want this partial work to be released accidentally.

@jbachorik
Collaborator

How will this work with dynamically loaded libraries? Am I reading the code right that we will not patch those?
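
To make the concern concrete: if patching runs once at startup, any library loaded later through `dlopen` is missed unless the set of loaded objects is re-scanned. The sketch below shows one possible way to detect such libraries on Linux with `dl_iterate_phdr`. It is an illustration under stated assumptions, not what this PR does: `LibraryKey`, `ScanState`, `visitLibrary`, and `findUnpatchedLibraries` are hypothetical names, and the actual patching step is left out.

```cpp
#include <link.h>  // dl_iterate_phdr, struct dl_phdr_info (glibc and musl)

#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A loaded object is identified by its path and load address. The main
// program is reported with an empty name.
using LibraryKey = std::pair<std::string, std::uintptr_t>;

struct ScanState {
  std::set<LibraryKey>* seen;      // Objects already handled by earlier scans.
  std::vector<LibraryKey>* fresh;  // Objects first encountered in this scan.
};

// Called by dl_iterate_phdr once per loaded object; returning 0 continues.
static int visitLibrary(struct dl_phdr_info* info, std::size_t, void* data) {
  ScanState* state = static_cast<ScanState*>(data);
  LibraryKey key(info->dlpi_name != nullptr ? info->dlpi_name : "",
                 static_cast<std::uintptr_t>(info->dlpi_addr));
  if (state->seen->insert(key).second) {
    state->fresh->push_back(key);
  }
  return 0;
}

// Returns the objects loaded since the previous call with the same `seen` set.
// A caller would patch each returned library (patching is not shown here).
std::vector<LibraryKey> findUnpatchedLibraries(std::set<LibraryKey>& seen) {
  std::vector<LibraryKey> fresh;
  ScanState state{&seen, &fresh};
  dl_iterate_phdr(visitLibrary, &state);
  return fresh;
}
```

Calling `findUnpatchedLibraries` again after a library load, for example from a periodic task or a `dlopen` hook, would return only the objects that appeared since the last scan.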

@DataDog DataDog deleted a comment from zhengyu123 Dec 12, 2025
Collaborator

@jbachorik jbachorik left a comment

Approved with the two followup tickets

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 wall]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | wall | wall
wall | on | on

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 14 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:fj-kmeans | worse: [+487.823ms; +628.177ms] or [+2.095%; +2.698%] | unstable: [-240.876MB; +356.547MB] or [-23.161%; +34.283%]
scenario:renaissance:gauss-mix | worse: [+751.908ms; +964.092ms] or [+4.165%; +5.340%] | unstable: [-391.951MB; +506.412MB] or [-33.122%; +42.795%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 cpu]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | cpu | cpu
wall | off | off

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 12 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:page-rank | worse: [+0.843s; +1.681s] or [+1.722%; +3.436%] | unstable: [-125.381MB; +284.402MB] or [-8.547%; +19.387%]
scenario:renaissance:fj-kmeans | worse: [+391.875ms; +504.125ms] or [+1.672%; +2.151%] | unstable: [-241.979MB; +363.119MB] or [-23.002%; +34.517%]
scenario:renaissance:gauss-mix | worse: [+650.645ms; +941.355ms] or [+3.586%; +5.188%] | unstable: [-396.052MB; +507.154MB] or [-33.263%; +42.594%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 memleak]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | memleak | memleak
wall | off | off

Summary

Found 0 performance improvements and 1 performance regressions! Performance is the same for 14 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:gauss-mix | worse: [+588.434ms; +879.566ms] or [+3.231%; +4.830%] | unstable: [-395.580MB; +506.232MB] or [-33.256%; +42.559%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 cpu,wall]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | cpu,wall | cpu,wall
wall | on | on

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 12 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:finagle-http | worse: [+582.695ms; +869.305ms] or [+2.189%; +3.266%] | unstable: [-260.992MB; +377.187MB] or [-19.077%; +27.571%]
scenario:renaissance:dec-tree | worse: [+777.105ms; +978.895ms] or [+2.494%; +3.142%] | unstable: [-249.420MB; +341.512MB] or [-17.151%; +23.483%]
scenario:renaissance:gauss-mix | worse: [+720.755ms; +827.245ms] or [+3.963%; +4.548%] | unstable: [-402.906MB; +501.871MB] or [-33.702%; +41.980%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 alloc]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | alloc | alloc
wall | off | off

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 13 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:scala-kmeans | worse: [+421.652ms; +1478.348ms] or [+1.824%; +6.396%] | unstable: [-227.228MB; +342.267MB] or [-22.934%; +34.546%]
scenario:renaissance:gauss-mix | worse: [+720.616ms; +839.384ms] or [+3.964%; +4.618%] | unstable: [-402.863MB; +503.579MB] or [-33.641%; +42.051%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 wall]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | wall | wall
wall | on | on

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 15 metrics, 20 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+478.924ms; +745.076ms] or [+3.194%; +4.968%] | unstable: [-244.748MB; +563.809MB] or [-28.885%; +66.540%]
scenario:renaissance:fj-kmeans | worse: [+473.709ms; +1266.291ms] or [+2.260%; +6.043%] | unstable: [-242.451MB; +347.750MB] or [-23.644%; +33.912%]
scenario:renaissance:scala-kmeans | worse: [+404.605ms; +703.395ms] or [+1.681%; +2.922%] | unstable: [-224.543MB; +337.127MB] or [-23.069%; +34.635%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 cpu]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | cpu | cpu
wall | off | off

Summary

Found 0 performance improvements and 1 performance regressions! Performance is the same for 15 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+576.550ms; +703.450ms] or [+3.852%; +4.700%] | unstable: [-251.781MB; +561.880MB] or [-29.458%; +65.738%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 cpu,wall,alloc,memleak]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | cpu,wall,alloc,memleak | cpu,wall,alloc,memleak
wall | on | on

Summary

Found 0 performance improvements and 4 performance regressions! Performance is the same for 13 metrics, 21 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:finagle-http | worse: [+799.445ms; +1064.555ms] or [+3.027%; +4.031%] | unstable: [-278.796MB; +360.943MB] or [-20.209%; +26.163%]
scenario:renaissance:future-genetic | worse: [+337.593ms; +1230.407ms] or [+2.128%; +7.755%] | unstable: [-309.646MB; +415.627MB] or [-31.776%; +42.652%]
scenario:renaissance:par-mnemonics | worse: [+705.599ms; +1162.401ms] or [+2.719%; +4.479%] | unstable: [-196.383MB; +310.634MB] or [-18.090%; +28.615%]
scenario:renaissance:gauss-mix | worse: [+680.559ms; +907.441ms] or [+3.747%; +4.996%] | unstable: [-401.064MB; +505.016MB] or [-33.545%; +42.239%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [x86_64 memleak,alloc]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | memleak,alloc | memleak,alloc
wall | off | off

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 13 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:scala-kmeans | worse: [+0.410s; +1.890s] or [+1.789%; +8.250%] | unstable: [-230.178MB; +339.651MB] or [-23.184%; +34.211%]
scenario:renaissance:gauss-mix | worse: [+734.479ms; +909.521ms] or [+4.051%; +5.017%] | unstable: [-397.246MB; +506.827MB] or [-33.333%; +42.528%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 memleak,alloc]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | memleak,alloc | memleak,alloc
wall | off | off

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 13 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+473.625ms; +622.375ms] or [+3.145%; +4.133%] | unstable: [-244.830MB; +564.720MB] or [-28.860%; +66.569%]
scenario:renaissance:chi-square | worse: [+742.960ms; +1249.040ms] or [+4.732%; +7.955%] | unstable: [-354.940MB; +502.860MB] or [-32.480%; +46.015%]
scenario:renaissance:naive-bayes | worse: [+477.032ms; +1366.968ms] or [+3.253%; +9.322%] | unstable: [-293.509MB; +662.374MB] or [-30.402%; +68.610%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 alloc]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | alloc | alloc
wall | off | off

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 13 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+396.862ms; +639.138ms] or [+2.630%; +4.236%] | unstable: [-263.453MB; +526.252MB] or [-29.943%; +59.812%]
scenario:renaissance:chi-square | worse: [+684.520ms; +1007.480ms] or [+4.324%; +6.364%] | unstable: [-362.298MB; +474.362MB] or [-32.708%; +42.826%]
scenario:renaissance:naive-bayes | worse: [+751.380ms; +980.620ms] or [+5.054%; +6.596%] | unstable: [-260.480MB; +672.238MB] or [-27.154%; +70.077%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 cpu,wall]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | off | off
modes | cpu,wall | cpu,wall
wall | on | on

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 16 metrics, 20 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+423.380ms; +652.620ms] or [+2.808%; +4.328%] | unstable: [-276.464MB; +492.085MB] or [-30.600%; +54.466%]
scenario:renaissance:fj-kmeans | worse: [+328.081ms; +675.919ms] or [+1.549%; +3.191%] | unstable: [-243.479MB; +353.832MB] or [-23.478%; +34.120%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 memleak]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | off | off
cpu | off | off
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | memleak | memleak
wall | off | off

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 13 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:finagle-http | worse: [+596.344ms; +747.656ms] or [+1.896%; +2.378%] | unstable: [-211.229MB; +337.623MB] or [-15.248%; +24.372%]
scenario:renaissance:future-genetic | worse: [+502.708ms; +757.292ms] or [+3.354%; +5.053%] | unstable: [-267.706MB; +522.957MB] or [-30.333%; +59.256%]
scenario:renaissance:chi-square | worse: [+766.182ms; +1069.818ms] or [+4.838%; +6.755%] | unstable: [-376.191MB; +461.550MB] or [-33.695%; +41.340%]

@pr-commenter

pr-commenter bot commented Dec 12, 2025

Benchmarks [aarch64 cpu,wall,alloc,memleak]

Parameters

parameter | Baseline | Candidate
config | baseline | candidate
ddprof | 1.34.4 | 1.35.0-zgu_inst_native_thread-SNAPSHOT
alloc | on | on
cpu | on | on
iterations | 5 | 5
java | "11.0.28" | "11.0.28"
memleak | on | on
modes | cpu,wall,alloc,memleak | cpu,wall,alloc,memleak
wall | on | on

Summary

Found 0 performance improvements and 4 performance regressions! Performance is the same for 12 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:future-genetic | worse: [+500.612ms; +571.388ms] or [+3.319%; +3.788%] | unstable: [-250.896MB; +563.909MB] or [-29.328%; +65.916%]
scenario:renaissance:chi-square | worse: [+653.677ms; +854.323ms] or [+4.090%; +5.346%] | unstable: [-366.047MB; +469.350MB] or [-33.027%; +42.348%]
scenario:renaissance:als | worse: [+0.635s; +1.633s] or [+1.674%; +4.307%] | unstable: [-187.536MB; +321.887MB] or [-13.006%; +22.324%]
scenario:renaissance:mnemonics | worse: [+0.349s; +1.879s] or [+1.607%; +8.638%] | unstable: [-240.084MB; +353.434MB] or [-23.316%; +34.324%]
