Support jdk.CPUTimeSample event in ProfileResults for Java 25+ (JEP 509)#15983
Support jdk.CPUTimeSample event in ProfileResults for Java 25+ (JEP 509)#15983iprithv wants to merge 9 commits intoapache:mainfrom
Conversation
mikemccand
left a comment
There was a problem hiding this comment.
Whoa, thank you for picking this up @iprithv -- I left some comments about maybe being less lenient about two sampling methods used a the same time.
| // needed. When available, we skip legacy execution samples to avoid double-counting. | ||
| return name.equals("jdk.CPUTimeSample"); | ||
| } | ||
| return (name.equals("jdk.ExecutionSample") || name.equals("jdk.NativeMethodSample")) |
There was a problem hiding this comment.
I think NativeMethodSample is also a dup, if CPUTime and wall-clock sampling are both enabled?
There was a problem hiding this comment.
Yes, I expanded the comment to say that explicitly so it matches the code. Thanks!
| <setting name="period">1 ms</setting> | ||
| </event> | ||
|
|
||
| <event name="jdk.CPUTimeSample"> |
There was a problem hiding this comment.
Could we turn off the old (wall-clock) sampling (jdk.ExecutionSample, jdk.NativeMethodSample) and just fully commit to CPUTime. In nightly benchmarks we've found the 1 msec sampling to add non-trivial overhead (I think ~5-7% slower).
There was a problem hiding this comment.
The 1msec sampling was necessary for tests in order to find places wasting CPU. most tests just don't run that long.
There was a problem hiding this comment.
Done,
- Added gradle/testing/profiling.linux.jfc: legacy samplers off, jdk.CPUTimeSample on with a 1 ms throttle so short chunks still get dense enough samples.
- Left gradle/testing/profiling.jfc for non-Linux with 1 ms ExecutionSample / NativeMethodSample.
- CodeProfilingPlugin loads the Linux file when os.name contains linux, otherwise profiling.jfc.
Thanks both!
|
|
||
| // Pre-scan to detect if CPU-time samples (Java 25+, JEP 509) are available. | ||
| // If so, prefer them over legacy execution samples to avoid double-counting. | ||
| boolean hasCPUTimeSamples = "cpu".equals(mode) && detectCPUTimeSamples(files); |
There was a problem hiding this comment.
Could we make this stricter? Throw an exception if we see a mix of CPUTime and legacy events? Lucene (and in general anyone profiling, hmm except maybe profiler debuggers heh) should only have one sampler enabled at a time?
There was a problem hiding this comment.
I initially added the strict mixed-sampler check here, but removed it after @rmuir follow up review since the two Lucene JFC files are now mutually exclusive.
So the exclusivity now lives in the selected JFC config instead of a pre-scan in ProfileResults. Thanks!
| * Pre-scan recording files to detect if any jdk.CPUTimeSample events are present. When they are, | ||
| * we prefer them over legacy jdk.ExecutionSample/jdk.NativeMethodSample to avoid double-counting. | ||
| */ | ||
| static boolean detectCPUTimeSamples(List<String> files) throws IOException { |
There was a problem hiding this comment.
Instead of this logic, should we just have profiling.linux.jfc that is loaded for linux, and profiling.jfc used for others?
There was a problem hiding this comment.
Yes, done. Linux uses gradle/testing/profiling.linux.jfc, everyone else gradle/testing/profiling.jfc, selected in CodeProfilingPlugin from the host OS. Thanks!
Signed-off-by: prithvi <prithvisivasankar@gmail.com>
| String name = event.getEventType().getName(); | ||
| switch (mode) { | ||
| case "cpu": | ||
| if (hasCPUTimeSamples) { |
There was a problem hiding this comment.
I don't think we need this boolean parameter nor the if block.
Just something like:
return name.equals("jdk.CPUTimeSample") ||
name.equals("jdk.ExecutionSample") ||
name.equals("jdk.NativeMethodSample");There was a problem hiding this comment.
sure, updated. thanks!
| <!-- | ||
| Collects only execution and method samples at a low interval | ||
| Wall-clock execution samples for CPU profiling on platforms where JEP 509 CPU-time sampling | ||
| is unavailable (non-Linux). On Linux, CodeProfilingPlugin uses profiling.linux.jfc instead. |
There was a problem hiding this comment.
I don't see a profiling.linux.jfc in your PR
There was a problem hiding this comment.
my bad, missed it in adding in previous commit. Thanks!
| * wall-clock samples ({@code jdk.ExecutionSample} / {@code jdk.NativeMethodSample}); only one | ||
| * CPU sampler should be enabled in JFR settings. | ||
| */ | ||
| static boolean resolveCpuTimeSampling(List<String> files) throws IOException { |
There was a problem hiding this comment.
Remove the whole function as the two files are mutually exclusive. This is very verbose and not needed.
There was a problem hiding this comment.
sure, done. thanks!
| } | ||
|
|
||
| // Pre-scan recordings: decide CPU-time vs legacy sampling and reject mixed configurations. | ||
| boolean hasCPUTimeSamples = "cpu".equals(mode) && resolveCpuTimeSampling(files); |
There was a problem hiding this comment.
sure, removed this too. thanks!
Signed-off-by: prithvi <prithvisivasankar@gmail.com>
mikemccand
left a comment
There was a problem hiding this comment.
Thank you for the all the iterations @iprithv -- I think this is close!
| /** | ||
| * JEP 509 CPU-time sampling ({@code jdk.CPUTimeSample}) is implemented on Linux. Use a Linux-only | ||
| * JFR template with CPU-time sampling only; other hosts keep wall-clock execution sampling for | ||
| * short tests. |
There was a problem hiding this comment.
Wait, it's for all tests, not just short tests right? Maybe just remove for short tests?
There was a problem hiding this comment.
Sure, done. Updated. Thanks!
| Provider<Boolean> frametypesOption) {} | ||
|
|
||
| /** | ||
| * JEP 509 CPU-time sampling ({@code jdk.CPUTimeSample}) is implemented on Linux. Use a Linux-only |
There was a problem hiding this comment.
Maybe is currently (Java 25) implemented only on Linux, so we use a Linux-only JFR template...?
There was a problem hiding this comment.
yes, updated. Thanks!
| */ | ||
| private static String profilingSettingsPath() { | ||
| String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT); | ||
| return os.contains("linux") ? "testing/profiling.linux.jfc" : "testing/profiling.jfc"; |
There was a problem hiding this comment.
Actually, instead of baking linux in the name, could we use cputime? Maybe in the future other OS's can support CPU time sampling too...
There was a problem hiding this comment.
Sure, makes sense. Thanks!
| --> | ||
| <configuration version="2.0" label="TestProfilingLinux" description="CPU-time sampling for unit tests (Linux)" provider="Apache"> | ||
| <event name="jdk.ExecutionSample"> | ||
| <setting name="enabled">false</setting> |
There was a problem hiding this comment.
Oh, are these events enabled by default, so we have to have these entries (jdk.ExecutionSample and jdk.NativeMethodSample) to explicitly then disable them?
There was a problem hiding this comment.
i think you are correct, I don't believe it is strictly necessary to have them in this file with enabled: false.
There was a problem hiding this comment.
removed the explicit disabled entries. The CPU time template now only enables jdk.CPUTimeSample plus the allocation events used by heap profiling. Thanks!
|
|
||
| Build | ||
| --------------------- | ||
| * GITHUB#15926: Support jdk.CPUTimeSample event (Java 25+, JEP 509) in ProfileResults. |
There was a problem hiding this comment.
Can we word a bit more user-friendly-y? Maybe Use the new (Java 25+, JEP 509) CPU time sampling profiler when available (currently just on Linux), and fall back to the legacy wall-clock...?
There was a problem hiding this comment.
sure, done. thanks!
Signed-off-by: prithvi <prithvisivasankar@gmail.com>
Signed-off-by: prithvi <prithvisivasankar@gmail.com>
Description
Java 25 added CPU-time profiling (JEP 509) which emits jdk.CPUTimeSample
events. ProfileResults did not recognize this event type, causing empty
profiling reports when the CPU-time profiler is active.
Resolves #15926