JVM sustaining engineer at Oracle.
Photo by Alex Motoc.
Quite a while ago I got a question on Twitter (now X) asking whether it’s possible to trigger a flight recording when the CPU usage goes up. The tweeter mentioned that this can be done with JDK Mission Control, but that requires always having an instance of JDK Mission Control running with access to the JVM. Enter instead the JDK Flight Recorder’s RecordingStream, available since JDK 14. With this you can easily hook into the JFR stream of events.
The functionality we want to implement is roughly this:
while (true) {
    var cpuLoad = getCpuLoad();
    if (cpuLoad > 0.6) {
        dumpJFR();
    }
}
We open a recording stream on the current JVM and enable the jdk.CPULoad event so that it is emitted every 5 seconds. For each CPU load event the “machineTotal” attribute is checked, and when it exceeds 60% the default JFR recording is written to disk.
try (RecordingStream stream = new RecordingStream()) {
    stream.enable("jdk.CPULoad").withPeriod(Duration.ofSeconds(5));
    stream.onEvent("jdk.CPULoad", (event) -> {
        // machineTotal is the CPU load of the whole machine, as a fraction between 0 and 1
        float cpuLoad = event.getFloat("machineTotal");
        if (cpuLoad > 0.6) {
            dumpJFR("recording.jfr");
        }
    });
    // start() blocks the calling thread until the stream is closed
    stream.start();
}
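The RecordingStream API has had a dump method since JDK 17, so on recent JDKs we can simply dump the recording to disk. The method takes a Path and can throw an IOException.
if (cpuLoad > 0.6) {
    // dump(Path) throws IOException, which must be handled inside the lambda
    stream.dump(Path.of("recording.jfr"));
}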
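Put together, the stream setup and the dump call might look like the sketch below. It assumes JDK 17 or later (for dump) and reads the machineTotal field of jdk.CPULoad. The class name CpuLoadDumper, the helper exceedsThreshold and the five-minute max age are illustrative choices, not part of the original solution. The max age is set because the dump Javadoc recommends configuring a max age or max size before starting the stream.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.consumer.RecordingStream;

// Hypothetical class name; a sketch, not the author's packaged solution.
final class CpuLoadDumper {
    static final float THRESHOLD = 0.6f;

    // Pure helper so the threshold rule can be checked without starting JFR.
    static boolean exceedsThreshold(float machineTotal) {
        return machineTotal > THRESHOLD;
    }

    public static void main(String[] args) {
        try (RecordingStream stream = new RecordingStream()) {
            // Assumption: retain five minutes of data so a dump has content to write.
            stream.setMaxAge(Duration.ofMinutes(5));
            stream.enable("jdk.CPULoad").withPeriod(Duration.ofSeconds(5));
            stream.onEvent("jdk.CPULoad", event -> {
                if (exceedsThreshold(event.getFloat("machineTotal"))) {
                    try {
                        stream.dump(Path.of("recording.jfr"));
                    } catch (IOException e) {
                        System.err.println("Could not dump recording: " + e.getMessage());
                    }
                }
            });
            // Blocks the main thread until the stream is closed.
            stream.start();
        }
    }
}
```

If the calling thread has other work to do, startAsync() can be used instead of start().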
If we only ran this code as is, we would indeed get a recording.jfr file. However, with only the jdk.CPULoad event enabled, we would only get jdk.CPULoad events in the recording. If the application was started with the JVM argument -XX:StartFlightRecording=name=default, a recording with all the default events would be dumped.
This highlights one important thing to notice when working with JFR and recordings: there is really only one instance of the flight recorder in each JVM, regardless of how many recordings we have running. This is also apparent when we call the dump() method on the stream. The name is not save(), or something friendlier, because it really does dump the entirety of what the flight recorder has in its storage. This means that if we were to do a new dump() right after, it’d be virtually empty.
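The one-recorder, many-recordings relationship can be observed through the jdk.jfr API. The sketch below starts two recordings and asks the single FlightRecorder instance which recordings it knows about. The class name RecordingOverview and the recording names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;

// Hypothetical helper that lists the recordings held by the JVM's one flight recorder.
final class RecordingOverview {
    static List<String> startTwoAndListNames() {
        try (Recording first = new Recording(); Recording second = new Recording()) {
            first.setName("first");
            second.setName("second");
            first.start();
            second.start();

            // Both recordings are owned by the same FlightRecorder instance.
            List<String> names = new ArrayList<>();
            for (Recording recording : FlightRecorder.getFlightRecorder().getRecordings()) {
                names.add(recording.getName());
            }
            return names;
        }
    }
}
```

A recording started with -XX:StartFlightRecording shows up in the same list, next to anything started programmatically.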
As mentioned, the JFR recording stream was added in JDK 14. If you’re still stuck on JDK 8, are you out of luck? No. For this particular use case, monitoring CPU usage, you can use the Java Management Extensions (JMX) API instead. It’s a wee bit more complex and less intuitive, but it works.
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class CPUMonitor {
    public double getCpuLoad() throws Exception {
        final MBeanServer mrBean = ManagementFactory.getPlatformMBeanServer();
        final ObjectName os = ObjectName.getInstance("java.lang:type=OperatingSystem");
        // Get wanted attribute; SystemCpuLoad or ProcessCpuLoad.
        // getAttribute returns Object, so cast to Double and let it unbox on return.
        return (Double) mrBean.getAttribute(os, "SystemCpuLoad");
    }
}
You get the platform MBean server and look up the operating system MBean, whose SystemCpuLoad attribute holds the CPU load.
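As an alternative to the string-based attribute lookup, the same value can be read through the typed com.sun.management.OperatingSystemMXBean interface, which is also available on JDK 8. This is only a sketch: the class name TypedCpuLoad and the helper isAvailable are assumptions for the example. According to the Javadoc, the method returns a negative value when the load is not available, which is what the helper checks for.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.OperatingSystemMXBean;

// Hypothetical class name; a typed variant of the attribute lookup.
final class TypedCpuLoad {
    // A negative value (or NaN) means the JVM could not determine the load.
    static boolean isAvailable(double load) {
        return load >= 0.0;
    }

    // getSystemCpuLoad() is deprecated on newer JDKs in favour of getCpuLoad(),
    // but it is the method that exists on JDK 8.
    @SuppressWarnings("deprecation")
    static double systemCpuLoad() {
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        return os.getSystemCpuLoad();
    }
}
```

The typed call avoids the cast from Object, at the cost of depending on a com.sun.management interface directly.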
With the JFR stream API it’s easy to define how often the event should be created, using the enable(event).withPeriod(period) call. In JDK 8 we can instead poll the CPU load manually using a ScheduledExecutorService.
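ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
executor.scheduleAtFixedRate(() -> {
    try {
        // getCpuLoad() and dumpJFR() throw checked exceptions, which a Runnable cannot propagate.
        // An exception escaping the task would also suppress all subsequent executions.
        double cpuLoad = getCpuLoad();
        if (cpuLoad > 0.6) {
            dumpJFR(new File("recording.jfr"));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}, 0, 5, TimeUnit.SECONDS);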
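To make the polling logic testable without a real CPU load or a real JFR dump, the load source and the action can be injected. The sketch below is one way to structure that. ThresholdPoller, pollOnce and schedule are hypothetical names, and the structure is an illustration, not the packaged solution.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.DoubleSupplier;

// Hypothetical class: polls a load source and runs an action above a threshold.
final class ThresholdPoller {
    private final DoubleSupplier loadSource;
    private final double threshold;
    private final Runnable onExceeded;

    ThresholdPoller(DoubleSupplier loadSource, double threshold, Runnable onExceeded) {
        this.loadSource = loadSource;
        this.threshold = threshold;
        this.onExceeded = onExceeded;
    }

    // One polling step; exceptions are caught so that scheduling keeps going.
    void pollOnce() {
        try {
            if (loadSource.getAsDouble() > threshold) {
                onExceeded.run();
            }
        } catch (RuntimeException e) {
            System.err.println("Poll failed: " + e);
        }
    }

    // Runs pollOnce immediately and then every periodSeconds seconds.
    ScheduledFuture<?> schedule(ScheduledExecutorService executor, long periodSeconds) {
        return executor.scheduleAtFixedRate(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
    }
}
```

In production the supplier would wrap getCpuLoad() and the action would wrap dumpJFR(...), each translating checked exceptions into unchecked ones.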
With a JFR stream in later JDK versions it’s easy to dump a recording. In JDK 8 you can instead invoke the jfrDump diagnostic command.
Although it requires quite a bit of ceremony, it’s not overly complex. You get the DiagnosticCommand MBean from the platform MBean server and invoke its jfrDump operation.
private void dumpJFR(File jfrFile) throws Exception {
    final MBeanServer mrBean = ManagementFactory.getPlatformMBeanServer();
    final ObjectName name = ObjectName.getInstance("com.sun.management:type=DiagnosticCommand");
    // The operation takes a single String[] argument; this is its JVM type descriptor
    final String[] signature = {"[Ljava.lang.String;"};
    // Same arguments as "jcmd <pid> JFR.dump name=default filename=..."
    final Object[] params = new Object[1];
    params[0] = new String[]{"name=default", "filename=" + jfrFile};
    mrBean.invoke(name, "jfrDump", params, signature);
}
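Before invoking jfrDump it can be useful to check which operations the DiagnosticCommand MBean actually exposes on the JVM at hand. The sketch below lists them using the standard JMX metadata calls. The class name DiagnosticCommandInspector is hypothetical, and the helper returns an empty list when the MBean is not registered, which is an assumption about how one might want to handle other JVM implementations.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper that lists the operations of the DiagnosticCommand MBean.
final class DiagnosticCommandInspector {
    static List<String> operationNames() {
        List<String> names = new ArrayList<>();
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name =
                    ObjectName.getInstance("com.sun.management:type=DiagnosticCommand");
            if (!server.isRegistered(name)) {
                return names; // MBean not available on this JVM
            }
            for (MBeanOperationInfo operation : server.getMBeanInfo(name).getOperations()) {
                names.add(operation.getName());
            }
            return names;
        } catch (Exception e) {
            // Wrap the checked JMX exceptions to keep the call site simple.
            throw new IllegalStateException("Could not inspect DiagnosticCommand", e);
        }
    }
}
```

A caller could then guard the dump with operationNames().contains("jfrDump") and fall back to logging a warning when the operation is missing.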
The Javadoc for getSystemCpuLoad states it returns an average “over the recent time period being observed”.
double getSystemCpuLoad()
This means that the value returned depends on when the method was last called; getSystemCpuLoad is not an idempotent call. The last-call state is global for the entire JVM, so you can’t query the CPU load from different threads or places at different times and expect the values to cover a predictable interval.
Time | Consumer 1: CPU load interval (s) | Consumer 2: CPU load interval (s)
---|---|---
10:00 | 5 |
10:01 | . |
10:02 | . |
10:03 | . |
10:04 | . |
10:05 | 5 |
10:06 | . |
10:07 | . | 2
10:08 | . |
10:09 | . |
10:10 | 3 |
The example shows one consumer repeatedly reading the CPU load over the last 5 seconds. When a second consumer comes in and reads the CPU load 2 seconds into an interval, the observer effect manifests itself: the next reading by consumer 1 only covers the remaining 3 seconds.
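The interval arithmetic in the table can be reproduced with a toy model in which every read covers the time since the previous read, no matter who made it. This is a deliberately simplified illustration and not how the JDK computes the load internally. The class SinceLastCallMeter and its use of plain second counters are assumptions made for the example.

```java
// Toy model of a "since the last call" measurement shared by all callers.
final class SinceLastCallMeter {
    private long lastCallSeconds;

    SinceLastCallMeter(long startSeconds) {
        this.lastCallSeconds = startSeconds;
    }

    // Returns the length of the window this reading covers and resets the window.
    synchronized long read(long nowSeconds) {
        long window = nowSeconds - lastCallSeconds;
        lastCallSeconds = nowSeconds;
        return window;
    }
}
```

With a meter started at second 0 (10:00), consumer 1 reading at second 5 gets a 5 second window, consumer 2 reading at second 7 gets 2 seconds, and consumer 1 reading again at second 10 gets only 3 seconds.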
It is worth noting that the JFR CPU load event has its own state. Since the JFR CPU load event is also triggered through the JFR API, it is easier to reason about which resolution you get: there is no other way to query the JFR event, whereas getSystemCpuLoad can in theory be called unnoticed by other code, or even from another JVM over JMX.
Since the solution uses one API for JDK 8 and another one for later JDKs, I’ve packaged it all in a multi-release JAR. A multi-release JAR allows classes compiled from different Java source files, targeting different JDK releases, to be included in the same JAR. When executed on JDK 8 or earlier, only the standard classes in the root of the JAR are used; when running on a later JDK, classes are also loaded from the release-specific directory in the JAR, for instance META-INF/versions/17.
For more details on the entire solution, the resulting code is available on my GitHub.