JFR and Equality: A tale of many objects

In the last blog post, I showed you how to silence JFR’s startup messages. This week’s blog post is also related to JFR, and no, it’s not about the JFR Events website, which got a simple search bar. It’s a short blog post on comparing objects from JFR recordings in Java and why this is slightly trickier than you might have expected.

Example

Getting a JFR recording is simple; just use the RecordingStream API. In the following, we use it to record execution samples of a tight loop and store them in a list:

List<RecordedEvent> events = new ArrayList<>();
// Know when to stop the loop
AtomicBoolean running = new AtomicBoolean(true);
// We obtain one hundred execution samples 
// that have all the same stack trace
final long currentThreadId = Thread.currentThread().threadId();
try (RecordingStream rs = new RecordingStream()) {
    rs.enable("jdk.ExecutionSample").with("period", "1ms");
    rs.onEvent("jdk.ExecutionSample", event -> {
        if (event.getThread("sampledThread")
                 .getJavaThreadId() != currentThreadId) {
            return; // don't record other threads
        }
        events.add(event);
        if (events.size() >= 100) {
            // we can signal to stop
            running.set(false);
        }
    });
    rs.startAsync();
    int i = 0;
    while (running.get()) { // some busy loop to produce sample
        for (int j = 0; j < 100000; j++) {
            i += j;
        }
    }
    rs.stop();
}

We collected these events because we wanted to analyze them. Of course, this is a simplified example, but it’s not too far away from what I currently work on: analyzing JFR files using the Java API.

A simple analysis is to check the top methods and frames on top of the stack to get the hot methods:

public static List<RecordedFrame> 
getFramesOnTop(List<RecordedEvent> events) {
    return events.stream()
            .map(RecordedEvent::getStackTrace)
            // the first frame in the list is the top of the stack
            .map(stackTrace -> stackTrace.getFrames().get(0))
            .toList();
}
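The getMethodsOnTop helper used in the next section is not shown in this post. A minimal sketch of how it could look, assuming it mirrors getFramesOnTop and simply maps each top frame to its method (the class name TopMethods and the null and empty checks are my additions for illustration):

```java
import java.util.List;
import java.util.Objects;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedMethod;
import jdk.jfr.consumer.RecordedStackTrace;

// Hypothetical helper class, not taken from the original example code
final class TopMethods {
    private TopMethods() {}

    /** Returns the method of the top frame for every event that carries a stack trace. */
    static List<RecordedMethod> getMethodsOnTop(List<RecordedEvent> events) {
        return events.stream()
                .map(RecordedEvent::getStackTrace)
                // getStackTrace() returns null for events without a stack trace
                .filter(Objects::nonNull)
                .map(RecordedStackTrace::getFrames)
                // skip empty stack traces instead of failing on get(0)
                .filter(frames -> !frames.isEmpty())
                .map(frames -> frames.get(0).getMethod())
                .toList();
    }
}
```

For execution samples, the two filters should rarely matter, but they keep the helper safe for other event types.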

This is kind of pointless here because we seemingly know the answer already: all top methods are the same, since all the work happens in the main method.

The Number of Unique Methods and Frames

But let’s be thorough and look at the top frame and top method of every stack trace and check how many unique frames and methods there are:

// First we check the methods
List<RecordedMethod> methodsOnTop = getMethodsOnTop(events);
Set<RecordedMethod> uniqueMethods = new HashSet<>(methodsOnTop);
System.out.println("Total methods on top: " + 
  methodsOnTop.size() + ", unique: " + uniqueMethods.size());

// Now the frames
List<RecordedFrame> framesOnTop = getFramesOnTop(events);
Set<RecordedFrame> uniqueFrames = new HashSet<>(framesOnTop);
System.out.println("Total frames on top: " + 
  framesOnTop.size() + ", unique: " + uniqueFrames.size());

This results in:

Total methods on top: 777, unique: 1
Total frames on top: 777, unique: 777

Frames and the Safepoint Bias

As expected, we only have one distinct method at the top. But how do the frames look? Let’s look at the first ten:

Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29
Method main at 29

Slight tangent: You might wonder whether all top frames, including the line number, are the same. After all, we’re executing four lines of code (excluding the curly braces). However, the inner loop with its 100,000 iterations vastly dominates the execution compared to the simple atomic variable access in the while loop, so we only see the inner loop in the probabilistic execution samples. Line 29, for context, is the for-loop header. But why isn’t the body of the loop showing up? Every OpenJDK profiler has a safepoint bias because it doesn’t, by default, have the debug information needed to map a program counter between safepoints to a source line. So the sample is attributed to the nearest safepoint, which is the loop header. To improve this, we can add -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints to our program execution and get:

Method main at 29
Method main at 30
Method main at 29
Method main at 30
Method main at 29
Method main at 29
Method main at 30
Method main at 30
Method main at 29
Method main at 30

But that’s not the point of this blog post, as it would only change the number of expected stack frames from one to two.

Coming back to our initial output:

Total methods on top: 777, unique: 1
Total frames on top: 777, unique: 777

That doesn’t make sense for the frames. We would expect to get only one frame (with the safepoint bias), but we don’t. But why?

Frames and Equality

We used a simple HashSet to compute the number of unique frames using the hashCode and equals methods. Let’s compare the first top frame with the following ten frames:

Method main at 29, hashCode: 1010931249, equal to first frame true   
Method main at 29, hashCode: 1099855928, equal to first frame false 
Method main at 29, hashCode: 1629687658, equal to first frame false 
Method main at 29, hashCode: 1007880005, equal to first frame false 
Method main at 29, hashCode: 215219944,  equal to first frame false   
Method main at 29, hashCode: 1043208434, equal to first frame false 
Method main at 29, hashCode: 1192171522, equal to first frame false 
Method main at 29, hashCode: 1661081225, equal to first frame false 
Method main at 29, hashCode: 1882554559, equal to first frame false 
Method main at 29, hashCode: 1049817027, equal to first frame false
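A comparison like the one above could be produced with a few lines along these lines. This is only a sketch: the class and method names are made up for illustration, and it uses nothing beyond the public RecordedFrame accessors plus the inherited hashCode and equals.

```java
import java.util.List;
import jdk.jfr.consumer.RecordedFrame;

// Hypothetical helper, names chosen for illustration only
final class FrameComparison {
    private FrameComparison() {}

    /** Describes the first `limit` frames and whether each equals the first one. */
    static List<String> compareToFirst(List<RecordedFrame> frames, int limit) {
        if (frames.isEmpty()) {
            return List.of();
        }
        RecordedFrame first = frames.get(0);
        return frames.stream()
                .limit(limit)
                .map(frame -> "Method " + frame.getMethod().getName()
                        + " at " + frame.getLineNumber()
                        + ", hashCode: " + frame.hashCode()
                        + ", equal to first frame " + frame.equals(first))
                .toList();
    }
}
```

Calling compareToFirst(framesOnTop, 10) and printing each entry gives one line per frame, starting with the first frame compared to itself.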

This means that, apparently, the implementation of hashCode and equals is different for recorded methods and recorded frames.

In the following, we will solely focus on how both types are implemented differently in serialization, not in the JFR file format. You can read more about the file format itself in Gunnar Morling’s excellent blog post.

Main Difference

TL;DR: Frames are inlined structs/composites of other types, and methods are stored as references in the frames. You can see this difference even in the event definition (via the JFR Events website).

Being an inlined struct means that every stack frame in the Java API is its own object. However, all method references point to the same Method object (even across chunks if I correctly understand the ChunkParser source code). The methods themselves are only written out at the end of every chunk.
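To make “its own object” concrete without involving JFR at all: a class that does not override equals and hashCode is compared by identity, so a HashSet keeps every separately allocated instance, no matter how similar the contents are. The following sketch uses two made-up stand-in types (PlainFrame and ValueFrame are not JFR classes) to show the effect:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

final class IdentityDemo {
    private IdentityDemo() {}

    /** Stand-in without equals/hashCode: Object's identity-based versions apply. */
    static final class PlainFrame {
        final String method;
        final int line;

        PlainFrame(String method, int line) {
            this.method = method;
            this.line = line;
        }
    }

    /** Same content as a record: equals/hashCode are derived from the components. */
    record ValueFrame(String method, int line) {}

    /** Creates n identical-looking PlainFrames and counts the distinct ones. */
    static int uniquePlainFrames(int n) {
        List<PlainFrame> frames = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            frames.add(new PlainFrame("main", 29));
        }
        return new HashSet<>(frames).size();
    }

    /** Creates n identical ValueFrames and counts the distinct ones. */
    static int uniqueValueFrames(int n) {
        List<ValueFrame> frames = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            frames.add(new ValueFrame("main", 29));
        }
        return new HashSet<>(frames).size();
    }
}
```

With n = 777, uniquePlainFrames returns 777 and uniqueValueFrames returns 1, which is the same pattern as in the output above.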

The recorded objects (like frames and methods) in the Java API of JFR don’t implement the hashCode and equals methods, so object identity is all we have. That’s a pity, but there is a solution:

A Wrapper to Compare Them All

We need to create a simple wrapper record Wrapper(RecordedObject object) that implements the two methods based on the object’s contents and then use the wrapped version of the recorded frames.

Let’s look at the implementation of RecordedObject:

public sealed class RecordedObject
        permits /* ... */ {

    final Object[] objects;
    final ObjectContext objectContext;

    public final <T> T getValue(String name) {
        // ...
    }

    public List<ValueDescriptor> getFields() {
        return objectContext.fields;
    }
    
    public final boolean getBoolean(String name) {
        // ...
    }
    
    // ...
    
    @Override
    public final String toString() {
        // ...
    }

}

The ObjectContext cannot be accessed easily, as it’s an internal class. But in my use cases, it’s enough to compare objects based on their content. We have two basic options to access the fields.

1. Compare all fields and the recording
2. Access the fields directly via reflection

The problem with the first is the implementation of the underlying getValue method:

private Object getValue(String name, boolean allowUnsigned) {
    Objects.requireNonNull(name, "name");
    int index = 0;
    for (ValueDescriptor v : objectContext.fields) {
        if (name.equals(v.getName())) {
        // ...
        }
   }
   // ...
}

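Every lookup walks the list of field descriptors and compares names, so this method is slow when called for every field of every object. So we’re using reflection via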

Field objectsField = RecordedObject.class.getDeclaredField("objects");
objectsField.setAccessible(true);

// later in hashCode and equals
objectsField.get(object)

But is comparing just the elements of the Object[] arrays directly via Object::equals enough? Yes, because we usually don’t have nested inlined RecordedObjects (and I solve this issue when it comes up).
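Put together, the reflective approach could look roughly like the following. To keep the sketch runnable without any JVM flags, it wraps a made-up stand-in class Opaque with a private objects array instead of RecordedObject. The wrapper on GitHub may well differ, and using Arrays.equals and Arrays.hashCode for the element-wise comparison is my assumption.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

final class ReflectiveWrapperSketch {
    private ReflectiveWrapperSketch() {}

    /** Hypothetical stand-in for RecordedObject: the values live in a private array. */
    static final class Opaque {
        private final Object[] objects;

        Opaque(Object... objects) {
            this.objects = objects;
        }
    }

    /** Compares wrapped objects by the contents of their private array. */
    record Wrapper(Opaque object) {
        // Look the field up once; repeated lookups would be needlessly slow
        private static final Field OBJECTS_FIELD = lookupField();

        private static Field lookupField() {
            try {
                Field field = Opaque.class.getDeclaredField("objects");
                field.setAccessible(true);
                return field;
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private Object[] values() {
            try {
                return (Object[]) OBJECTS_FIELD.get(object);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }

        @Override
        public boolean equals(Object other) {
            // Element-wise comparison via each element's equals method
            return other instanceof Wrapper w && Arrays.equals(values(), w.values());
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(values());
        }
    }
}
```

Two wrappers around new Opaque("main", 29) are then equal and share a hash code, so a HashSet of wrappers counts contents instead of instances. Elements that are themselves identity-compared objects would still be compared by identity, which is the nested case mentioned above.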

But if you run the wrapper code now, you get something like:

Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make field final jdk.jfr.internal.consumer.ObjectContext jdk.jfr.consumer.RecordedObject.objectContext accessible: module jdk.jfr does not "opens jdk.jfr.consumer" to unnamed module @724b939e

This is due to the module system; the solution is already given in the error message: pass --add-opens jdk.jfr/jdk.jfr.consumer=ALL-UNNAMED on each JVM execution.
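Since it’s easy to forget the flag, one option is to check up front whether the package is open and fail with a readable message. The sketch below uses the standard Module.isOpen(String, Module) method; the class name, the constant, and the message text are made up for illustration.

```java
import jdk.jfr.consumer.RecordedObject;

// Hypothetical startup check, not part of the original example code
final class JfrOpensCheck {
    private JfrOpensCheck() {}

    static final String REQUIRED_FLAG =
            "--add-opens jdk.jfr/jdk.jfr.consumer=ALL-UNNAMED";

    /** True if jdk.jfr.consumer is open for deep reflection from this class's module. */
    static boolean isConsumerPackageOpen() {
        Module jfrModule = RecordedObject.class.getModule();
        Module callerModule = JfrOpensCheck.class.getModule();
        return jfrModule.isOpen("jdk.jfr.consumer", callerModule);
    }

    /** Fails early with a hint instead of a later InaccessibleObjectException. */
    static void requireOpen() {
        if (!isConsumerPackageOpen()) {
            throw new IllegalStateException(
                    "Reflective access to RecordedObject needs the JVM flag: " + REQUIRED_FLAG);
        }
    }
}
```

Calling JfrOpensCheck.requireOpen() before the first reflective access turns the stack trace above into a single line that names the missing flag.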

The code for the example is on GitHub. Feel free to use it and suggest improvements via PRs or issues.

Conclusion

Comparing the objects returned by the Java JFR API is finicky, but we can get a good workaround with a bit of reflection. The defined wrapper class is helpful for many applications dealing with JFR objects. I, for one, will use it in an upcoming update to my An Experimental Front-End for JFR Queries blog post.

Thanks for coming this far. I’ll see you in a week or two for a blog post on something different, possibly related to JFR queries.

This blog post is part of my work in the SapMachine team at SAP, making profiling easier for everyone.

P.S.: Sometimes both solutions lead home.

Author

Johannes Bechberger

Johannes Bechberger is a JVM developer working on profilers and their underlying technology in the SapMachine team at SAP. This includes improvements to async-profiler and its ecosystem, a website to view the different JFR event types, and improvements to the FirefoxProfiler, making it usable in the Java world. His work today comprises many open-source contributions and his blog, where he regularly writes on in-depth profiling and debugging topics. He also works on hello-ebpf, the first eBPF library for Java. His most recent contribution is the new CPU Time Profiler in JDK 25.
