Welcome back to my blog, this time for a blog post on profiling your Java applications in Cloud Foundry and the tool I helped to develop to make it easier.
Cloud Foundry “is an open source, multi-cloud application platform as a service (PaaS) governed by the Cloud Foundry Foundation, a 501(c)(6) organization” (Wikipedia). It allows you to run your workloads easily in the cloud, including your Java applications. You just need to define a manifest.yml, for example:
---
applications:
- name: sapmachine21
  random-route: true
  path: test.jar
  memory: 512M
  buildpacks:
  - sap_java_buildpack
  env:
    TARGET_RUNTIME: tomcat
    JBP_CONFIG_COMPONENTS: "jres: ['com.sap.xs.java.buildpack.jdk.SAPMachineJDK']"
    JBP_CONFIG_SAP_MACHINE_JDK : "{ version: 21.+ }"
    JBP_CONFIG_JAVA_OPTS: "[java_opts: '-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints']"
But how would you profile this application? This and more is the topic of this blog post.
I will not discuss why you might want to use Cloud Foundry or how you can deploy your own applications. I assume you came this far in the blog post because you already have basic Cloud Foundry knowledge and want to learn how to profile your applications easily.
The Java Plugin
Cloud Foundry has a cf CLI with a proper plugin system and lots of plugins. A team at SAP, which included Tim Gerrlach, started to develop the Java plugin many years ago. It’s a plugin offering utilities to gain insights into JVMs running in your Cloud Foundry app.
You can simply install it via the official plugin repository (more in the README):
cf install-plugin java
It started with support for heap-dumps and thread-dumps:
> cf java heap-dump $APP_NAME
-> ./$APP_NAME-heapdump-$RANDOM.hprof
This remotely creates a heap dump and downloads it. You can view these files using tools like the Eclipse Memory Analyzer.
> cf java thread-dump $APP_NAME
...
Full thread dump OpenJDK 64-Bit Server VM ...
...
This command obtains and prints a thread dump that you can use to analyze the currently running threads of your applications. You can visualize these dumps with tools like Samurai.
Please be aware that this only works if you use a SAPJVM/SapMachine JRE/JDK or a non-SapMachine JDK. Only SapMachine JREs are guaranteed to include the necessary Java command-line tools that the Java plugin requires.
Profiling your Applications
But wouldn’t it be nice to profile your applications directly via the Java plugin? This is why the SapMachine Team took over the development of the CF plugin. We wanted to make it as easy as possible to record profiles, without having to cf ssh into the application's container and then manually download the recordings, which is cumbersome and error-prone.
So with the newest releases of the CF plugin, you have the power of JFR and Async-Profiler at your fingertips.
But please remember that we focus on SapMachine here.
Profiling with JFR
To profile a Java application and obtain the JFR file, you can simply use the jfr-* commands of the plugin (via cf java --help):
- jfr-start: Start a basic Java Flight Recorder profile on a running Java application
- jfr-start-profile: Start a recording with the profile.jfc setting, which records more information and has shorter profiling intervals
- jfr-start-gc: Start a recording with gc.jfc (SapMachine only), which focuses on garbage collection events
- jfr-start-gc-details: More garbage collection details
- jfr-stop: Stop a Java Flight Recorder recording on a running Java application and download the recording
- jfr-dump: Like jfr-stop, but without stopping the recording
- jfr-status: Check whether there is a running Java Flight Recorder recording
To, for example, check the status of the recording, you can run:
> cf java jfr-status $APP_NAME
No available recordings.
Use jcmd ... JFR.start to start a recording.
A simple profiling workflow usually looks like the following:
# Start recording
> cf java jfr-start $APP_NAME
# Interact with the application
...
# Stop and download the recording
> cf java jfr-stop $APP_NAME
-> creates a JFR file in your current local folder
You can then view the JFR files with the OpenJDK’s own jfr tool, JDK Mission Control, our plugin for IntelliJ, or many other Java profiling tools.
The jfr-* commands also work with the JDKs of non-SapMachine OpenJDK distributions.
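If you record profiles regularly, the start, interact, stop workflow above can be scripted. The following is only a minimal Python sketch, not part of the plugin: the helper names cf_java and record are made up for illustration, and it assumes that the cf CLI with the Java plugin is on your PATH. The run and sleep parameters exist only so that the sketch can be tried out without a real Cloud Foundry landscape.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def cf_java(command: str, app: str) -> List[str]:
    """Build the argument list for a `cf java <command> <app>` call."""
    return ["cf", "java", command, app]


def record(
    app: str,
    seconds: float,
    start: str = "jfr-start",
    stop: str = "jfr-stop",
    run: Callable[[Sequence[str]], object] = lambda argv: subprocess.run(argv, check=True),
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Start a recording, wait for `seconds`, then stop and download it."""
    run(cf_java(start, app))
    try:
        # Interact with the application during this time window.
        sleep(seconds)
    finally:
        # Always issue the stop command, so no recording is left running.
        run(cf_java(stop, app))
```

Calling record("my-app", 60) would then issue cf java jfr-start my-app, wait a minute, and issue cf java jfr-stop my-app. Passing start="asprof-start-cpu" and stop="asprof-stop" gives the same shape for the Async-Profiler commands described in the next section.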
Profiling with Async-Profiler
Last year, the SapMachine team decided to include a build of Async-Profiler directly in our JDK and JRE builds to make profiling Java applications as easy as possible. This has another benefit: We can use Async-Profiler’s native version of the Java tool jcmd, so we don’t need to start another JVM just to trigger a heap dump or thread dump. But back to the profiling. The CF plugin offers commands for Async-Profiler that are similar to the ones for JFR:
- asprof-start-cpu: Start a CPU-time Async-Profiler recording on a running Java application, creating a JFR file
- asprof-start-wall: Start a wall-clock recording (similar to JFR method profiling)
- asprof-start-alloc: Start an allocation profile that helps you to find heap-polluting classes
- asprof-start-lock: Start a lock profile that tracks whenever code obtains a lock
- asprof-status: Check whether there is a running Async-Profiler recording
A typical profiling workflow looks much like the previous JFR profiling workflow:
# Start recording
> cf java asprof-start-cpu $APP_NAME
# Interact with the application
...
# Stop and download the recording
> cf java asprof-stop $APP_NAME
-> creates a JFR file in your current local folder
There is also the asprof command that allows you to use Async-Profiler directly. But I would not recommend using it if any of the other commands are sufficient, as getting it right on the first try is pretty complicated and challenging.
The asprof-* commands only work on non-SapMachines if your Java distribution includes the asprof binary.
Other Commands
Three other CF plugin commands might be interesting:
> cf java vm-info $APP_NAME
#
# JRE version: OpenJDK Runtime Environment SapMachine (21.0.7+6) (build 21.0.7+6-LTS)
# Java VM: OpenJDK 64-Bit Server VM SapMachine (21.0.7+6-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
...
vm-info gives you lots of information on the current state of the JVM running your application.
> cf java vm-version $APP_NAME
OpenJDK 64-Bit Server VM version 21.0.7+6-LTS
JDK 21.0.7
vm-version gives you information on the current OpenJDK version that you’re running.
> cf java vm-vitals $APP_NAME
Vitals:
------------system------------
avail: Memory available without swapping [host] [krn]
comm: Committed memory [host]
crt: Committed-to-Commit-Limit ratio (percent) [host]
swap: Swap space used [host]
si: Number of pages swapped in [host] [delta]
so: Number of pages pages swapped out [host] [delta]
p: Number of processes
...
vm-vitals prints many JVM process statistics collected over the last 60 minutes and more coarsely over the previous days, but only on SapMachines.
There is also the jcmd command that allows you to use the jcmd Java utility directly, but I would, as with the asprof command, advise against using it directly.
Limitations
Of course, the plugin has limitations. For example, to get accurate profiles, you should add the following to the env section of your manifest.yml, as in the manifest at the top of this post:
JBP_CONFIG_JAVA_OPTS: "[java_opts: '-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints']"
Adapted from the plugin README:
The filesystem available to the container also limits the capability of creating heap dumps and profiles. The cf java heap-dump, cf java asprof-stop and cf java jfr-stop commands trigger a write to the file system, read the content of the file over the SSH connection, and then remove the file from the container’s file system (unless you have the -k flag set).
The amount of filesystem space available to a container is set for the entire Cloud Foundry landscape with a global configuration. The size of a heap dump is roughly linear with the allocated memory of the heap, and the profile size is related to the length of the recording. So, it could be that, in case of large heaps, long profiling durations, or the filesystem having too much stuff in it, there is not enough space on the filesystem for creating the file. In that case, the creation of the heap dump or profile recording, and thus the command, will fail.
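To make the space constraint concrete, here is a rough back-of-the-envelope check in Python. It is only a sketch under stated assumptions: the function name heap_dump_fits is made up, the default factor of 1.0 (one megabyte of dump per megabyte of heap) merely encodes the "roughly linear" relationship mentioned above, and the 10% headroom is an arbitrary safety margin. None of these values come from the plugin.

```python
def heap_dump_fits(
    heap_mb: float,
    free_disk_mb: float,
    dump_mb_per_heap_mb: float = 1.0,  # assumed linear factor, not a measured value
    headroom: float = 0.1,  # assumed share of free space to keep unused
) -> bool:
    """Estimate whether a heap dump is likely to fit on the container filesystem."""
    estimated_dump_mb = heap_mb * dump_mb_per_heap_mb
    usable_mb = free_disk_mb * (1.0 - headroom)
    return estimated_dump_mb <= usable_mb
```

With these assumptions, a 512 MB heap fits comfortably into 1024 MB of free space, while a 1000 MB heap does not. If the estimate is close, it is safer to expect the command to fail and to free up space first.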
From the perspective of integration in workflows and overall shell-friendliness, the plugin suffers from some shortcomings in the current cf-cli plugin framework:
- There is no distinction between stdout and stderr output from the underlying cf ssh command (see this issue on the cf-cli project)
  - The plugin will, however, mostly exit with status code 1 when the underpinning cf ssh command fails
  - If split between stdout and stderr is needed, you can run the cf java plugin in dry-run mode (--dry-run flag) and execute its output instead
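The dry-run workaround from the list above could be wrapped in a small Python helper. Treat this as a sketch and not as the plugin's API: the names dry_run_argv and run_split are invented here, the position of the --dry-run flag after the app name is an assumption, and so is the idea that the dry-run output is a single command line that can be passed to a shell as-is.

```python
import subprocess
from typing import Callable, List, Tuple


def dry_run_argv(command: str, app: str) -> List[str]:
    """Build the dry-run call (flag position after the app name is assumed)."""
    return ["cf", "java", command, app, "--dry-run"]


def run_split(
    command: str,
    app: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Tuple[int, str, str]:
    """Run a plugin command so that stdout and stderr stay separate."""
    # Step 1: ask the plugin which command it would execute.
    plan = run(dry_run_argv(command, app), capture_output=True, text=True, check=True)
    # Step 2: execute that command ourselves and capture both streams.
    # Assumption: the dry-run output is exactly one shell command line.
    result = run(plan.stdout.strip(), shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

The helper returns the exit code together with both streams, so a calling script can decide for itself what to log and what to treat as an error.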
Of course, running the commands has side effects. Please refer to the CF plugin README.
Please be aware that the CF Java plugin is mainly used (and tested) in combination with SapMachine; none of the plugin’s features currently work with non-SapMachine/SAPJVM JREs, and only some work with non-SapMachine JDKs.
Testing a Cloud Foundry CLI Plugin
By now, you’re hopefully convinced that the CF Java plugin is really useful. But how is the plugin tested to make sure it works?
Before the significant rewrite that added the profiling features, the plugin was tested by mocking some functions and checking that the executed SSH commands were the expected ones. This worked well enough while the plugin was small, but hard-coding all the used SSH commands became unsustainable with the growing list of commands.
Therefore, we now test the plugin directly, without mocking, against a small Java application. The CF plugin is written in Go, but the test cases are deliberately written in Python so that they treat the plugin as a black box. The test code includes a custom black-box testing framework for CF plugins.
To give you an example: The test case that checks that the Async-Profiler workflow from above works is written as follows:
class TestAsprofBasic(TestBase):
    # ...

    @test(no_restart=True)
    def test_basic_profile(self, t, app):
        """Test basic async-profiler profile start and stop."""
        # Start profiling
        (
            t.run(f"asprof-start-cpu {app}")  # run a command
            .should_succeed()  # exit code 0
            .should_contain(f"Use 'cf java asprof-stop {app}'")
            .no_files()  # no local or remote files created
        )
        # Clean up
        (
            t.run(f"asprof-stop {app}")
            .should_succeed()
            .should_create_file("*.jfr")  # a JFR file is created locally
            .should_create_no_remote_files()
        )
The goal for the tests is to be as readable as possible, so they also function as documentation. Feel free to peek into the test suite to find all the tested workflows.
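If you wonder how such a chainable style can be built, here is a minimal Python sketch of the idea. It is not the framework from the plugin's test suite: the class name Result is hypothetical, only three assertion methods modeled on the names above are shown, and a harmless Python one-liner stands in for a real cf java call.

```python
import subprocess
import sys
from typing import List


class Result:
    """Outcome of one command, with chainable assertions (illustrative only)."""

    def __init__(self, argv: List[str]) -> None:
        proc = subprocess.run(argv, capture_output=True, text=True)
        self.code = proc.returncode
        self.output = proc.stdout + proc.stderr

    def should_succeed(self) -> "Result":
        assert self.code == 0, f"expected exit code 0, got {self.code}"
        return self  # returning self is what makes the chaining possible

    def should_fail(self) -> "Result":
        assert self.code != 0, "expected a non-zero exit code"
        return self

    def should_contain(self, text: str) -> "Result":
        assert text in self.output, f"{text!r} not found in output"
        return self


# Usage with a stand-in command instead of `cf java ...`
(
    Result([sys.executable, "-c", "print('recording started')"])
    .should_succeed()
    .should_contain("recording started")
)
```

Every assertion returns the same object, so each test reads as a single sentence: run this, expect that. A check for created files could be added in the same way.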
Edge Cases
The test suite contains over a hundred tests and should test all of the common workflows, all of the flags and options, and most of the combinations, as well as many edge cases. It includes, for example, test cases that check how the plugin behaves with a non-SapMachine JRE:
class TestJRE21CommandFailures(TestBase):
    """Test that JRE21 app fails for all commands requiring JDK tools."""

    @test("jre21", no_restart=True)
    def test_heap_dump_fails(self, t, app):
        """Test that heap-dump fails on JRE21."""
        (
            t.run(f"heap-dump {app}")
            .should_fail()
            .should_contain("jvmmon or jmap are required for generating heap dump")
        )

    @test("jre21", no_restart=True)
    def test_thread_dump_fails(self, t, app):
        """Test that thread-dump fails on JRE21."""
        (
            t.run(f"thread-dump {app}")
            .should_fail()
            .should_contain("jvmmon or jmap are required for")
        )

    @test("jre21", no_restart=True)
    def test_vm_info_fails(self, t, app):
        """Test that vm-info fails on JRE21."""
        (
            t.run(f"vm-info {app}")
            .should_fail()
            .should_contain("jcmd not found")
        )

    # ...
Or how the plugin behaves with a full disk on the remote:
class TestDiskFull(TestBase):
    """Tests for disk full scenarios."""

    @test(no_restart=True)
    def test_heap_dump(self, t, app):
        """Test heap-dump functionality with disk full simulation."""
        with DiskFullContext(app):
            t.run(f"heap-dump {app}").should_fail().should_contain("No space left on device").no_files()

    # ...
With so many tests, the project consists of three times as much Python code as Go code. But this makes me quite confident that there shouldn’t be too many bugs in there. And even if there are bugs, it should be quite easy to add a small reproduction to the test suite. This is the perfect segue to the next section…
Issues?
The CF plugin is open to feature requests/suggestions, bug reports, etc. via GitHub issues. Contribution and feedback are encouraged and always welcome. Just be aware that this plugin is limited in scope to keep it maintainable. For more information about how to contribute, the project structure, and additional contribution information, see our Contribution Guidelines.
If you find a bug that may be a security problem, please follow the instructions in our security policy on how to report it. Please do not create GitHub issues for security-related doubts or problems.
Conclusion
The Cloud Foundry CLI Java plugin lets you easily profile your Java application and obtain heap and thread dumps. It builds on top of SapMachine to make running applications, like those based on CAP, easier in Cloud Foundry cloud environments.
See you next week for the fourth part of my series on Java’s new CPU-time profiler.
This blog post is part of my work in the SapMachine team at SAP, making profiling easier for everyone.





