Hello eBPF: Recording data in event buffers (3)

Please be aware that this blog post uses the discontinued libbcc-based API in hello-ebpf.

Welcome back to my blog series on eBPF. Last week, I showed you how the eBPF program and Java application can communicate using eBPF maps. This allowed us to write an application that counts the number of execve calls per user.

This week, I’ll show you briefly how to use another kind of eBPF maps, the perf event buffer, and run tests with docker and JUnit 5.

This blog post is shorter than the previous one as I’m preparing for the OpenJDK committers workshop in Brussels and my Python and Java DevRoom talks at FOSDEM. I’m happy to meet my readers; say hi when you’re there.

Perf Event Buffer

Data structures, like the hash map described in the previous blog post, are great for storing data but have their limitation when we want to pass new bits of information continuously from the eBPF program to our user-land application. This is especially pertinent when recording performance events. So, in 2015, the Linux kernel got a new map type: BPF_MAP_TYPE_PERF_EVENT_ARRAY. This map type functions as a fixed-size ring buffer that can store elements of a given size and is allocated per CPU. The eBPF program submits data to the buffer, and the user-land application retrieves it. When the buffer is full, data can’t be submitted, and a drop counter is incremented.

Perf Event Buffers have their issues, as explained by Andrii Nakryiko, so in 2020, eBPF got ring buffers, which have less overhead. Perf Event Buffers are still used, as only Linux 5.8 and above supports ring buffers. It doesn’t make a difference for our toy examples, but I’ll show you how to use ring buffers in a few weeks.

You can read more about Perf Event Buffers in the Learning eBPF book by Liz Rice, pages 24 to 28.

Example

Now, to a small example, called chapter2.HelloBuffer, which records for every execve call the calling process id, the user id, and the current task name and transmits it to the Java application:

> ./run.sh chapter2.HelloBuffer
2852613 1000 code Hello World  # vs code
2852635 1000 code Hello World
2852667 1000 code Hello World
2852690 1000 code Hello World
2852742 1000 Sandbox Forked Hello World  # Firefox
2852760 1000 pool-4-thread-1 Hello World
2852760 1000 jspawnhelper Hello World    # Java ProcessBuilder
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World
2852760 1000 jspawnhelper Hello World

This gives us already much more information than the simple counter from my last blog post. The eBPF program to achieve this is as follows:

BPF_PERF_OUTPUT(output);                                                 
                                                                         
struct data_t {                                                          
    int pid;                                                             
    int uid;                                                             
    char command[16];                                                    
    char message[12];                                                    
};                                                                       
                                                                         
int hello(void *ctx) {                                                   
    struct data_t data = {};                                             
    char message[12] = "Hello World";                                    
    
    // obtain process and user id                                                                     
    data.pid = bpf_get_current_pid_tgid() >> 32;                         
    data.uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;                   
    
    // obtain the current task/thread/process name, 
    // without the folder, of the task that is currently
    // running                                                                     
    bpf_get_current_comm(&data.command, 
        sizeof(data.command));
    // "Safely attempt to read size bytes from kernel space
    //  address unsafe_ptr and store the data in dst." (man-page)           
    bpf_probe_read_kernel(&data.message, 
        sizeof(data.message), message); 
    
    // try to submit the data to the perf buffer                                                                     
    output.perf_submit(ctx, &data, sizeof(data));                        
                                                                         
    return 0;                                                            
}                                                                        

You can get more information on bpf_get_current_com, bpf_probe_read_kernel in the bpf-helpers(7) man-page.

The Java application that reads the buffer and prints the obtained information is not too dissimilar from the example in my previous blog post. We first define the Data type:

record Data(
   int pid, 
   int uid, 
   // we model char arrays as Strings
   // with a size annotation
   @Size(16) String command,
   @Size(12) String message) {}                                                                                                                              
 
// we have to model the data type as before                                                                                                                              
static final BPFType.BPFStructType<Data> DATA_TYPE = 
   new BPFType.BPFStructType<>("data_t",                              
        List.of(                                                                                                               
                new BPFType.BPFStructMember<>("pid", 
                     BPFType.BPFIntType.INT32, 0, Data::pid),                                  
                new BPFType.BPFStructMember<>("uid", 
                     BPFType.BPFIntType.INT32, 4, Data::uid),                                  
                new BPFType.BPFStructMember<>("command", 
                     new BPFType.StringType(16), 8, Data::command),                        
                new BPFType.BPFStructMember<>("message", 
                     new BPFType.StringType(12), 24, Data::message)),                      
        new BPFType.AnnotatedClass(Data.class, List.of()),                                                                     
            objects -> new Data((int) objects.get(0), 
                                (int) objects.get(1), 
                                (String) objects.get(2),
                                (String) objects.get(3)));

You might recognize that the BPF types now have the matching Java type in their type signature. I added this to have more type safety and less casting.

To retrieve the events from the buffer, we first have to open it and pass in a call-back. This call-back is called for every available event when we call PerfEventArray#perf_buffer_poll:

try (var b = BPF.builder("""                                                                                                    
        ...                                                                                                                     
        """).build()) {                                                                                                         
    var syscall = b.get_syscall_fnname("execve");                                                                               
    b.attach_kprobe(syscall, "hello");                                                                                          
                                                                                                                                
    BPFTable.PerfEventArray.EventCallback<Data> print_event = 
      (/* PerfEventArray instance */ array, 
       /* cpu id of the event */     cpu, 
       /* event data */              data, 
       /* size of the event data */  size) -> {                                     
        var d = array.event(data);                                                                                              
        System.out.printf("%d %d %s %s%n", 
            d.pid(), d.uid(), d.command(), d.message());                                         
    };                                                                                                                          
                                                                                                                                
    try (var output = b.get("output", 
         BPFTable.PerfEventArray.<Data>createProvider(DATA_TYPE))
             .open_perf_buffer(print_event)) { 
        while (true) {
            // wait till packages are available,
            // you can a timeout in milliseconds                                                                                                          
            b.perf_buffer_poll();                                                                                               
        }                                                                                                                       
    }                                                                                                                           
}                                                                                                                               
                                                                                                                                
                                                                                                                                
                                                                                                                                

Tests

I’m happy to announce that hello-ebpf now has its own test runner, which uses virtme and docker to run all tests in their own runtime with their own kernel. All this is wrapped in my testutil/bin/java wrapper so that you can run the tests using mvn test:

mvn -Djvm=testutil/bin/java

And the best part? All tests are written using plain JUnit 5. As an example, here is the HelloWorld test:

public class HelloWorldTest {
    @Test
    public void testHelloWorld() throws Exception {
        try (BPF b = BPF.builder("""
                int hello(void *ctx) {
                   bpf_trace_printk("Hello, World!");
                   return 0;
                }
                """).build()) {
            var syscall = b.get_syscall_fnname("execve");
            b.attach_kprobe(syscall, "hello");
            Utils.runCommand("uname", "-r");
            // read the first trace line
            var line = b.trace_readline();
            // assert its content
            assertTrue(line.contains("Hello, World!"));
        }
    }
}

There are currently only two tests, but I plan to add many more.

Conclusion

In this blog post, we learned about Perf Event Buffers, a valuable data structure for repeatedly pushing information from the eBPF program to the user-land application. Implementing this feature, we’re getting closer and closer to completing chapter 2 of the Learning eBPF book. Truth be told, the implementation in the GitHub repository supports enough of the BCC to implement the remaining examples and even the exercises from Chapter 2.

In the next part of the hello-ebpf series, I’ll show you how to tail call in eBPF to other eBPF functions and how to write your first eBPF application that uses the hello-ebpf library as a dependency.

Thanks for joining me on this journey to create a proper Java API for eBPF. Feel free to try the examples for yourself or even write new ones and join the discussions on GitHub. See you in my next blog post or at FOSDEM.

This article is part of my work in the SapMachine team at SAP, making profiling and debugging easier for everyone.

Hello eBPF: Recording data in basic eBPF maps (2)

Please be aware that this blog post uses the discontinued libbcc-based API in hello-ebpf.

Welcome back to my blog series on eBPF. Last week, I introduced eBPF, the series, and the project and showed how you can write a simple eBPF application with Java that prints “Hello World!” whenever a process calls execve:

public class HelloWorld {
  public static void main(String[] args) {
    try (BPF b = BPF.builder("""
            int hello(void *ctx) {
               bpf_trace_printk("Hello, World!");
               return 0;
            }
            """).build()) {
      var syscall = b.get_syscall_fnname("execve");
      b.attach_kprobe(syscall, "hello");
      b.trace_print();
    }
  }
}

But what if we want to send more information from our eBPF program to our userland application than just some logs? For example, to share the accumulated number of execve calls, the processes of a specific user called and transmits information akin to:

record Data(
     /** user id */
     @Unsigned long uid,
     /** group id */
     @Unsigned long gid, 
     /** count of execve calls */
     @Unsigned int counter) {}

This is what this week’s blog post is all about.

Communication

When two regular programs want to share information, they either send data via sockets or use shared memory that both programs can access:

eBPF uses none of the above two approaches: Working with sockets makes a shared state hard to maintain, and using shared memory is difficult because the eBPF program lives in the kernel and the Java program in userland. Accessing any userland memory from eBPF at all is deemed to be experimental, according to the official BPF Design Q&A:

Q: Can BPF overwrite arbitrary user memory?

A: Sort-of.

Tracing BPF programs can overwrite the user memory of the current task with bpf_probe_write_user(). Every time such program is loaded the kernel will print warning message, so this helper is only useful for experiments and prototypes. Tracing BPF programs are root only.

BPF Design Q&A

But how can we then communicate? This is where eBPF maps come in:

BPF ‘maps’ provide generic storage of different types for sharing data between kernel and user space. There are several storage types available, including hash, array, bloom filter and radix-tree. Several of the map types exist to support specific BPF helpers that perform actions based on the map contents.

BPF maps are accessed from user space via the bpf syscall, which provides commands to create maps, lookup elements, update elements and delete elements.

LINUX Kernel Documentation

These fixed-size data structures form the backbone of every eBPF application, and their support is vital to creating any non-trivial tool.

Using basic eBPF maps

Using these maps, we can implement our execve-call-counter eBPF program. We start with the simple version that just stores the counter in a simple user-id-to-counter hash map:

// macro to create a uint64_t to uin64_t hash map
BPF_HASH(counter_table);

// u64 (also known as uint64_t) is an unsigned
// integer with a width of 64 bits
// in Java terms, it's the unsigned version
// of long

int hello(void *ctx) {
   u64 uid;
   u64 counter = 0;
   u64 *p;

   uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
   p = counter_table.lookup(&uid);
   // p is null if the element is not in the map
   if (p != 0) {
      counter = *p;
   }
   counter++;
   counter_table.update(&uid, &counter);
   return 0;
}

This example is from the Learning eBPF book by Liz Rice, pages 21 to 23, where you can find a different take. And if you’re wondering why we’re using u64 instead of the more standard uint64_t, this is because the Linux kernel predates the definition of u64 (and other such types) in stdint.h (see StackOverflow), although today it’s possible to use both.

In this example, we first create a hash called counter_table using the bcc macro BPF_HASH. We can access the hash map using the bcc-only method lookup and update, which are convenience wrappers for void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) and long bpf_map_update_elem(struct bpf_map *map, const void *key,const void *value, u64 flags) (see the bpf-helpers man-page). Additionally, we use bpf_get_current_uid_gid() to get the current user-id:

u64 bpf_get_current_uid_gid(void)

Description Get the current uid and gid.

Return A 64-bit integer containing the current GID and UID, and created as such: current_gid << 32 | current_uid.

bpf-helpers man-page

A side note regarding naming: “table” and “map” are used interchangeably in the bcc Python-API and related examples, which I carried over into the Java-API for consistency.

Now to the userland program: The hello-ebpf Java API offers methods to access these maps and can be used to write a userland program, HelloMap, that prints the contents of the maps every few seconds:

public class HelloMap {
    public static void main(String[] args) 
      throws InterruptedException {
        try (var b = BPF.builder("""
                ...
                """).build()) {
            var syscall = b.get_syscall_fnname("execve");
            // attach the eBPF program to execve
            b.attach_kprobe(syscall, "hello");
            // create a mirror for the hash table eBPF map
            BPFTable.HashTable<Long, Long> counterTable = 
               b.get_table("counter_table", 
                           UINT64T_MAP_PROVIDER);
            while (true) {
                Thread.sleep(2000);
                // the map mirror implements the Java Map
                // interface with methods like 
                // Map.entrySet
                for (var entry : counterTable.entrySet()) {
                    System.out.printf("ID %d: %d\t", 
                                      entry.getKey(), 
                                      entry.getValue());
                }
                System.out.println();
            }
        }
    }
}

This program attaches the eBPF program to the execve system call and uses the HashTable map mirror to access the map counter_table.

You can run the example using the run.sh script (after you built the project via the build.sh script) as root on an x86 Linux:

> ./run.sh chapter2.HelloMap
ID 0: 1 ID 1000: 3
ID 0: 1 ID 1000: 3
ID 0: 1 ID 1000: 4
ID 0: 1 ID 1000: 11
ID 0: 1 ID 1000: 11
ID 0: 1 ID 1000: 12
...
ID 0: 22 ID 1000: 176

Here, user 0 is the root user, and user 1000 is my non-root user, I called ls in the shell with both users a few times to gather some data.

But maybe my map mirror is broken, and this data is just a fluke? It’s always good to have a way to check the content of the maps. This is where bpftool-map comes into play: We can use

> bpftool map list
2: prog_array  name hid_jmp_table  flags 0x0
        key 4B  value 4B  max_entries 1024  memlock 8512B
        owner_prog_type tracing  owner jited
40: hash  name counter_table  flags 0x0
        key 8B  value 8B  max_entries 10240  memlock 931648B
        btf_id 142

> bpftool map dump name counter_table
[{
        "key": 1000,
        "value": 163
    },{
        "key": 0,
        "value": 22
    }
]

We can see that our examples are in the correct ballpark.

To learn more about the features of bpftool, I highly recommend reading the article “Features of bpftool: the thread of tips and examples to work with eBPF objects” by Quentin Monnet.

Storing simple numbers in a map is great, but what if we want to keep more complex information as values in the map, like the Data record with user-id, group-id, and counter from the beginning of this article?

The most recent addition to the hello-ebpf project is the support of record/struct values in maps:

Storing more complex structs in maps

The eBPF code for this example is a slight extension of the previous example:

// record Data(
//    @Unsigned long uid, 
//    @Unsigned long gid, 
//    @Unsigned int  counter
// ){}
struct data_t {
   u64 uid;
   u64 gid;
   u32 counter;
};
                
// u64 to data_t map
BPF_HASH(counter_table, u64, struct data_t);
                
int hello(void *ctx) {
   // get user id
   u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
   // get group id
   u64 gid = bpf_get_current_uid_gid() >> 32;
   // create data object 
   // with uid, gid and counter=0
   struct data_t info = {uid, gid, 0};
   struct data_t *p = counter_table.lookup(&uid);
   if (p != 0) {
      info = *p;
   }
   info.counter++;
   counter_table.update(&uid, &info);
   return 0;
}

The Java application is slightly more complex, as we have to model the data_t struct in Java. We start by defining the record Data as before:

record Data(
     /** user id */
     @Unsigned long uid,
     /** group id */
     @Unsigned long gid, 
     /** count of execve calls */
     @Unsigned int counter) {}

The @Unsigned annotation is part of the ebpf-annotations module and allows you to document type properties that aren’t present in Java.

The mirror BPFType for structs in hello-ebpf BPFType.BPFStructType:

/**
 * Struct
 *
 * @param bpfName     name of the struct in BPF
 * @param members     members of the struct, 
 *                    order should be the same as 
 *                    in the constructor
 * @param javaClass   class that represents the struct
 * @param constructor constructor that takes the members 
 *                    in the same order as 
 *                    in the constructor
 */
record BPFStructType(String bpfName, 
                    List<BPFStructMember> members, 
                    AnnotatedClass javaClass,
                    Function<List<Object>, ?> constructor) 
    implements BPFType

Which model struct members as follows:

/**
 * Struct member
 *
 * @param name   name of the member
 * @param type   type of the member
 * @param offset offset from the start of the struct in bytes
 * @param getter function that takes the struct and returns the member
 */
record BPFStructMember(String name, 
                       BPFType type, 
                       int offset, 
                       Function<?, Object> getter)

With these classes, we can model our data_t struct as follows:

BPFType.BPFStructType DATA_TYPE = 
    new BPFType.BPFStructType("data_t",
        List.of(
          new BPFType.BPFStructMember(
            "uid", 
            BPFType.BPFIntType.UINT64, 
            /* offset */ 0, (Data d) -> d.uid()),
          new BPFType.BPFStructMember(
            "gid", 
            BPFType.BPFIntType.UINT64, 
            8, (Data d) -> d.gid()),
          new BPFType.BPFStructMember(
            "counter", 
            BPFType.BPFIntType.UINT32, 
            16, (Data d) -> d.counter())),
        new BPFType.AnnotatedClass(Data.class, List.of()),
            objects -> 
              new Data((long) objects.get(0), 
                       (long) objects.get(1), 
                       (int) objects.get(2)));

This is cumbersome, I know, but it will get easier soon, I promise.

The DATA_TYPE type can then be passed to the BPFTable.HashTable to create the UINT64T_DATA_MAP_PROVIDER:

BPFTable.TableProvider<BPFTable.HashTable<@Unsigned Long, Data>> 
    UINT64T_DATA_MAP_PROVIDER =
        (/* BPF object */ bpf, 
         /* map id in eBPF */ mapId, 
         /* file descriptor of the map */ mapFd, 
         /* name of the map */ name) ->
                new BPFTable.HashTable<>(
                     bpf, mapId, mapFd, 
                     /* key type */   BPFType.BPFIntType.UINT64, 
                     /* value type */ DATA_TYPE, 
                     name);

We use this provider to access the map with BPF#get_table:

public class HelloStructMap {

    // ...

    public static void main(String[] args) 
      throws InterruptedException {
        try (var b = BPF.builder("""
                // ...
                """).build()) {
            var syscall = b.get_syscall_fnname("execve");
            b.attach_kprobe(syscall, "hello");

            var counterTable = b.get_table("counter_table", 
                 UINT64T_DATA_MAP_PROVIDER);
            while (true) {
                Thread.sleep(2000);
                for (var value : counterTable.values()) {
                    System.out.printf(
                       "ID %d (GID %d): %d\t", 
                       value.uid(), value.gid(), 
                       value.counter());
                }
                System.out.println();
            }
        }
    }
}

We can run the example and get the additional information:

> ./run.sh own.HelloStructMap
ID 0 (GID 0): 1 ID 1000 (GID 1000): 3
ID 0 (GID 0): 1 ID 1000 (GID 1000): 9
...
ID 0 (GID 0): 1 ID 1000 (GID 1000): 13
ID 0 (GID 0): 5 ID 1000 (GID 1000): 14

> bpftool map dump name counter_table
[{
        "key": 0,
        "value": {
            "uid": 0,
            "gid": 0,
            "counter": 5
        }
    },{
        "key": 1000,
        "value": {
            "uid": 1000,
            "gid": 1000,
            "counter": 13
        }
    }
]

Granted, it doesn’t give you more insights into the observed system, but it is a showcase of the current state of the map support in hello-ebpf.

Conclusion

eBPF maps are the primary way to communicate information between the eBPF program and the userland application. Hello-ebpf gained with this blog post support for basic eBPF hash maps and the ability to store structures in these maps. But of course, hash maps are not the only type of maps; we’ll add support for other map types, like perf maps and queues, in the next blog posts, as well as making the struct definitions a little bit easier. So stay tuned.

Thanks for joining me on this journey to create a proper Java API for eBPF. Feel free to try the examples for yourself or even write new ones and join the discussions on GitHub. See you in my next blog post.

This article is part of my work in the SapMachine team at SAP, making profiling and debugging easier for everyone. Thanks to Mohammed Aboullaite for answering my many questions.

Hello eBPF: Developing eBPF Apps in Java (1)

Please be aware that this blog post uses the discontinued libbcc-based API in hello-ebpf.

eBPF allows you to attach programs directly to hooks in the Linux kernel without loading kernel modules, like hooks for networking or executing programs. This has historically been used for writing custom package filters in firewalls. Still, nowadays, it is used for monitoring and tracing, becoming an ever more critical building block of modern observability tools. To quote from ebpf.io:

Historically, the operating system has always been an ideal place to implement observability, security, and networking functionality due to the kernel’s privileged ability to oversee and control the entire system. At the same time, an operating system kernel is hard to evolve due to its central role and high requirement towards stability and security. The rate of innovation at the operating system level has thus traditionally been lower compared to functionality implemented outside of the operating system.

eBPF changes this formula fundamentally. It allows sandboxed programs to run within the operating system, which means that application developers can run eBPF programs to add additional capabilities to the operating system at runtime. The operating system then guarantees safety and execution efficiency as if natively compiled with the aid of a Just-In-Time (JIT) compiler and verification engine. This has led to a wave of eBPF-based projects covering a wide array of use cases, including next-generation networking, observability, and security functionality.

Today, eBPF is used extensively to drive a wide variety of use cases: Providing high-performance networking and load-balancing in modern data centers and cloud native environments, extracting fine-grained security observability data at low overhead, helping application developers trace applications, providing insights for performance troubleshooting, preventive application and container runtime security enforcement, and much more. The possibilities are endless, and the innovation that eBPF is unlocking has only just begun.

Writing eBPF apps

On the lowest level, eBPF programs are compiled down to eBPF bytecode and attached to hooks in the kernel via a syscall. This is tedious; so many libraries for eBPF allow you to write applications using and interacting with eBPF in C++, Rust, Go, Python, and even Lua.

But there are none for Java, which is a pity. So… I decided to write bindings using the new Foreign Function API (Project Panama, preview in 21) and bcc, the first and widely used library for eBPF, which is typically used with its Python API and allows you to write eBPF programs in C, compiling eBPF programs dynamically at runtime.

That’s why I wrote From C to Java Code using Panama a few weeks ago.

Anyway, I’m starting my new blog series and eBPF library hello-ebpf:

Let’s discover eBPF together. Join me on the journey to write all examples from the Learning eBPF book (get it also from Bookshop.org, Amazon, or O’Reilly) by Liz Rice and more in Java, implementing a Java library for eBPF along the way, with a blog series to document the journey. I highly recommend reading the book alongside my articles; for this blog post, I read the book till page 18.

The project is still in its infancy, but I hope that we can eventually extend the overview image from ebpf.io with a duke:

Goals

The main goal is to provide a library (and documentation) for Java developers to explore eBPF and write their own eBPF programs without leaving their favorite language and runtime.

The initial goal is to be as close to the BCC Python API as possible to port the book’s examples to Java easily. You can find the Java versions of the examples in the src/main/me/bechberger/samples and the API in the src/main/me/bechberger/bcc directory in the GitHub repository.

Implementation

The Python API is just a wrapper around the bcc library using the built-in cffi, which extends the raw bindings to improve usability. The initial implementation of the library is a translation of the Python code to Java 21 code with Panama for FFI.

For example the following method of the Python API

    def get_syscall_fnname(self, name):
        name = _assert_is_bytes(name)
        return self.get_syscall_prefix() + name

is translated into Java as follows:

    public String get_syscall_fnname(String fnName) {
        return get_syscall_prefix() + fnName;
    }

This is the reason why the library has the same license as the Python API, Apache 2.0. The API is purposefully close to the Python API and only deviates where absolutely necessary, adding a few helper methods to improve it slightly. This makes it easier to work with the examples from the book and speeds up the initial development. But finishing a translation of the Python API is not the end goal:

Plans

A look ahead into the future so you know what to expect:

  • Implement the full API so that we can recreate all bcc examples from the book
  • Make it adequately available as a library on Maven Central
  • Support the newer libbpf library
  • Allow writing eBPF programs in Java

These plans might change, but I’ll try to keep this current. I’m open to suggestions, contributions, and ideas.

Contributing

Contributions are welcome; just open an issue or a pull request. Discussions take place in the discussions section of the GitHub repository. Please spread the word if you like it; this greatly helps the project.

I’m happy to include more example programs, API documentation, helper methods, and links to repositories and projects that use this library.

Running the first example

The Java library is still in its infancy, but we are already implementing the most basic eBPF program from the book that prints “Hello World!” every time a new program is started via the execve system call:

> ./run.sh bcc.HelloWorld
           <...>-30325   [042] ...21 10571.161861: bpf_trace_printk: Hello, World!
             zsh-30325   [004] ...21 10571.164091: bpf_trace_printk: Hello, World!
             zsh-30325   [115] ...21 10571.166249: bpf_trace_printk: Hello, World!
             zsh-39907   [127] ...21 10571.167210: bpf_trace_printk: Hello, World!
             zsh-30325   [115] ...21 10572.231333: bpf_trace_printk: Hello, World!
             zsh-30325   [060] ...21 10572.233574: bpf_trace_printk: Hello, World!
             zsh-30325   [099] ...21 10572.235698: bpf_trace_printk: Hello, World!
             zsh-39911   [100] ...21 10572.236664: bpf_trace_printk: Hello, World!
 MediaSu~isor #3-19365   [064] ...21 10573.417254: bpf_trace_printk: Hello, World!
 MediaSu~isor #3-22497   [000] ...21 10573.417254: bpf_trace_printk: Hello, World!
 MediaPD~oder #1-39914   [083] ...21 10573.418197: bpf_trace_printk: Hello, World!
 MediaSu~isor #3-39913   [116] ...21 10573.418249: bpf_trace_printk: Hello, World!

This helps you track the processes that use execve and lets you observe that Firefox (via MediaSu~isor) creates many processes and see whenever a Z-Shell creates a new process.

The related code can be found in chapter2/HelloWorld.java:

public class HelloWorld {
  public static void main(String[] args) {
    try (BPF b = BPF.builder("""
            int hello(void *ctx) {
               bpf_trace_printk("Hello, World!");
               return 0;
            }
            """).build()) {
      var syscall = b.get_syscall_fnname("execve");
      b.attach_kprobe(syscall, "hello");
      b.trace_print();
    }
  }
}

The eBPF program appends a “Hello World” trace message to the /sys/kernel/debug/tracing/trace DebugFS file via bpf_trace_printk everytime the hello method is called. The trace has the following format: “<current task, e.g. zsh>-<process id> [<CPU id the task is running on>] <options> <timestamp>: <appending ebpf method>: <actual message, like 'Hello World'>“. But bpf_trace_printk is slow, it should only be used for debugging purposes.

The Java code attaches the hello method to the execve system call and then prints the lines from the /sys/kernel/debug/tracing/trace file. The program is equivalent to the Python code from the book. But, of course, many features have not yet been implemented and so the programs you can write are quite limited.

Conclusion

eBPF is an integral part of the modern observability tech stack. The hello-ebpf Java library will allow you to write eBPF applications directly in Java for the first time. This is an enormous undertaking for a side project so it will take some time. With my new blog series, you can be part of the journey, learning eBPF and building great tools.

I plan to write a blog post every few weeks and hope you join me. You wouldn’t be the first: Mohammed Aboullaite has already entered and helped me with his eBPF expertise. The voyage will hopefully take us from the first hello world examples shown in this blog post to a fully fledged Java eBPF library.

This article is part of my work in the SapMachine team at SAP, making profiling and debugging easier for everyone. Thank you to Martin Dörr and Lukas Werling who helped in the preparation of this article.