Hello eBPF: Write your eBPF application in Pure Java (12)

Welcome back to my series on eBPF. In the last post, I told you about BTF and generating Java classes for all BPF types. This week, we’re using these classes to write a simple packet blocker in pure Java. This is the culmination of my efforts, which started in my post Hello eBPF: Generating C Code (8), to reduce the amount of C code that you have to write to create your eBPF application.

This blog post again took longer than expected, but you’ll soon see why. I also dropped libbcc support along the way.

After my last blog post, you still had to write the eBPF methods in a String embedded in the Java application. So if you wanted to write a simple XDP-based packet blocker that blocks every third incoming packet, you wrote the actual XDP logic into a String-typed field named EBPF_PROGRAM. But we can already define the data types and global variables in Java, generating C code automatically. Can we do the same for the remaining C code? We can now. Introducing the new Java compiler plugin that allows you to write the above in “pure” Java, using Java as a DSL for C (GitHub):

@BPF(license = "GPL") // define a license
public abstract class XDPDropEveryThirdPacket 
  extends BPFProgram implements XDPHook {
    
    // declare the global variable
    final GlobalVariable<@Unsigned Integer> count = 
        new GlobalVariable<>(0);

    @BPFFunction
    public boolean shouldDrop() {
        return count.get() % 3 == 1;
    }

    @Override // defined in XDPHook, compiled to C
    public xdp_action xdpHandlePacket(Ptr<xdp_md> ctx) {
        // update count
        count.set(count.get() + 1);
        // drop based on count
        return shouldDrop() ? xdp_action.XDP_DROP : xdp_action.XDP_PASS;
    }

    public static void main(String[] args) 
      throws InterruptedException {
        try (XDPDropEveryThirdPacket program = 
             BPFProgram.load(XDPDropEveryThirdPacket.class)) {
            program.xdpAttach(XDPUtil.getNetworkInterfaceIndex());
            while (true) {
                System.out.println("Packet count " + 
                                   program.count.get());
                Thread.sleep(1000);
            }
        }
    }
}

The xdpHandlePacket method is specified in XDPHook as

/**
 * XDP hook function that gets passed all incoming packets
 * @param ctx XDP context which includes the network packet
 * @return what to do with the packet 
 *         ({@link xdp_action#XDP_PASS}, ...)
 */
@BPFFunction(section = "xdp")
@NotUsableInJava
xdp_action xdpHandlePacket(Ptr<xdp_md> ctx);

and its implementation is translated into the following C code, alongside the shouldDrop method:

bool shouldDrop() {
  return (count % 3) == 1;
}

SEC("xdp") enum xdp_action xdpHandlePacket(struct xdp_md *ctx) {
  count = (count + 1);
  return shouldDrop() ? XDP_DROP : XDP_PASS;
}

The generated C file can be found by looking for a XDPDropEveryThirdPacketImpl.c class in the target folder.
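
To see which packets this counter logic actually drops, here is a minimal plain-Java simulation. It involves no eBPF at all; the class name DropPatternSketch and its methods are made up for this illustration and only mirror the increment-then-check order of xdpHandlePacket and shouldDrop.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of the counter logic (hypothetical helper, not part of hello-ebpf)
class DropPatternSketch {
    private int count = 0; // stands in for the global variable "count"

    // Same order as xdpHandlePacket: increment first, then check count % 3 == 1
    boolean handlePacket() {
        count = count + 1;
        return count % 3 == 1;
    }

    // Returns the 1-based indexes of the packets that would be dropped
    static List<Integer> droppedPackets(int total) {
        DropPatternSketch sketch = new DropPatternSketch();
        List<Integer> dropped = new ArrayList<>();
        for (int packet = 1; packet <= total; packet++) {
            if (sketch.handlePacket()) {
                dropped.add(packet);
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println(droppedPackets(9)); // prints [1, 4, 7]
    }
}
```

Because the counter is incremented before the check, the first packet is already dropped, followed by every third packet after it.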

To learn more about XDP, please read Hello eBPF: XDP-based Packet Filter (9). To learn how global variables work, read Hello eBPF: Global Variables (10).

But it doesn’t end there: you can call all BPF helper functions and find them, along with their documentation (thanks Dylan), via your normal IDE autocompletion.

More convenient wrappers are defined in the BPFJ class, like this wrapper for bpf_trace_printk:

/**
 * Print a message to the trace log
 * <p>
 * Example: {@snippet :
 *     BPFJ.bpf_trace_printk("Hello, %s from BPF and more!", "World");
 *}
 * @param fmt format string
 * @param args arguments to the format string
 */
@BuiltinBPFFunction("bpf_trace_printk($arg1, sizeof($arg1), $args2_)")
@NotUsableInJava
public static void bpf_trace_printk(String fmt, Object... args) {
    throw new MethodIsBPFRelatedFunction();
}

Calling this method in the eBPF program part of the Java code applies the call template, rendering bpf_trace_printk("Hi") as bpf_trace_printk("Hi", sizeof("Hi")) in C. You can read more about the templates further down.

SystemCallHooks, Pointers and Maps

And you can do more than just write XDP hooks. You can, for example, implement a method of the SystemCallHooks interface to create a BPF program that counts the calls of openat2 per process name in a hash map (see HashMapSample on GitHub):

@BPF
public abstract class HashMapSample 
  extends BPFProgram implements SystemCallHooks {
    
    private static final int TASK_COMM_LEN = 16;

    @BPFMapDefinition(maxEntries = 256)
    BPFHashMap<@Size(TASK_COMM_LEN) String, @Unsigned Integer> map;

    @Override
    public void enterOpenat2(int dfd, String filename, 
                             Ptr<open_how> how) {
        @Size(TASK_COMM_LEN) String comm = "";

        // Read the current process name
        // use the constant defined in Java
        bpf_get_current_comm(Ptr.of(comm), TASK_COMM_LEN);

        // increment the counter at map[comm]
        Ptr<@Unsigned Integer> counter = map.bpf_get(comm);
        if (counter == null) {
            @Unsigned int one = 1;
            // using the same API as you would in Java
            map.put(comm, one);
        } else {
            counter.set(counter.val() + 1);
        }
    }
 
    public static void main(String[] args) 
      throws InterruptedException {
        try (HashMapSample program = 
          BPFProgram.load(HashMapSample.class)) {
            program.autoAttachPrograms();
            while (true) {
                System.out.println("OpenAt's per process:");
                // use the same map as in the eBPF program
                for (var entry : program.map) {
                    System.out.printf("%16s: %4d\n", 
                        entry.getKey(), entry.getValue());
                }
                Thread.sleep(1000);
            }
        }
    }
}

Learn more about maps and the code generation for @BPFMapDefinition in Hello eBPF: Generating C Code (8).

This application implements the enterOpenat2 method, which is defined as:

@BPFFunction(
    // template to generate the proper function header
    headerTemplate = "int BPF_PROG($name, int dfd, const char* filename, struct open_how* how)",
    // last statement, because the return value is always ignored
    // so our Java version can return void
    lastStatement = "return 0;",
    section = "fentry/do_sys_openat2",
    // supports attaching via BPFProgram#autoAttachPrograms() 
    autoAttach = true
)
default void enterOpenat2(int dfd, String filename, Ptr<OpenDefinitions.open_how> how) {
    throw new MethodIsBPFRelatedFunction();
}

It also makes use of the Ptr class, which models C pointers:

public class Ptr<T> {
    /** Dereference this pointer */
    @BuiltinBPFFunction("*($this)")
    @NotUsableInJava
    public T val() { ... }

    /** Create a pointer of the passed value,
     * <p>
     *  Has to be a proper l-value (?) that has a place in memory,
     *  e.g. {@code Ptr.of(3)} is not allowed.
     */
    @BuiltinBPFFunction("&($arg1)")
    @NotUsableInJava
    public static <T> Ptr<T> of(@Nullable T value) {
        throw new MethodIsBPFRelatedFunction();
    }
   
    /* ... */
}

It also uses the BPF map methods, which can be called from both the eBPF program and the Java user-land code, like BPFBaseMap#put:

@BuiltinBPFFunction("!bpf_map_update_elem(&($this), $pointery$arg1, $pointery$arg2, BPF_ANY)")
public boolean put(K key, V value) {
    return put(key, value, PutMode.BPF_ANY);
}

With this in mind, the eBPF program code of the example is translated to:

#define TASK_COMM_LEN 16

struct {
    __uint (type, BPF_MAP_TYPE_HASH);
    __uint (key_size, sizeof(char[16]));
    __uint (value_size, sizeof(u32));
    __uint (max_entries, 256);
} map SEC(".maps");

SEC("fentry/do_sys_openat2") int BPF_PROG(enterOpenat2, int dfd, 
 const char* filename, struct open_how* how) {
  char comm[16] = "";
  bpf_get_current_comm(comm, TASK_COMM_LEN);
  u32 *counter = bpf_map_lookup_elem(&(map), &comm);
  if ((counter == NULL)) {
    u32 one = 1;
    !bpf_map_update_elem(&(map), &comm, &(one), BPF_ANY);
  } else {
    *(counter) = (*(counter) + 1);
  }
  return 0;
}
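
The control flow of this generated code can be mirrored in plain Java, which may help when reading the C version. The sketch below is only a user-land stand-in: the class OpenatCounterSketch and its updateElem method are hypothetical, and the assumption that the update call returns 0 on success and a negative value on failure is what makes the leading "!" in the put template yield true on success.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the eBPF side of HashMapSample (nothing runs in the kernel here)
class OpenatCounterSketch {
    static final int MAX_ENTRIES = 256; // same value as maxEntries in @BPFMapDefinition

    final Map<String, Integer> map = new HashMap<>();

    // Hypothetical stand-in for bpf_map_update_elem with BPF_ANY:
    // returns 0 on success and a negative value on failure (here: the map is full)
    int updateElem(String key, int value) {
        if (!map.containsKey(key) && map.size() >= MAX_ENTRIES) {
            return -1;
        }
        map.put(key, value);
        return 0;
    }

    // Mirrors the "!bpf_map_update_elem(...)" template: true means success
    boolean put(String key, int value) {
        return updateElem(key, value) == 0;
    }

    // Same control flow as enterOpenat2: look up, then insert or increment
    void enterOpenat2(String comm) {
        Integer counter = map.get(comm); // null plays the role of the NULL pointer
        if (counter == null) {
            put(comm, 1);
        } else {
            map.put(comm, counter + 1); // in C: *(counter) = (*(counter) + 1)
        }
    }

    public static void main(String[] args) {
        OpenatCounterSketch sketch = new OpenatCounterSketch();
        for (String comm : new String[] {"java", "bash", "java"}) {
            sketch.enterOpenat2(comm);
        }
        System.out.println(sketch.map.get("java") + " " + sketch.map.get("bash")); // prints "2 1"
    }
}
```
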

But how does this work under the hood?

Implementation of the Translation

The translation is implemented using an annotation processor (see Hello eBPF: Generating C Code (8) for more) and a new Java compiler plugin for the methods themselves.

Compiler plugins are a way to process and modify the results of different compilation phases (like parsing, analysis and byte code generation), to implement new features, like the C code and eBPF compilation in our case.

Our compiler plugin essentially translates the annotated syntax tree of every @BPFFunction annotated method, using simple per tree-node rules. This is possible because the Java code and the generated C code mirror each other closely. The annotated syntax tree is the result of the analysis phase of the compiler and contains type information and the resolved methods. You can find the code in the bpf-compiler-plugin submodule.
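
As a rough illustration of such per-node rules, the following sketch translates a tiny, hypothetical syntax tree into C text. The node types (Literal, Name, Binary, Call, Conditional, Return) are invented for this example and are not the javac tree classes; the real plugin works on the annotated syntax tree and handles far more cases. The sketch assumes Java 16 or newer for records and pattern matching.

```java
import java.util.List;
import java.util.stream.Collectors;

class TreeTranslationSketch {
    // Hypothetical mini syntax tree; the real plugin works on javac's annotated tree instead
    interface Node {}
    record Literal(int value) implements Node {}
    record Name(String id) implements Node {}
    record Binary(String op, Node left, Node right) implements Node {}
    record Call(String method, List<Node> args) implements Node {}
    record Conditional(Node condition, Node ifTrue, Node ifFalse) implements Node {}
    record Return(Node value) implements Node {}

    // One simple rule per node type, applied recursively
    static String toC(Node node) {
        if (node instanceof Literal literal) {
            return Integer.toString(literal.value());
        } else if (node instanceof Name name) {
            return name.id();
        } else if (node instanceof Binary binary) {
            return "(" + toC(binary.left()) + " " + binary.op() + " " + toC(binary.right()) + ")";
        } else if (node instanceof Call call) {
            String args = call.args().stream()
                    .map(TreeTranslationSketch::toC)
                    .collect(Collectors.joining(", "));
            return call.method() + "(" + args + ")";
        } else if (node instanceof Conditional conditional) {
            return toC(conditional.condition()) + " ? " + toC(conditional.ifTrue())
                    + " : " + toC(conditional.ifFalse());
        } else if (node instanceof Return ret) {
            return "return " + toC(ret.value()) + ";";
        }
        // Anything without a rule is rejected, similar to unsupported Java features
        throw new IllegalArgumentException("Unsupported node: " + node);
    }

    public static void main(String[] args) {
        Node body = new Return(new Conditional(
                new Call("shouldDrop", List.of()),
                new Name("XDP_DROP"),
                new Name("XDP_PASS")));
        System.out.println(toC(body)); // prints "return shouldDrop() ? XDP_DROP : XDP_PASS;"
    }
}
```

Each rule only looks at one node and delegates to its children, which is why this style works well when the source and target languages are structurally similar.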

To use the compiler plugin, you have to pass its name (BPFCompilerPlugin) to javac. This can be done in Maven as follows:

<plugin>                                                                                   
  <groupId>org.apache.maven.plugins</groupId>                                              
  <artifactId>maven-compiler-plugin</artifactId>                                           
  <version>3.8.0</version>                                                                 
  <configuration>                                                                          
    <annotationProcessors>                                                                 
      <annotationProcessor>me.bechberger.ebpf.bpf.processor.Processor</annotationProcessor>
    </annotationProcessors>                                                                
    <compilerArgs>                                                                         
      <arg>-Xplugin:BPFCompilerPlugin</arg>                                                
    </compilerArgs>                                                                        
  </configuration>                                                                         
</plugin>

Be aware that the compiler plugin only works in conjunction with the annotation processor and that both have to be on the classpath. Furthermore, both require the opening of internal jdk.compiler modules, as they need access to many internal classes (see .mvn/jvm.config). Of course, this makes the compiler plugin slightly less stable and will make supporting newer JDK versions slightly more difficult, but this is the price to pay for the convenience the preprocessing offers.

Function Templates

One of the cornerstones of the translation is the set of templates defined in the @BuiltinBPFFunction annotation of every method that we want to call in our eBPF program but don’t define in the eBPF program itself. You saw these templates before on the map and pointer methods. The template “language” is pretty simple: the following placeholders are sequentially replaced with their actual values whenever the compiler plugin evaluates a method call node:

$return: C version of the return type of the method
$name: Name of the method
$args: Arguments of the method, comma-separated
$argN: Argument N, starting at one
$argsN_: Arguments N to the last argument, comma-separated
$this: The expression on which the method is called if the method is not a static method
$T1, $T2, …: Type parameters of the call (like Integer in ptr.<Integer>cast())
$C1, $C2, …: Type parameters of the type of $this
$strlen$this: Length of $this interpreted as a string literal
$strlen$argN: Length of $argN interpreted as a string literal
$str$argN: Asserts that $argN is a string literal
$pointery$argN: If $argN is not a pointer (or an array or a String), then prefix it with & and assume that it is an lvalue (like a variable), so the resulting expression always produces a pointer

Here are a few simple examples:

@BuiltinBPFFunction("$name($args)")                  
void func(int a, int b);                             
func(1, 2)                                           
// will be translated to                             
func(1, 2)                                             
                                                     
@BuiltinBPFFunction("func($arg1, $args2_, $arg1)")   
void func(int a, int b, int c);                      
func(1, 2, 3)                                        
// will be translated to                             
func(1, 2, 3, 1)                                     
                                                     
@BuiltinBPFFunction("func($str$arg1, sizeof($arg1))")
void func(String a);                                 
func("abc")                                          
// will be translated to                             
func("abc", sizeof("abc"))   

@BuiltinBPFFunction("$arg1 + $arg2")
<T> T add(T a, T b);
add(1, 2)                                          
// will be translated to                             
1 + 2

This is a really powerful tool, as it allows us to easily model C code in Java, modifying the generated code without modifying the compiler plugin itself.
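
The placeholder substitution can be sketched in a few lines of plain Java. This is not the plugin's implementation: the class TemplateSketch and its expand method are hypothetical, they only handle $name, $args, $argN, $argsN_ and $this, and they assume that receiver and arguments have already been translated to C text. The sketch does no cleanup when $argsN_ expands to nothing.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified template expander (subset of the placeholders listed above)
class TemplateSketch {
    // More specific alternatives come first, so "$args2_" is not mistaken for "$args"
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$args(\\d+)_|\\$arg(\\d+)|\\$args|\\$name|\\$this");

    static String expand(String template, String name, String receiver, List<String> args) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement;
            if (matcher.group(1) != null) {
                // $argsN_: arguments N to the last one, comma-separated (N starts at one)
                int from = Integer.parseInt(matcher.group(1));
                replacement = String.join(", ", args.subList(from - 1, args.size()));
            } else if (matcher.group(2) != null) {
                // $argN: a single argument, counting from one
                replacement = args.get(Integer.parseInt(matcher.group(2)) - 1);
            } else if (matcher.group().equals("$args")) {
                replacement = String.join(", ", args);
            } else if (matcher.group().equals("$name")) {
                replacement = name;
            } else {
                // $this: the expression the method is called on
                replacement = receiver;
            }
            // quoteReplacement keeps "$" and "\" inside arguments from being interpreted
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "func(1, 2, 3, 1)"
        System.out.println(expand("func($arg1, $args2_, $arg1)", "func", null, List.of("1", "2", "3")));
        // prints "*(counter)"
        System.out.println(expand("*($this)", "val", "counter", List.of()));
    }
}
```

Matching all placeholders with a single regular expression avoids the pitfall of naive string replacement, where replacing $args first would corrupt $args2_.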

Limitations

Of course, this approach has limitations. The compiler plugin only supports a limited number of Java features, most notably: variable definitions, for and while loops, if conditions, most operators, function calls, array expressions and constructor calls. This list might have grown by the time you read this, so just look into the samples and compiler plugin tests to get a glimpse of the current language support.

The written Java code should look as close as possible to proper Java code. But there are, of course, differences. Arrays have to have a @Size annotation if they are not directly declared; the same goes for other types, similar to the definition of data types.

Another issue is automatic type inference for the var keyword and automatic generic type parameter instantiation, which I can’t make use of in the compiler plugin, with generic type erasure in casts causing similar problems. Therefore, you can’t write

Ptr<Integer> intPtr = (Ptr<Integer>)Ptr.ofNull()
// or
Ptr<Integer> intPtr = Ptr.ofNull()
// or
Ptr<Integer> intPtr = Ptr.ofNull().cast()

But you have to explicitly state the generic type parameter:

Ptr<Integer> intPtr = Ptr.ofNull().<Integer>cast()

Besides the issues in the code generation, there might be issues with the generated C code. For example, functions can’t be used before they are defined. But the C compiler messages should be easy to understand.

Another issue is that even valid C code might compile to eBPF code that the eBPF verifier in the kernel has issues with. For example, unbounded loops or unchecked pointer accesses are problematic, because the verifier has to be sure that the program is valid.

And to close this section on the limitations, all eBPF program methods have to be defined in the eBPF program class. This should change in the future, but it currently hampers the ability to compose eBPF programs.

Conclusion

Being able to write eBPF programs directly in Java is important to make eBPF more accessible to Java developers, allowing them to write basic programs without switching to C code. Knowledge of eBPF is of course still required, and it helps to understand the generated C code when developing more complex programs.

I hope you liked what you saw, and thank you for joining me on the journey to create a Java library for eBPF. See you in a couple of weeks for the next eBPF-related blog post.

This article is part of my work in the SapMachine team at SAP, making profiling and debugging easier for everyone.

Author

Johannes Bechberger

Johannes Bechberger is a JVM developer working on profilers and their underlying technology in the SapMachine team at SAP. This includes improvements to async-profiler and its ecosystem, a website to view the different JFR event types, and improvements to the FirefoxProfiler, making it usable in the Java world. He started at SAP in 2022 after two years of research studies at the KIT in the field of Java security analyses. His work today is comprised of many open-source contributions and his blog, where he writes regularly on in-depth profiling and debugging topics, and of working on his JEP Candidate 435 to add a new profiling API to the OpenJDK.
