Welcome back to my hello-ebpf series. Last week, we learned about lottery schedulers and how to implement one with sched-ext and hello-ebpf. But there was one major issue when writing the scheduler in Java: hello-ebpf lacked support for the bpf_for_each
macro. This is what this weeks short blog post is all about.
To loop over all elements in the scheduling queue, to pick a random one, we had to write inline C code. A simplified version without scheduling priorities looks like the following:
@Override public void dispatch(int cpu, Ptr<TaskDefinitions.task_struct> prev) { String CODE = """ // pick a random task struct index s32 random = bpf_get_prandom_u32() % scx_bpf_dsq_nr_queued(SHARED_DSQ_ID); struct task_struct *p = NULL; bpf_for_each(scx_dsq, p, SHARED_DSQ_ID, 0) { // iterate till we get to this number random = random - 1; // and try to dispatch it if (random <= 0 && tryDispatching(BPF_FOR_EACH_ITER, p, cpu)) { return 0; } }; return 0; """; }
This surely works, but inline C code is not the most readable and writing it is error-prone. I’ve written on this topic in my post Hello eBPF: Write your eBPF application in Pure Java (12) before. So what was keeping us from supporting the bpf_for_each
macro? Support for lambdas as arguments to built-in bpf functions.
Lottery Scheduler in Pure Java
May I introduce you know the support for directly passed lambda expressions: This let’s us define a bpf_for_each_dsq
function and to write a simple lottery scheduler in pure Java:
@BPF(license = "GPL") @Property(name = "sched_name", value = "lottery_stopping_scheduler") public abstract class LotteryScheduler extends BPFProgram implements Scheduler { private static final int SHARED_DSQ_ID = 0; @Override public int init() { return scx_bpf_create_dsq(SHARED_DSQ_ID, -1); } @Override public void enqueue( Ptr<TaskDefinitions.task_struct> p, long enq_flags) { var sliceLength = ((@Unsigned int) 5_000_000) / scx_bpf_dsq_nr_queued(SHARED_DSQ_ID); scx_bpf_dispatch(p, SHARED_DSQ_ID, sliceLength, enq_flags); } @BPFFunction @AlwaysInline public boolean tryDispatching( Ptr<BpfDefinitions.bpf_iter_scx_dsq> iter, Ptr<TaskDefinitions.task_struct> p, int cpu) { // check if the CPU is usable by the task if (!bpf_cpumask_test_cpu(cpu, p.val().cpus_ptr)) { return false; } return scx_bpf_dispatch_from_dsq(iter, p, SCX_DSQ_LOCAL_ON.value() | cpu, SCX_ENQ_PREEMPT.value()); } /** * Dispatch tasks * <p/> * Pick a random task from the shared DSQ and * try to dispatch it */ @Override public void dispatch(int cpu, Ptr<TaskDefinitions.task_struct> prev) { Box<Integer> random = Box.of(bpf_get_prandom_u32() % scx_bpf_dsq_nr_queued(SHARED_DSQ_ID)); Ptr<TaskDefinitions.task_struct> p = null; bpf_for_each_dsq(SHARED_DSQ_ID, p, iter -> { random.set(random.val() - 1); if (random.val() <= 0 && tryDispatching(iter, p, cpu)) { return; } }); } public static void main(String[] args) throws Exception { try (LotteryScheduler scheduler = BPFProgram.load(LotteryScheduler.class)) { scheduler.attachScheduler(); while (scheduler.isSchedulerAttachedProperly()) { Thread.sleep(1000); } } } }
To run it yourself, clone the hello-ebpf project, follow the setup instructions of the taskcontrol repository (at least a 6.12 kernel) and run the following:
sudo -s PATH=$PATH ./run.sh LotteryScheduler
bpf_for_each_dsq Macro Support
But how is the new bpf_for_each_dsq
function, modelling one instance of the bpf_for_each
macro modeled? It uses the templates of the @BuiltinBPFFunction
annotation and is implemented in the Scheduler class:
/** * Iterate over all tasks in the DSQ, using the {@code bpf_for_each} * macro. * <p> * This inserts the body of the lambda into the macro, * so {@code return} works differently, * for {@code break;} use {@link BPFJ#_break()} and for * {@code continue;} use {@link BPFJ#_continue()}. * Use {@link me.bechberger.ebpf.type.Box} for * using non-final variables from outside the lambda. * @param dsq_id queue id * @param cur current task in the loop * @param body lambda to execute for each task */ @BuiltinBPFFunction(""" bpf_for_each(scx_dsq, $arg2, $arg1, 0) { $lambda3:param1:type $lambda3:param1:name = BPF_FOR_EACH_ITER; $lambda3:code } """) default void bpf_for_each_dsq(int dsq_id, Ptr<TaskDefinitions.task_struct> cur, Consumer<Ptr<BpfDefinitions.bpf_iter_scx_dsq>> body) { throw new MethodIsBPFRelatedFunction(); }
The template might look cryptic, but it uses the newest additions to the template placeholders, introduced in Hello eBPF: Write your eBPF application in Pure Java (12):
$lambdaM:code
: the code of the M-th lambda (starting at one), use Box for using variables inside and outside$lambdaM:paramN
: variable declaration for parameter N of the M-th lambda$lambdaM:paramN:type
: type of the parameter N of the M-th lambda$lambdaM:paramN:name
: name of the parameter N of the M-th lambda- for the sake of completeness:
$argN
: Argument N
So $lambda3:param1:type $lambda3:param1:name
is thereby equivalent to “[type of the first parameter of the lambda argument 3 or body consumer] [name of the same parameter]”.
Box Class, Local Variables and Lambdas
But what is the Box class that is mentioned and used with this function? It’s a new built-in wrapper class that solves a fundamental problem with lambdas:
Any local variable, formal parameter, or exception parameter used but not declared in a lambda expression must either be
Section 15.27.2 of the JaVa Language Specificationfinal
or effectively final (§4.12.4), as specified in §6.5.6.1.
The reason for this lies in the implementation of lambdas in Java, we can approximate the lambda from the lottery scheduler as follows:
class LotteryLambda implements Consumer<task_struct> { private Box<Integer> random; private int cpu; private Ptr<task_struct> p; public LotteryLambda(Box<Integer> random, int cpu, Ptr<task_struct> p) { this.random = random; this.cpu = cpu; this.p = p; } @Override public void accept(Ptr<BpfDefinitions.bpf_iter_scx_dsq> iter) { random.set(random.val() - 1); if (random.val() <= 0 && tryDispatching(iter, p, cpu)) { return; } } } // lambda application new LotteryLambda(random, cpu, p).accept(iter)
Altering any value of a LotteryLambda inside the accept
method doesn’t change its value outside of the lambda. But boxing the value means that both the code inside and outside the accept method access the same Box object all the time. Because the Box object is effectively final, but its content isn’t.
The rules of the language specification prevent any confusion related to local variables, to make it as consistent as possible.
In hello-ebpf, we take our liberty with semantics, as with the function templates, we just replace $lambdaM:code
with the actual body of the lambda, so no such rules are needed in the C code. The Box class alleviates this by wrapping any value (and I used a similar class in many other projects):
@Type public class Box<T> { private T value; @BuiltinBPFFunction("$arg1") public Box(T value) { this.value = value; } @BuiltinBPFFunction("$arg1") public static <T> Box<T> of(T value) { return new Box<>(value); } @BuiltinBPFFunction("$this") public T val() { return this.value; } @BuiltinBPFFunction("$this = $arg1") public void set(T value) { this.value = value; } }
Be aware that the Box is specially handled in the compiler plugin of hello-ebpf, because the @Type
annotation does not support template arguments. Any usage of the Box type is just replaced by its containing value and type, when generating C code:
Box<Integer> box = Box.of(42); int value = box.val(); box.set(43);
Results in the following C code:
s32 box = 42; s32 value = box; box = 43;
Be aware that you cannot use the Box outside of method bodies, so don’t pass it directly to methods or store it in fields or global variables.
Conclusion
This blog post showed, how we can implement a scheduler in pure Java, extending the function template mechanism along the way. This shows how powerful the @BuiltinBPFFunction
annotation with its templates is and how it can be used to both model built-in functions and macros.
Thank you for joining me on this journey to bring eBPF and sched-ext to the Java world.