I’m not going to lie: I have a strong hatred of the Berkeley Packet Filter (BPF). There are a lot of reasons, most of them having to do with supporting BPF in a network monitoring tool. There’s also the challenge of writing BPF filters and the weird way they work. So when I first heard about eBPF, I was more than a little reluctant to get excited. As I dug in further, though, I became much more excited about the technology and the benefits it can bring.
So, what is eBPF then? Well, in the words of the eBPF Foundation:
eBPF (which is no longer an acronym for anything) is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.
We now know a few things. First, we know that eBPF, while related to BPF in some ways, is a whole new beast. Second, we know that eBPF is about running programs directly in a system kernel (such as the Linux kernel). But what does that mean, really?
The Linux kernel is a highly important piece of software. It, quite literally, is what makes Linux work. The kernel contains all the code needed for systems to work and acts as the interface between the physical hardware on which it is running and the software processes using that hardware. The kernel is responsible for handling all the communications between these two things and ensuring that they work correctly. For a more detailed look at exactly what the kernel is, take a look at Red Hat’s description.
In an ideal world, the kernel would be entirely invisible to the end user. You turn your computer on, launch your programs and they just work. Also keep in mind that all computers are dependent on some sort of kernel for interfacing between software and hardware.
Of course, the kernel is not the be-all and end-all of the computing world. There are three main things to keep in mind:
These issues, at least in the Linux world, resulted in the concept of modules. Modules are little bits of code containing things like device drivers that can be loaded on the fly. Effectively, modules enable you to add functionality to the kernel without having to rebuild the kernel with all new code. There are some benefits to this:
Great, you say, so what does all this have to do with eBPF? Well, there are some challenges with kernel modules. The biggest is that the kernel does not keep its internal interfaces consistent from one version to the next. This means that a module built for, and working with, one specific version of the Linux kernel will generally not work with a different kernel version. Basically, you need to rebuild each module for every kernel version you want to use it with. And since kernels are released regularly, this can be a lot of work. Wouldn’t it be nice if we had something better? Enter eBPF!
eBPF allows developers to write and run code directly in the Linux kernel. This may sound scary (and it can be), but the kernel and eBPF development communities have spent a lot of time coming up with safe approaches to this. When eBPF code is deployed in the kernel, it runs in a sandbox (basically a special space within the kernel dedicated to the eBPF program, from which it should not be able to escape). From within the sandbox, the program interfaces with the rest of the kernel via standard Application Programming Interfaces (APIs). This means a program written for eBPF should keep working across kernel versions that provide the APIs it uses (unless the eBPF APIs change, something that is usually communicated well in advance).
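To make the sandbox idea a little more concrete, here is a toy model in Python. To be clear, this is not real eBPF and not how the kernel implements it; it is only a sketch of the principle described above, where a program can reach the "kernel" only through a fixed set of API calls and anything else is refused before the program runs. Every name in it (`HELPERS`, `get_pid`, `count_event`, `load`, `run`, `RejectedProgram`) is made up for illustration.

```python
"""Toy model of the eBPF sandbox idea. Not real eBPF.

A "program" is a list of (helper_name, args) steps. The only way it can
touch the toy kernel state is through the helper table below.
"""


# Hypothetical helpers standing in for the APIs the kernel exposes.
def get_pid(kernel):
    return kernel["current_pid"]


def count_event(kernel, name):
    kernel["counters"][name] = kernel["counters"].get(name, 0) + 1
    return kernel["counters"][name]


HELPERS = {"get_pid": get_pid, "count_event": count_event}


class RejectedProgram(Exception):
    """Raised when a program asks for something outside the helper table."""


def load(program):
    """Check every step up front, so a bad program never starts running."""
    for helper_name, _args in program:
        if helper_name not in HELPERS:
            raise RejectedProgram(f"unknown helper: {helper_name}")
    return program


def run(program, kernel):
    """Load (check) the program, then execute it against the toy kernel."""
    results = []
    for helper_name, args in load(program):
        results.append(HELPERS[helper_name](kernel, *args))
    return results


if __name__ == "__main__":
    kernel = {"current_pid": 1234, "counters": {}}

    good = [("get_pid", ()), ("count_event", ("open",)), ("count_event", ("open",))]
    print(run(good, kernel))

    bad = [("count_event", ("open",)), ("write_kernel_memory", (0xDEAD,))]
    try:
        run(bad, kernel)
    except RejectedProgram as err:
        print("rejected:", err)
```

The design choice worth noticing is that the check happens for the whole program before the first step executes, so the second program above is refused without changing any counters. It also hints at why the API approach helps with portability: as long as the helper table stays the same, the program does not care what the rest of the toy kernel looks like.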
This all sounds great! Now that you know a bit about eBPF, you may be asking yourself, “Great, so what can I actually do with this thing?” I am so glad you asked. While the technology is really just getting started (an important thing to remember), here are a few areas where eBPF can bring benefits today:
There are organizations and software using eBPF today. A few examples include:
Sadly, things are not all sunshine and roses. Because of where eBPF programs run and their unprecedented access to kernel-level data, it is also possible to write and deploy nefarious eBPF programs. While there are no known malicious eBPF programs in the wild as of the time of this writing, there are example programs available that show what is possible.
The takeaway from all this is that eBPF is an interesting, relatively new technology that is going to allow for some amazing improvements in the way we interact with data on systems of all types (even Windows)!