Learning eBPF: Programming the Linux Kernel for Enhanced Observability, Networking, and Security

Speaker 1

00:00

Imagine for a moment, being able to fundamentally alter how the very core of your operating system functions, not just patching it, but dynamically changing its behavior, you know, without rebooting or even recompiling a single line of code.

Speaker 2

00:14

Yeah, that sounds pretty wild.

Speaker 1

00:16

That almost sounds like a superpower, right? The ability to peer into and even reshape your Linux kernel on the fly. How is that even possible?

Speaker 2

00:23

It's certainly a profound change in how we interact with the kernel. And that transformative technology is called eBPF. This isn't just some minor upgrade. It's really a revolutionary platform. It's fundamentally changing how we build high-performance security, observability, and networking tools.

Speaker 1

00:42

And that revolutionary aspect is what we're here to unpack today. For this deep dive, our mission is to explore the power behind eBPF. We'll uncover what it is, why it's become so incredibly important, especially, you know, in recent years with cloud native stuff, and how it actually delivers on this promise of unparalleled insight and control. All our insights today, by the way, are drawn from the excellent book Learning eBPF by Liz Rice.

Speaker 2

01:07

A great resource. And you know, even if you listening aren't planning to write eBPF code yourself, just understanding the fundamentals gives you a significant advantage. You're likely to come across eBPF-based tooling more and more. It's rapidly becoming standard in software infrastructure roles, and knowing how it works underneath will help you use those tools much more effectively. It gives you a real edge.

Speaker 1

01:30

Okay, so let's unpack this. To really get eBPF, we probably need to start with the Linux kernel itself. How would you describe its role for someone maybe not deep in OS internals?

Speaker 2

01:40

Sure, think of the kernel as the essential software layer, sitting directly between your applications and the physical hardware.

Speaker 1

01:49

The mediator.

Speaker 2

01:50

Exactly. It's the traffic cop for all interactions. It mediates every request from user space apps through things called system calls, or syscalls, like reading files, sending network traffic, even just accessing memory. The kernel handles it all. Everything ultimately goes through the kernel.

Speaker 1

02:07

And because applications rely so heavily on the kernel for, well, everything, being able to observe those interactions must give you incredible visibility into what an app is doing.

Speaker 2

02:17

That's the key.

Speaker 1

02:18

But where did this idea of hooking into the kernel start? It had pretty humble origins, didn't it?

Speaker 2

02:23

Oh, absolutely. The BPF in eBPF originally stood for Berkeley Packet Filter. It started way back in 1992.

Speaker 1

02:29

'92, okay.

Speaker 2

02:31

Yeah, simply as a way to efficiently filter network packets. We're talking general-purpose 32-bit instructions that looked a lot like assembly language.

Speaker 1

02:38

So, like, for tcpdump or something?

Speaker 2

02:40

Precisely. A tiny piece of BPF code could, say, filter out non-IP packets from an Ethernet stream, letting tcpdump capture only what was relevant. That efficiency was groundbreaking for network analysis back then.
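
The filter described here can be sketched in a few lines. This is a plain Python simulation of the decision logic, not classic BPF bytecode; the function names and the `make_frame` helper are hypothetical, and the only facts relied on are that an Ethernet II header is 14 bytes long with the EtherType in bytes 12 and 13, and that IPv4 uses EtherType 0x0800.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # standard EtherType value for IPv4


def accept_ipv4_only(frame: bytes) -> bool:
    """Return True to keep the frame, False to filter it out."""
    if len(frame) < 14:  # too short to hold an Ethernet header
        return False
    # Bytes 12-13 of an Ethernet II header hold the EtherType (big-endian).
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    return ethertype == ETHERTYPE_IPV4


def make_frame(ethertype: int, payload: bytes = b"") -> bytes:
    """Hypothetical helper: build a frame with zeroed MAC addresses."""
    return bytes(12) + struct.pack("!H", ethertype) + payload


frames = [make_frame(0x0800, b"ip"), make_frame(0x0806, b"arp")]
kept = [f for f in frames if accept_ipv4_only(f)]
```

A capture tool would hand only the frames in `kept` to user space, which is the efficiency gain being described.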

Speaker 1

02:52

So the original BPF came into Linux in 1997, kernel 2.1.75, for that purpose. But I guess the packet filter part of the name started to make less sense over time.

Speaker 2

03:02

It did. It became much more than just packets.

Speaker 1

03:04

What was the turning point? When did it become more versatile?

Speaker 2

03:07

Well, the first step beyond just network packets was probably seccomp-bpf in 2012, kernel 3.5.

Speaker 1

03:13

seccomp-bpf, right, for syscall filtering?

Speaker 2

03:17

Exactly. It let BPF programs decide whether to allow or deny specific system calls made by user space apps, so you could restrict what an application could do at the kernel level.

Speaker 1

03:28

A security thing.

Speaker 2

03:29

Yeah, foundational for container security later on. But the real extended revolution, the birth of eBPF as we know it, began in kernel 3.18. That was 2014.

Speaker 1

03:39

Okay, 2014. And what changed then? It sounds like it was more than just an incremental update.

Speaker 2

03:44

Oh, it was a complete overhaul. The instruction set became 64-bit, the interpreter was rewritten, and crucial new features were added. This is where we got eBPF maps for sharing data between programs and user space.

Speaker 1

03:56

Ah, okay, that sounds important.

Speaker 2

03:57

Hugely important. Also the bpf() system call for user space interaction, and maybe most critically for safety, the eBPF verifier was introduced. We'll definitely need to talk more about that.

Speaker 1

04:08

Absolutely. So if adding new kernel features is traditionally, you know, a massive undertaking, months, maybe years to get changes upstream, how does eBPF change that? What makes it such a game changer today?

Speaker 2

04:22

That's the core of its appeal, really. eBPF fundamentally changes how we can extend the kernel. It allows for dynamic kernel functionality, meaning eBPF programs can be loaded and unloaded on demand. This dramatically speeds up adding new kernel features. We're talking maybe months, not the years it usually takes for upstream kernel changes.

Speaker 1

04:42

Wow.

Speaker 2

04:42

So you can create and deploy custom kernel functionality incredibly quickly, without needing, you know, the entire Linux world to agree and merge it into the main kernel.

Speaker 1

04:51

And it's not just about speed of deployment, is it? Performance is key too.

Speaker 2

04:54

Absolutely crucial. Once an eBPF program is loaded and JIT-compiled, that is, just-in-time compiled, it runs as native machine instructions directly on the CPU. This avoids those costly back-and-forth transitions between kernel space and user space for every single event.

Speaker 1

05:10

Which makes it incredibly efficient.

Speaker 2

05:12

Remarkably efficient. It sounds almost counterintuitive for something running inside the kernel, but yeah, it allows eBPF tools to operate with minimal overhead.

Speaker 1

05:20

That efficiency sounds like a massive win, especially in cloud native environments. We often see that sidecar model in Kubernetes for injecting things like logging, tracing, security.

Speaker 2

05:30

Yep, the sidecar pattern.

Speaker 1

05:33

How does eBPF compare? Is it a better alternative?

Speaker 2

05:36

In many ways, yes. The sidecar approach, while useful, often means modifying application YAML files and restarting pods.

Speaker 1

05:43

Right, operational friction.

Speaker 2

05:45

Exactly, and if something goes wrong, the sidecar might not even get injected, leaving you blind. Plus, sidecars can potentially be bypassed by bad actors. Imagine an attacker deploys an uninstrumented app, like a crypto miner. A sidecar-based security tool, which relies on injection, might not even see it connecting out to its mining pool.

Speaker 1

06:05

Ah, I see. And eBPF avoids that?

Speaker 2

06:08

Precisely. eBPF-based tools run on the host and can police all traffic, all processes on that machine, including everything inside containers. They're much harder to sidestep.

Speaker 1

06:17

So instant, complete visibility.

Speaker 2

06:20

Pretty much without needing app modifications or restarts. That's a game changer for infrastructure teams wanting comprehensive coverage.

Speaker 1

06:28

Okay, but these programs are running directly inside the kernel. That immediately raises safety concerns. What stops a buggy eBPF program from crashing the whole system or reading memory it shouldn't?

Speaker 2

06:40

That's the million dollar question, and the answer is the eBPF verifier.

Speaker 1

06:45

The kernel's unseen guardian, you could call it.

Speaker 2

06:47

That's a good way to put it. Before an eBPF program is ever loaded, the verifier performs an exhaustive static analysis. It checks every single possible execution path to ensure it's safe and won't harm the kernel. It's like a super strict bouncer at the kernel's door. Honestly, it's a marvel of engineering that makes this whole thing possible safely.

Speaker 1

07:08

So what kinds of checks does this bouncer perform? It must be incredibly rigorous.

Speaker 2

07:14

It really is. It tracks the state of each register, their types and value ranges, to prevent invalid operations. It enforces strict memory access, making sure programs only touch memory they're allowed to.

Speaker 1

07:24

No arbitrary reads or writes?

Speaker 2

07:27

None. It mandates checks for null pointers before they're dereferenced, preventing common crashes. Very importantly, it guarantees programs run to completion.

Speaker 1

07:36

How does it do that?

Speaker 2

07:37

It rejects programs with loops that it can't prove will terminate. There's also a maximum instruction limit, currently a million instructions for privileged users.

Speaker 1

07:47

A million? Okay, that's quite a lot, actually.

Speaker 2

07:49

It gives you scope for complex logic. The verifier also checks that helper functions called by the eBPF program are allowed for that specific program type, and it even checks for GPL-compatible licenses if the program uses certain restricted helpers.
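
To make the termination check concrete, here is a deliberately tiny Python sketch of a static check over a made-up instruction list. It is not how the real verifier works internally: the real one tracks register types and value ranges along every path, while this toy only enforces a size limit, in-bounds jumps, no backward jumps, and a final exit. The instruction format and the name `toy_verify` are assumptions for illustration.

```python
MAX_INSNS = 1_000_000  # limit quoted in the conversation for privileged users


def toy_verify(program):
    """Return (ok, reason) for a list of (opcode, operand) tuples."""
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSNS:
        return False, "too many instructions"
    for pc, (op, arg) in enumerate(program):
        if op in ("jmp", "jeq"):
            target = pc + 1 + arg  # jump offsets are relative to the next instruction
            if not 0 <= target < len(program):
                return False, f"jump out of bounds at {pc}"
            if target <= pc:
                # Conservative rule: with only forward jumps, the program must terminate.
                return False, f"backward jump (possible loop) at {pc}"
    if program[-1][0] != "exit":
        return False, "last instruction is not exit"
    return True, "ok"


good = [("mov", 1), ("jeq", 0), ("exit", None)]
looping = [("mov", 1), ("jmp", -2), ("exit", None)]
```

Here `toy_verify(good)` accepts the program, while `toy_verify(looping)` rejects it before it would ever run, which is the load-time guarantee being described.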

Speaker 1

08:04

That's incredibly thorough. It really builds confidence.

Speaker 2

08:06

It has to be. Kernel stability is paramount.

Speaker 1

08:08

Okay, beyond safety, you mentioned eBPF maps earlier as a key differentiator from the original BPF. How did maps change the game?

Speaker 2

08:17

Maps were a fundamental breakthrough. They are essentially key-value stores, data structures that act as a bridge between the kernel and user space, or even between different eBPF programs running in the kernel. They allow communication and state sharing.

Speaker 1

08:30

Which wasn't possible before.

Speaker 2

08:32

Right. Classic BPF was stateless, just processing individual packets. Maps allow eBPF programs to store state across multiple events or invocations.

Speaker 1

08:42

So you can use them for things like counters, sending metrics from the kernel up to a user space monitoring tool?

Speaker 2

08:48

Exactly. That's a common use case. Or storing configuration data pushed down from user space, or maintaining state across multiple program executions, like global variables.
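
The counter use case can be pictured with a small Python stand-in. Nothing here touches the kernel: a dictionary plays the role of a BPF hash map, one function plays the in-kernel program that runs per event, and another plays the user space tool reading the totals. All names are hypothetical.

```python
from collections import defaultdict

# Stand-in for a BPF hash map: key = process ID, value = event count.
syscall_counts = defaultdict(int)


def on_syscall_enter(pid: int) -> None:
    """Plays the role of the in-kernel eBPF program, run once per event."""
    syscall_counts[pid] += 1


def read_metrics() -> dict:
    """Plays the role of the user space tool polling the map."""
    return dict(syscall_counts)


# Three events arrive: two from PID 101, one from PID 202.
for pid in (101, 101, 202):
    on_syscall_enter(pid)
```

The point of the design is that state survives between invocations of the event handler, and that a separate reader can see it without the handler having to push anything.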

Speaker 1

08:58

And you mentioned tail calls too. What are those?

Speaker 2

09:00

Tail calls allow one eBPF program to effectively jump to another eBPF program, replacing itself. It's a way to chain programs together.

Speaker 1

09:08

Ah okay, so you can build more complex logic flows.

Speaker 2

09:12

Precisely. You can chain up to 33 tail calls currently. Combined with that million-instruction limit per program, it gives you a lot of flexibility for sophisticated in-kernel logic.
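
Tail-call chaining with a hard cap can be modeled as follows. This is a simplified Python sketch, assuming each "program" returns the index of the next program to jump to, or None to finish; real eBPF tail calls go through a program array map and behave differently when the limit is hit. The names `run_chain`, `parse`, `classify`, and `account` are made up.

```python
MAX_TAIL_CALLS = 33  # chain limit quoted in the conversation


def run_chain(prog_array, start, ctx):
    """Run programs by index; each returns the next index or None to stop."""
    index, tail_calls = start, 0
    while True:
        next_index = prog_array[index](ctx)
        if next_index is None:
            return ctx
        if tail_calls == MAX_TAIL_CALLS:
            # Simplification: record that the cap was reached and stop.
            ctx.append("tail call limit hit")
            return ctx
        tail_calls += 1
        index = next_index


def parse(ctx):
    ctx.append("parse")
    return 1  # jump to classify


def classify(ctx):
    ctx.append("classify")
    return 2  # jump to account


def account(ctx):
    ctx.append("account")
    return None  # end of chain


trace = run_chain([parse, classify, account], 0, [])
```

Splitting logic into stages like this is what lets each individual program stay small while the overall flow grows more complex.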

Speaker 1

09:21

So maps and tail calls really elevated eBPF from just a filter to a proper programmable platform.

Speaker 2

09:27

Absolutely, they are key to building the sophisticated eBPF-based applications we see today.

Speaker 1

09:32

Now, if you're thinking about building with eBPF, what are the language choices for the kernel side code itself? I assume it's pretty low level.

Speaker 2

09:40

Yes. For the code running in the kernel, most eBPF programs are written in C or Rust, because their compilers, like Clang/LLVM for C, can directly target the eBPF bytecode instruction set. They give you that necessary low-level control.

Speaker 1

09:56

Makes sense. And what about the user space side, the applications that load and manage these kernel programs?

Speaker 2

10:01

There you have much more flexibility. User space applications can be written in various languages using specific libraries to interact with the kernel's eBPF capabilities.

Speaker 1

10:10

When eBPF was first taking off, I remember BCC being very popular.

Speaker 2

10:14

Yes, BCC, the BPF Compiler Collection. It supports Python, Lua, and C++. It was fantastic for learning and experimentation.

Speaker 1

10:22

Why was it good for learning?

Speaker 2

10:24

It abstracted away a lot of the low-level complexity, making eBPF accessible to people who weren't necessarily deep kernel hackers. You could write Python scripts to load and interact with C eBPF code.

Speaker 1

10:34

But it had drawbacks for production?

Speaker 2

10:36

It did. Its main issue was that it compiled the eBPF C code at runtime on the target machine.

Speaker 1

10:41

Oh right, so you needed the compiler toolchain, LLVM, Clang, and kernel headers, on every single server?

Speaker 2

10:48

Exactly. That meant bigger dependencies, potential installation headaches, and noticeable startup delays while the code compiled, plus higher resource consumption. Not ideal for large-scale production deployments.

Speaker 1

10:59

So there was a need for something more robust, more portable for enterprise use. That leads us to CO-RE, right? Compile once, run everywhere.

Speaker 2

11:07

Yes. CO-RE is the modern approach and a huge leap forward. It's really what made eBPF practical for widespread production use.

Speaker 1

11:16

How does it achieve that run-everywhere magic? It sounds tricky, with different kernel versions having slightly different data structures.

Speaker 2

11:23

It is clever. The key is BTF, BPF Type Format. During compilation, detailed type information about the kernel structures the eBPF program uses is embedded into the compiled object file itself. Then when you load that program on a target machine, a user space library uses that BTF data to understand the specific kernel version running there. It can then automatically adjust memory offsets and structure accesses on the fly.

Speaker 1

11:45

So it adapts the precompiled code at load time.

Speaker 2

11:47

Essentially, yes. It relocates the memory accesses based on the actual kernel's layout. This means you can compile your eBPF program once on your build machine and then deploy that same binary across a whole fleet running different kernel versions, within limits, of course.

Speaker 1

12:03

That's a massive operational win. No more compiling on every node.

Speaker 2

12:07

Huge. It makes deployment much, much simpler.
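
The load-time relocation idea can be illustrated with a toy model. In this Python sketch, the two layout tables are invented stand-ins for the type information of two kernel versions (real BTF is far richer), and `relocate` is a hypothetical function, not part of any library. The key point it tries to show is that the compiled program records which field it wants, and the byte offset is resolved per target.

```python
# Hypothetical layouts of the same struct on two kernel versions:
# field name -> byte offset.
BTF_KERNEL_A = {"task": {"pid": 0, "comm": 8}}
BTF_KERNEL_B = {"task": {"flags": 0, "pid": 8, "comm": 16}}

# The compiled program records *which field* it reads, not a fixed offset.
PROGRAM_RELOCATIONS = [("task", "pid"), ("task", "comm")]


def relocate(relocations, target_btf):
    """Resolve each recorded field access against the target kernel's layout."""
    resolved = {}
    for struct_name, field in relocations:
        layout = target_btf[struct_name]
        if field not in layout:
            raise KeyError(f"{struct_name}.{field} missing on target kernel")
        resolved[(struct_name, field)] = layout[field]
    return resolved


offsets_a = relocate(PROGRAM_RELOCATIONS, BTF_KERNEL_A)
offsets_b = relocate(PROGRAM_RELOCATIONS, BTF_KERNEL_B)
```

The same recorded accesses resolve to different offsets on the two layouts, so one build can serve both. The KeyError branch hints at the "within limits" caveat: if a field does not exist on the target, there is nothing to relocate to.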

Speaker 1

12:09

And the core C library that enables CO-RE is libbpf.

Speaker 2

12:14

That's right. libbpf is maintained alongside the kernel source tree, and it's the reference implementation. It uses auto-generated BPF skeletons, C headers that make loading and managing the eBPF programs and maps much cleaner from user space.

Speaker 1

12:28

And the benefits over BCC are?

Speaker 2

12:30

Lower memory footprint, no runtime compilation dependency, no startup delay, and that crucial CO-RE portability. If you're building a serious eBPF application today, especially in C, libbpf is the way to go.

Speaker 1

12:42

But it's not just C anymore, is it? The ecosystem seems to be growing.

Speaker 2

12:46

Definitely. The community has built excellent libraries for other popular languages. For Go developers, there's cilium/ebpf and also libbpfgo, which provides Go bindings for libbpf.

Speaker 1

12:57

And for Rust? Rust seems like a natural fit for systems programming.

Speaker 2

13:01

It does, and there are several options there too, like libbpf-rs, RedBPF, and Aya. Each has slightly different approaches and trade-offs, but they all bring eBPF development to Rust programmers, leveraging Rust's safety features.

Speaker 1

13:15

So lots of options depending on your team's preferred language.

Speaker 2

13:19

Exactly. It's much more accessible now than it was just a few years ago.

Speaker 1

13:21

Okay, so we have this powerful, safe, portable technology. How is it actually being used? Let's talk real-world applications. Observability seems like a prime candidate.

Speaker 2

13:32

Absolutely. Because eBPF programs can attach to so many points, syscalls, kernel tracepoints, kernel functions via kprobes, even user space functions via uprobes, they provide incredibly powerful tracing capabilities.

Speaker 1

13:43

Deeper than traditional monitoring?

Speaker 2

13:45

Much deeper, and often with less overhead. You get unparalleled insight into system behavior, network, disk I/O, function calls within applications, all without modifying or restarting the application itself. It's like having X-ray vision into your running systems.

Speaker 1

14:01

And in networking. You mentioned BPF's origins there, but eBPF takes it much further, especially with XDP.

Speaker 2

14:07

Yes, XDP, the eXpress Data Path. This is really powerful. XDP allows eBPF programs to run at the earliest possible point in the network stack, right in the driver, even before the kernel allocates memory for the packet, the skb.

Speaker 1

14:22

So super fast packet processing?

Speaker 2

14:26

Extremely fast. This lets you do things like high-performance firewalling, dropping malicious packets for DDoS mitigation, or implementing custom load balancing, all at line rate in many cases.
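
The drop-early firewalling idea boils down to a function that looks at raw frame bytes and returns a verdict. The Python sketch below simulates that decision only; a real XDP program is compiled to eBPF and runs in the driver. The verdict numbers match the kernel's `enum xdp_action` (drop is 1, pass is 2), while the blocklist, the `build_ipv4_frame` helper, and the function names are hypothetical.

```python
import ipaddress
import struct

XDP_DROP, XDP_PASS = 1, 2  # verdict values from the kernel's enum xdp_action
ETHERTYPE_IPV4 = 0x0800
BLOCKED_SOURCES = {"203.0.113.7"}  # hypothetical blocklist (documentation address range)


def xdp_verdict(frame: bytes) -> int:
    """Decide what to do with a raw Ethernet frame."""
    if len(frame) < 14 + 20:  # Ethernet header plus minimal IPv4 header
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return XDP_PASS
    # IPv4 source address: 12 bytes into the IP header, which starts at byte 14.
    src = str(ipaddress.IPv4Address(frame[26:30]))
    return XDP_DROP if src in BLOCKED_SOURCES else XDP_PASS


def build_ipv4_frame(src_ip: str) -> bytes:
    """Hypothetical helper: zeroed frame with only EtherType and source IP set."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip_header = bytes(12) + ipaddress.IPv4Address(src_ip).packed + bytes(4)
    return eth + ip_header
```

Because the verdict is reached from a handful of byte comparisons, the packet can be discarded before any further per-packet work is done, which is where the performance claim comes from.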

Speaker 1

14:34

Before the packet even really hits the main kernel networking stack?

Speaker 2

14:38

Exactly. You can make decisions and potentially drop packets with minimal overhead. Some network cards even support offloading XDP programs directly onto the NIC hardware itself.

Speaker 1

14:46

Wow, hardware offload. That's serious performance.

Speaker 2

14:49

It is. It allows you to handle enormous amounts of traffic. Think about mitigating those packet-of-death vulnerabilities. XDP can drop the malicious packet before it can even cause harm in the network stack.

Speaker 1

15:01

And this has big implications for Kubernetes networking too, doesn't it? Replacing things like iptables.

Speaker 2

15:05

Fundamentally changing it, yes. Traditional Kubernetes networking often relies heavily on iptables for implementing network policies and services, and conntrack for connection tracking.

Speaker 1

15:15

And those have scaling issues, right?

Speaker 2

15:18

They do. iptables rules are processed linearly, so performance degrades as the number of rules or services grows. That's O(n) complexity. Conntrack can also become a bottleneck.

Speaker 1

15:28

How does eBPF help?

Speaker 2

15:30

eBPF-based networking solutions like Cilium replace iptables rules and conntrack with eBPF programs and maps. They use efficient hash maps for lookups.

Speaker 1

15:39

Hash maps, so the lookups are faster?

Speaker 2

15:40

Much faster. O(1), or constant-time, lookups for network policies, service load balancing, and connection tracking, regardless of the number of rules or connections. This is critical for performance and scalability in large Kubernetes clusters.
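
The difference between walking a rule list and doing a keyed lookup is easy to see in miniature. This Python sketch uses an invented rule set of 1,000 addresses and a default-deny fallback chosen purely for illustration; it models the lookup strategies, not iptables or Cilium themselves.

```python
# 1,000 hypothetical rules, one per address, in list order.
RULES = [(f"10.0.{i // 256}.{i % 256}", "ALLOW") for i in range(1000)]

# The same rules keyed by address, as a hash map would store them.
RULE_MAP = dict(RULES)


def linear_lookup(rules, ip):
    """Scan rules in order; return (verdict, number of comparisons made)."""
    for comparisons, (rule_ip, verdict) in enumerate(rules, start=1):
        if rule_ip == ip:
            return verdict, comparisons
    return "DROP", len(rules)  # assumed default-deny


def hash_lookup(rule_map, ip):
    """Single keyed lookup; cost does not grow with the number of rules."""
    return rule_map.get(ip, "DROP")


last_ip = RULES[-1][0]
verdict, comparisons = linear_lookup(RULES, last_ip)
```

For the last rule in the list, the linear scan performs as many comparisons as there are rules, while `hash_lookup(RULE_MAP, last_ip)` reaches the same verdict in one step. Doubling the rule count doubles the worst case for the scan and leaves the keyed lookup essentially unchanged.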

Speaker 1

15:53

That's a significant architectural shift. Now here's a capability that sounds almost like magic: transparent encryption insight. How can eBPF see inside encrypted traffic without keys?

Speaker 2

16:06

It's a clever trick. It doesn't actually break the encryption itself. Instead, eBPF uses uprobes to hook into user space SSL/TLS libraries like OpenSSL or other TLS libraries.

Speaker 1

16:16

So it attaches to the application's encryption functions?

Speaker 2

16:19

Precisely. It attaches to functions like SSL_read and SSL_write. It can then intercept the data after the library has decrypted it, for SSL_read, or before the library encrypts it, for SSL_write.

Speaker 1

16:30

Ah, so it catches the plaintext data inside the application's memory, just before encryption or just after decryption?

Speaker 2

16:37

Exactly. You get visibility into the cleartext data flow without needing access to the private keys or certificates. It's incredibly powerful for security monitoring, troubleshooting, and even compliance, without disrupting the encryption itself.
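
The hook-at-function-entry trick can be mimicked in pure Python. In this sketch, `fake_ssl_write` is a made-up stand-in for a TLS library's write function (the XOR is a placeholder and is not encryption), and `attach_entry_probe` is a hypothetical wrapper that plays the role of a uprobe by observing the arguments before the original function runs.

```python
captured = []  # what the observer sees


def fake_ssl_write(plaintext: bytes) -> bytes:
    """Stand-in for a TLS write call. XOR is a placeholder, NOT real encryption."""
    return bytes(b ^ 0x5A for b in plaintext)


def attach_entry_probe(func, handler):
    """Wrap func so handler sees its arguments before func runs."""
    def probed(*args, **kwargs):
        handler(*args, **kwargs)      # observer sees the plaintext argument
        return func(*args, **kwargs)  # original behavior is unchanged
    return probed


fake_ssl_write = attach_entry_probe(fake_ssl_write, captured.append)
wire_bytes = fake_ssl_write(b"GET /secret")
```

The observer ends up with the cleartext in `captured` while `wire_bytes` still holds the transformed output, so nothing about the protection on the wire has been weakened and no key was needed.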

Speaker 1

16:49

That's fascinating. Okay, let's shift fully to security. eBPF seems poised to move security from just detection to actual prevention.

Speaker 2

16:57

That's a key trend. For years, seccomp-bpf provided a basic layer by filtering syscalls, which is used heavily in Docker and Kubernetes.

Speaker 1

17:06

Right, limiting what a container can do.

Speaker 2

17:07

Yes, and some eBPF tools like Inspektor Gadget can even help generate those seccomp profiles by observing what syscalls an application legitimately uses.

Speaker 1

17:17

But generating perfect profiles is hard, right? Capturing all the error paths and edge cases.

Speaker 2

17:22

It's very challenging. And traditional detection often suffers from that TOCTOU problem, time of check to time of use, the race condition. By the time a separate security agent detects a malicious syscall or file access and decides to act, the damage might already be done. The malicious operation might have completed.

Speaker 1

17:39

So how does eBPF enable prevention closing that time gap?

Speaker 2

17:43

This is where a specific eBPF program type comes in: BPF_PROG_TYPE_LSM. These programs attach to hooks within the Linux Security Module framework.

Speaker 1

17:53

LSM, like AppArmor or SELinux use?

Speaker 2

17:55

Exactly the same framework. These eBPF LSM programs can perform checks directly inside the kernel before a sensitive operation happens, and crucially, they can return an error code.

Speaker 1

18:05

And that error code?

Speaker 2

18:07

It prevents the operation from proceeding. The kernel respects the error code returned by the LSM hook, stopping the action before it even starts. It happens synchronously within the context of the original operation.
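
The synchronous check-then-act flow can be sketched like this. Everything here is a Python simulation under stated assumptions: `kernel_open` is a made-up stand-in for the kernel's file-open path, `lsm_file_open_hook` plays the attached policy program, and the protected-path policy is invented. The one real detail borrowed is the convention of returning zero to allow and a negative errno value to deny.

```python
import errno

PROTECTED_PATHS = {"/etc/shadow"}  # hypothetical policy
opened = []  # records operations that actually went ahead


def lsm_file_open_hook(path: str, uid: int) -> int:
    """Policy check: 0 allows the operation, a negative errno denies it."""
    if path in PROTECTED_PATHS and uid != 0:
        return -errno.EPERM
    return 0


def kernel_open(path: str, uid: int) -> int:
    """Stand-in for the kernel code path that consults the hook first."""
    rc = lsm_file_open_hook(path, uid)
    if rc != 0:
        return rc  # the open never happens
    opened.append(path)
    return 0


denied = kernel_open("/etc/shadow", uid=1000)
allowed = kernel_open("/tmp/notes.txt", uid=1000)
```

Because the check sits inside the same call that would perform the operation, there is no window between deciding and acting, which is what separates this from an agent that notices an event afterwards.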

Speaker 1

18:17

So it's in kernel enforcement, not just after the fact detection.

Speaker 2

18:21

Precisely, it's a fundamental shift from reactive detection to proactive synchronous prevention.

Speaker 1

18:25

And tools like Cilium Tetragon build on this.

Speaker 2

18:28

They do. Tetragon uses eBPF, attaching to kprobes and tracepoints, not just LSM hooks, to get rich context about security events. It can filter these events efficiently in the kernel, reducing overhead.

Speaker 1

18:41

And can it actively stop threats?

Speaker 2

18:44

Yes. Based on its policies, it can use helpers like bpf_send_signal() to, for example, send a SIGKILL signal to terminate a malicious process immediately, synchronously, when a violation is detected.

Speaker 1

18:55

So detect and terminate right there in the kernel context.

Speaker 2

18:58

Exactly. It moves security beyond just observing and alerting to actively enforcing policy and preventing harm in real time.

Speaker 1

19:05

So to kind of wrap this deep dive up, eBPF has really come a long way from just filtering packets.

Speaker 2

19:09

An incredible journey.

Speaker 1

19:11

It's evolved into this powerful, versatile platform that's genuinely transforming observability, networking, and security, giving us unprecedented control and insight right at the kernel level.

Speaker 2

19:23

It really feels like we're still just scratching the surface of what's possible with it. The innovation is happening incredibly fast.

Speaker 1

19:30

And the future looks bright. What's next?

Speaker 2

19:32

Well, the eBPF Foundation, which includes major players like Google, Meta, Microsoft, Netflix, and Isovalent, is working on standardizing eBPF.

Speaker 1

19:42

Standardization is always good.

Speaker 2

19:44

Yeah, and even exploring its adoption for Windows, which could potentially allow eBPF programs written for Linux to run on Windows too.

Speaker 1

19:51

Wow. Cross platform eBPF.

Speaker 2

19:53

That's the long-term vision for some. It's a rapidly evolving field. New kernel features are landing all the time, though it's worth remembering there's always a lag before the very latest features make it into the stable Linux distributions you might be running.

Speaker 1

20:06

Sure, the usual adoption cycle.

Speaker 2

20:08

Right, but the pace of development is really impressive.

Speaker 1

20:11

It certainly seems like it's going to shape how many of our tools work now and in the future. So maybe a final thought for our listeners. eBPF seems to be shifting from this cutting-edge niche thing to a fundamental platform for infrastructure tooling.

Speaker 2

20:27

Seems that way.

Speaker 1

20:29

So will understanding how this magic works become as essential for developers and operators today as, say, understanding networking protocols or distributed systems?

Speaker 2

20:38

That's a really interesting question to ponder. The tools built on eBPF are definitely becoming ubiquitous, things like Cilium, Falco, Pixie, Tetragon.

Speaker 1

20:46

You'll encounter them more and more.

Speaker 2

20:47

You will, and having even that foundational understanding of what's happening under the hood how eBPF enables them, it definitely gives you a significant advantage in using them effectively, troubleshooting them, and just understanding your systems better.

Speaker 1

21:01

Well, hopefully this deep dive gave you a good starting point. We encourage you to think about how these eBPF-based tools might shape your work and maybe explore some of the projects or Liz Rice's book if you want to go deeper.

Speaker 2

21:13

It's a fascinating space to watch.

Speaker 1

21:15

Absolutely. Thanks for joining us for this deep dive.

Transcript source: Provided by creator in RSS feed
