Tuxscope Lab 1: Hello eBPF · Steven Foerster

This is the first lab in the Tuxscope series, a set of tutorials that build a Linux kernel observability toolkit powered by eBPF and Rust. By the end of the series you will have a single binary that can trace syscalls, observe file I/O, and monitor network connections, all without modifying the kernel or loading out-of-tree modules.

In this lab you will write and run the simplest possible eBPF program: one that attaches to the write syscall tracepoint and sends a small event to userspace every time any process calls write(). The goal is not to build something production-useful yet; it is to understand the full pipeline from kernel-space probe to userspace consumer.

The complete source code is at gitlab.com/sfoerster/tuxscope.

Note

Prerequisites: You need a Linux system running kernel 5.8 or later, root access (or sudo), and the tuxscope binary built from source. See the repository README for build instructions. The labs are developed and tested on x86_64; ARM64 should work but is not regularly tested.

What is eBPF?

eBPF (extended Berkeley Packet Filter) is a technology that lets you run sandboxed programs inside the Linux kernel without changing kernel source code or loading kernel modules. Originally designed for packet filtering, it has evolved into a general-purpose in-kernel virtual machine used for tracing, security, and networking.

The key properties that make eBPF useful for observability:

Safe: programs are verified before loading. The kernel rejects anything that could crash, loop forever, or access invalid memory.
Fast: programs run in kernel context with low overhead. Handling an event needs no context switch and no copying of data through /proc.
Dynamic: attach and detach probes at runtime. No reboot, no recompile.

An eBPF observability tool follows this pipeline:

 Kernel Space                          User Space
┌──────────────────────┐              ┌──────────────────────┐
│                      │              │                      │
│  Event fires         │              │  Poll buffer         │
│  (syscall, kprobe)   │              │  (perf/ring)         │
│         │            │              │         │            │
│         v            │              │         v            │
│  eBPF program runs   │              │  Deserialize event   │
│  - read context      │              │  - format output     │
│  - build event       │              │  - filter/aggregate  │
│  - push to buffer ───┼──────────────┼→ - display           │
│                      │              │                      │
└──────────────────────┘              └──────────────────────┘

The kernel-side program fires on an event (a tracepoint, a kprobe, etc.), reads context about what happened, constructs a small event struct, and pushes it into a shared buffer. The userspace program polls that buffer, deserializes the events, and does something useful with them.
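As a rough userspace analogy for this pipeline, the sketch below models the shared buffer with a bounded standard-library channel. Nothing here is eBPF or Aya code: `Event`, `push_event`, and `drain` are hypothetical names chosen for illustration, and the "drop when full" behaviour is a simplified stand-in for what happens when a kernel buffer fills faster than userspace reads it.

```rust
use std::sync::mpsc::{Receiver, SyncSender, TrySendError};

/// Stand-in for the small event struct a probe would build (hypothetical fields).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Event {
    pub pid: u32,
    pub timestamp_ns: u64,
}

/// "Kernel side": build an event and push it without blocking.
/// Returns false if the buffer was full (or closed) and the event was dropped.
pub fn push_event(tx: &SyncSender<Event>, pid: u32, timestamp_ns: u64) -> bool {
    match tx.try_send(Event { pid, timestamp_ns }) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}

/// "User side": drain everything currently waiting in the buffer, in order.
pub fn drain(rx: &Receiver<Event>) -> Vec<Event> {
    rx.try_iter().collect()
}
```

The producer never blocks: if the consumer falls behind, events are lost rather than the producer being stalled. That is the property worth keeping in mind when reading the real buffer code later in this lab.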

Why Rust for eBPF?

Most eBPF tooling uses C for the kernel-side programs and Python or Go for userspace. Tuxscope uses Rust for both sides via the Aya framework. The advantages:

Single language for kernel and userspace code. The event struct is defined once and shared.
No libbpf dependency. Aya compiles eBPF programs to BPF bytecode using the Rust toolchain and loads them directly.
Type safety catches mistakes at compile time that would be runtime bugs in C.

Aya is not the only Rust eBPF framework (libbpf-rs wraps libbpf in Rust), but it is the most ergonomic for writing both sides in Rust.

The tuxscope architecture

Tuxscope is a Cargo workspace with three crates:

tuxscope/
├── tuxscope-common/     # Shared types (event structs, constants)
│   └── src/lib.rs
├── tuxscope-ebpf/       # eBPF programs (compiled to BPF bytecode)
│   └── src/
│       ├── hello.rs
│       ├── syscall.rs
│       ├── fileio.rs
│       └── net.rs
├── tuxscope/            # Userspace CLI (loads probes, reads events)
│   └── src/
│       ├── main.rs
│       └── ...
├── Cargo.toml
└── xtask/               # Build helper for cross-compiling eBPF

tuxscope-common defines the event structs shared between kernel and userspace. These structs are #[repr(C)] so their memory layout is identical on both sides.
tuxscope-ebpf contains the eBPF programs. Each file is a separate BPF program that gets compiled to BPF bytecode targeting bpfel-unknown-none (BPF little-endian, no OS).
tuxscope is the userspace binary. It loads the compiled BPF programs into the kernel, attaches them to the right hooks, reads events from shared buffers, and formats output.

The xtask crate handles the cross-compilation step: eBPF programs must be compiled for the BPF target, not the host architecture.

The event struct

Every lab starts here. The event struct defines what data flows from kernel to userspace. For the hello lab, it is minimal:

// tuxscope-common/src/lib.rs

#[repr(C)]
#[derive(Clone, Copy)]
pub struct HelloEvent {
    pub pid: u32,
    pub timestamp_ns: u64,
    pub comm: [u8; 16],
}

pid: the process ID that triggered the write syscall.
timestamp_ns: nanosecond timestamp from bpf_ktime_get_ns(). This is time since boot, not wall-clock time.
comm: the 16-byte process name (what shows up in ps). The kernel truncates names longer than 15 characters, leaving room for a terminating NUL byte.

The #[repr(C)] attribute is critical. Without it, Rust is free to reorder and pad fields. Since the eBPF program writes this struct in kernel space and the userspace program reads it, the layout must be identical on both sides.
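To make the layout concrete, here is a small host-side check, written as a sketch rather than as part of the tuxscope source. It redefines the struct locally so it compiles on its own; `field_offsets` is a hypothetical helper. On common 64-bit targets, `#[repr(C)]` places 4 bytes of padding after `pid` so that `timestamp_ns` is 8-byte aligned, giving a 32-byte struct.

```rust
/// Local copy of the lab's event struct so this sketch is self-contained.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct HelloEvent {
    pub pid: u32,
    pub timestamp_ns: u64,
    pub comm: [u8; 16],
}

/// Byte offsets of (pid, timestamp_ns, comm) from the start of the struct.
pub fn field_offsets() -> (usize, usize, usize) {
    let ev = HelloEvent { pid: 0, timestamp_ns: 0, comm: [0; 16] };
    let base = &ev as *const HelloEvent as usize;
    (
        std::ptr::addr_of!(ev.pid) as usize - base,
        std::ptr::addr_of!(ev.timestamp_ns) as usize - base,
        std::ptr::addr_of!(ev.comm) as usize - base,
    )
}

/// Total size in bytes, including padding.
pub fn event_size() -> usize {
    std::mem::size_of::<HelloEvent>()
}
```

A check like this is useful as a unit test in the shared crate: if someone adds or reorders a field, the expected offsets change and the test fails before a mismatched layout reaches the kernel side.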

The eBPF program

The kernel-side program attaches to the syscalls/sys_enter_write tracepoint. Every time any process calls write(), this code runs:

// tuxscope-ebpf/src/hello.rs

use aya_ebpf::{
    macros::{map, tracepoint},
    maps::PerfEventArray,
    programs::TracePointContext,
    helpers::{bpf_get_current_pid_tgid, bpf_ktime_get_ns, bpf_get_current_comm},
};
use tuxscope_common::HelloEvent;

#[map]
static EVENTS: PerfEventArray<HelloEvent> = PerfEventArray::new(0);

#[tracepoint]
pub fn hello(ctx: TracePointContext) -> u32 {
    match try_hello(&ctx) {
        Ok(()) => 0,
        Err(_) => 1,
    }
}

fn try_hello(ctx: &TracePointContext) -> Result<(), i64> {
    let pid = (bpf_get_current_pid_tgid() >> 32) as u32;
    let timestamp_ns = unsafe { bpf_ktime_get_ns() };
    let comm = bpf_get_current_comm().map_err(|e| e as i64)?;

    let event = HelloEvent {
        pid,
        timestamp_ns,
        comm,
    };

    EVENTS.output(ctx, &event, 0);
    Ok(())
}

The key pieces:

#[map] declares a BPF map: a data structure shared between kernel and userspace. PerfEventArray is a per-CPU ring buffer optimized for streaming events.
#[tracepoint] marks the function as a BPF program that attaches to a kernel tracepoint.
bpf_get_current_pid_tgid() returns a 64-bit value: the thread group ID (PID) in the upper 32 bits and the thread ID (TID) in the lower 32 bits. We shift right to get the PID.
bpf_get_current_comm() fills a 16-byte array with the current process name.
EVENTS.output() pushes the event into the PerfEventArray for userspace to consume.
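The bit manipulation on the bpf_get_current_pid_tgid() value can be tried out in plain Rust. The function below is a hypothetical helper, not part of Aya; it only illustrates how the 64-bit value splits into its two halves.

```rust
/// Split the combined 64-bit value into (tgid, tid).
/// The upper 32 bits are the thread group ID (what userspace calls the PID);
/// the lower 32 bits are the thread ID.
pub fn split_pid_tgid(pid_tgid: u64) -> (u32, u32) {
    let tgid = (pid_tgid >> 32) as u32;
    let tid = pid_tgid as u32; // truncating cast keeps the low 32 bits
    (tgid, tid)
}
```

For a single-threaded process the two halves are equal. For a multi-threaded process every thread shares the upper half, which is why the lab shifts right by 32 to report the PID that tools like ps show.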

Note

Why PerfEventArray? This lab uses PerfEventArray because it is the simplest buffer type. Later labs switch to RingBuf, which has better ordering guarantees and simpler buffer management. Starting with PerfEventArray lets you see both approaches.

The userspace handler

The userspace side loads the BPF program, attaches it to the tracepoint, and polls for events:

// tuxscope/src/main.rs (simplified)

use aya::programs::TracePoint;
use aya::maps::perf::AsyncPerfEventArray;
use bytes::BytesMut;
use tuxscope_common::HelloEvent;

// Load the compiled eBPF object file
let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(
    concat!(env!("OUT_DIR"), "/hello")
))?;

// Attach to the tracepoint
let program: &mut TracePoint = bpf.program_mut("hello").unwrap().try_into()?;
program.load()?;
program.attach("syscalls", "sys_enter_write")?;

// Open the perf event array
let mut perf_array = AsyncPerfEventArray::try_from(bpf.take_map("EVENTS").unwrap())?;

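// Read events. In Aya, read_events is called on a per-CPU buffer obtained
// from perf_array.open(cpu_id, None); that per-CPU setup is elided here.
let mut buf = [BytesMut::with_capacity(4096)];
loop {
    let events = perf_array.read_events(&mut buf).await?;
    for i in 0..events.read {
        // The byte buffer carries no alignment guarantee for HelloEvent,
        // so copy the struct out with an unaligned read.
        let event: HelloEvent =
            unsafe { std::ptr::read_unaligned(buf[i].as_ptr() as *const HelloEvent) };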
        let comm = core::str::from_utf8(&event.comm)
            .unwrap_or("<invalid>")
            .trim_end_matches('\0');
        println!("{:<8} {:<16} {}", event.pid, comm, event.timestamp_ns);
    }
}

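The flow is: load the BPF object file (compiled by xtask), find the program by name, attach it to syscalls/sys_enter_write, open the map by name, and loop reading events. The unsafe block is necessary because we are interpreting raw bytes as a struct. The #[repr(C)] layout ensures the bytes line up with the struct's fields, but the read is only sound if the buffer holds at least size_of::<HelloEvent>() bytes and the pointer is either suitably aligned or read with an unaligned copy.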
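The decoding step can be isolated from Aya and tested on its own. The sketch below assumes the same struct layout as the lab; `decode` and `comm_str` are hypothetical helper names, and the struct is redefined locally so the block is self-contained.

```rust
use std::mem::size_of;

/// Local copy of the lab's event struct so this sketch is self-contained.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct HelloEvent {
    pub pid: u32,
    pub timestamp_ns: u64,
    pub comm: [u8; 16],
}

/// Interpret the start of `bytes` as a HelloEvent, or return None if too short.
pub fn decode(bytes: &[u8]) -> Option<HelloEvent> {
    if bytes.len() < size_of::<HelloEvent>() {
        return None;
    }
    // SAFETY: the length was checked above, HelloEvent is repr(C) and made of
    // integer fields for which every bit pattern is valid, and read_unaligned
    // places no alignment requirement on the source pointer.
    Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const HelloEvent) })
}

/// Convert the NUL-padded comm field to a &str, stopping at the first NUL.
pub fn comm_str(comm: &[u8; 16]) -> &str {
    let len = comm.iter().position(|&b| b == 0).unwrap_or(comm.len());
    std::str::from_utf8(&comm[..len]).unwrap_or("<invalid>")
}
```

Cutting the name at the first NUL, rather than trimming trailing NULs after UTF-8 validation, also handles the case where bytes after the terminator are not valid UTF-8.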

Note

Simplified code: The snippet above omits per-CPU buffer setup and async runtime details for clarity. See the full source in the repository for the compilable version.

Running it

Build and run with:

cargo xtask build
sudo tuxscope hello

Or during development:

cargo xtask run -- hello

You will see a stream of events as processes write to file descriptors:

PID      COMM             TIMESTAMP_NS
1842     bash             9823741029384
1842     bash             9823741031205
3217     sshd             9823741045891
1        systemd          9823741098234
4501     Xwayland         9823741102847
1842     bash             9823741156023
3217     sshd             9823741198412

The output is noisy because every write() call on the system fires the tracepoint. A busy desktop will produce thousands of events per second. Filter to a specific process with --pid:

sudo tuxscope hello --pid 1842

PID      COMM             TIMESTAMP_NS
1842     bash             9823741029384
1842     bash             9823741031205
1842     bash             9823741156023

For machine-readable output, use JSON format:

sudo tuxscope hello --pid 1842 --format json

{"pid":1842,"comm":"bash","timestamp_ns":9823741029384}
{"pid":1842,"comm":"bash","timestamp_ns":9823741031205}
{"pid":1842,"comm":"bash","timestamp_ns":9823741156023}

Warning

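Event volume: The hello probe fires on every write() across the entire system, and without a PID filter every one of those events is printed. On a busy machine this can be tens of thousands of events per second. Always filter by PID when you do not need system-wide visibility, or redirect the output to a file. Note that --pid filters in userspace: it reduces the output, not the number of times the probe runs.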
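The lab does not show how --pid and --format are implemented, so the following is only a sketch of one way to do it, assuming the filter is applied in userspace just before printing. `Format`, `render`, and `json_escape` are hypothetical names; the output strings mirror the table and JSON examples in this section.

```rust
/// Output format selected on the command line (hypothetical type).
pub enum Format {
    Table,
    Json,
}

/// Escape the characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one event, or return None if it does not pass the PID filter.
pub fn render(
    pid: u32,
    comm: &str,
    timestamp_ns: u64,
    filter: Option<u32>,
    format: &Format,
) -> Option<String> {
    if let Some(want) = filter {
        if want != pid {
            return None; // filtered out in userspace; the probe still ran
        }
    }
    Some(match format {
        Format::Table => format!("{:<8} {:<16} {}", pid, comm, timestamp_ns),
        Format::Json => format!(
            "{{\"pid\":{},\"comm\":\"{}\",\"timestamp_ns\":{}}}",
            pid,
            json_escape(comm),
            timestamp_ns
        ),
    })
}
```

Keeping filtering and formatting in one pure function makes both easy to unit test without loading a probe. A real CLI would more likely use a JSON library than hand-written escaping.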

What just happened

Step through what happened when you ran sudo tuxscope hello:

The tuxscope binary loaded the compiled BPF bytecode into the kernel via the bpf() syscall.
The kernel verifier checked the program: no out-of-bounds access, no infinite loops, no illegal instructions.
The program was JIT-compiled to native machine code (where the BPF JIT is enabled) and attached to the syscalls/sys_enter_write tracepoint.
Every time any process called write(), the kernel ran our BPF function, which captured the PID, timestamp, and comm name, then pushed the event into the PerfEventArray.
The userspace tuxscope process polled the PerfEventArray, deserialized each event, and printed it.

When you hit Ctrl-C, tuxscope detaches the program and the tracepoint goes back to normal. No kernel modification persists.

Exercises

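Modify the event struct to include the file descriptor number being written to. The sys_enter_write tracepoint provides the fd as its first argument. Read it with ctx.read_at::<i64>(16), which is an unsafe call; the fd sits at byte offset 16 of the tracepoint record. You can confirm the offsets in the tracepoint's format file under tracefs, typically /sys/kernel/tracing/events/syscalls/sys_enter_write/format. Update both the eBPF program and the userspace formatter.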
Attach to a different tracepoint. Try syscalls/sys_enter_read instead of sys_enter_write. What changes in the output? Which processes read more than they write?
Add a simple filter in the eBPF program. Instead of filtering in userspace with --pid, add an early return in the BPF program that skips events where the PID does not match a value stored in a BPF array map. This is how production tools minimize overhead: they filter in kernel space, not userspace.
Count events instead of streaming them. Replace the PerfEventArray with a BPF HashMap that maps PID to a count. In the eBPF program, increment the count on each write. In userspace, periodically read and display the map contents instead of streaming individual events. In-kernel aggregation like this is the foundation of counting tools such as bcc's syscount.
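For the counting exercise, it can help to prototype the aggregation logic in plain Rust before porting it to a BPF map. In the sketch below, the standard library's HashMap stands in for the BPF HashMap, and `record_write` and `top_writers` are hypothetical names: the first models what the eBPF program would do on each event, the second what userspace would do on each display tick.

```rust
use std::collections::HashMap;

/// Model of the kernel side: bump the per-PID counter for one write() call.
pub fn record_write(counts: &mut HashMap<u32, u64>, pid: u32) {
    *counts.entry(pid).or_insert(0) += 1;
}

/// Model of the user side: the `n` PIDs with the most writes,
/// highest count first, ties broken by lower PID.
pub fn top_writers(counts: &HashMap<u32, u64>, n: usize) -> Vec<(u32, u64)> {
    let mut rows: Vec<(u32, u64)> = counts.iter().map(|(&pid, &c)| (pid, c)).collect();
    rows.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    rows.truncate(n);
    rows
}
```

The amount of data crossing from kernel to userspace now scales with the number of distinct PIDs rather than the number of write() calls, which is the point of the exercise.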

What’s next

In Lab 2: Syscall Tracing, you will move beyond a single syscall and trace all system calls. You will switch from PerfEventArray to RingBuf, resolve syscall numbers to human-readable names, and start building a picture of what processes are actually asking the kernel to do.