kerntrace
eBPF · CO-RE · zero instrumentation

Kerntrace attaches eBPF probes to the running kernel and streams every syscall, page fault, TCP retransmit, and off-CPU stall straight off your fleet: no agent to install, no sidecar, no recompile. Overhead is measured in nanoseconds, and every probe is proven safe by the kernel's own verifier before a single instruction runs.

  • CO-RE: compile once, run on every kernel 5.4 → 6.x
  • No kernel modules, no instrumentation, no app restart
  • The verifier proves each probe safe before it loads
kerntrace — live syscall stream
$ kerntrace attach --probe=syscall:openat --where 'comm == "checkout-svc"'
[verifier] program accepted — 412 insns, 0 unbounded loops, 184B stack
[loaded]   tracepoint/syscalls/sys_enter_openat attached to 38 hosts in 1.2s

ts            host        pid    latency   ret      path
12:04:07.118  api-3       2241     14µs    0        /etc/ssl/certs/ca.pem
12:04:07.119  api-3       2241    9.8ms    ESTALE   /var/lib/secrets/db.key  ←
12:04:07.121  worker-7    8830     11µs    0        /tmp/upload-7f3a.part

[anomaly]  openat p99 +1,920% on api-3  —  ESTALE on a stale NFS mount, not your code
[culprit]  blocking read against a dead NFS handle — 9.8ms/call, 4.1k calls/min
→ off-CPU flamegraph?   trace the fd leak?   pin this probe to Grafana?

Running in production at teams that live below the syscall boundary

Northgate Cloud · Halyard · Coreseam · Bitfrost · Tidewall · Lattice Systems

What you can see

Everything the kernel knows, nothing your app has to tell it.

Application metrics stop at the edge of your process. Kerntrace starts where they end — in the kernel, where the real stalls, drops, and blocked I/O actually live.

Every syscall, attributed

kprobes and tracepoints on the syscall boundary capture openat, read, write, connect, futex, and the other 300-plus calls — each tagged with PID, container, cgroup, and the exact argument that hurt.

Off-CPU, not just on-CPU

Most profilers only show you CPU burning. Kerntrace shows the time your threads spend asleep — blocked on locks, disk, or the network — and stacks it into an off-CPU flamegraph built from scheduler events.
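For the curious, here is roughly how scheduler events turn into an off-CPU flamegraph. This is a minimal Python sketch, not Kerntrace's implementation: the event tuple layout, the `switch_out`/`switch_in` labels, and the frame names are all assumptions made for illustration. It pairs each switch-out with the next switch-in on the same thread and charges the gap to the stack that went to sleep.

```python
from collections import defaultdict

def fold_off_cpu(events):
    """Sum off-CPU nanoseconds per blocked stack.

    events: iterable of (ts_ns, tid, kind, stack) ordered by ts_ns, where kind
    is "switch_out" (thread left the CPU) or "switch_in" (thread resumed).
    stack is a tuple of frames, outermost first; it is only read on switch_out.
    """
    blocked = {}                # tid -> (ts_ns, stack) captured at switch-out
    totals = defaultdict(int)   # stack -> accumulated off-CPU ns
    for ts_ns, tid, kind, stack in events:
        if kind == "switch_out":
            blocked[tid] = (ts_ns, stack)
        elif kind == "switch_in" and tid in blocked:
            start, out_stack = blocked.pop(tid)
            totals[out_stack] += ts_ns - start
    return dict(totals)

def to_folded(totals):
    """Render 'frame;frame;frame value' lines, the folded-stack text format."""
    return [f"{';'.join(stack)} {ns}" for stack, ns in sorted(totals.items())]

# Hypothetical events: one thread stuck in an NFS read, one briefly on a futex.
events = [
    (1_000, 7, "switch_out", ("main", "handle_request", "read", "nfs_file_read")),
    (2_000, 9, "switch_out", ("main", "worker", "futex_wait")),
    (2_500, 9, "switch_in", ()),
    (10_801_000, 7, "switch_in", ()),
]
folded = to_folded(fold_off_cpu(events))
```

Each line in `folded` is one stack and its total sleep time, which is the input that flamegraph renderers typically consume.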

The network, kernel-side

Trace TCP retransmits, connection resets, and SYN-to-accept latency straight from the socket layer. Catch the tail latency that never shows up in your application's request timer.

Userspace probes too

uprobes and USDT markers attach to your own binaries and to libssl, libc, or the JVM — read a function's arguments and return value with no debugger, no restart, no recompile.

Page faults & memory pressure

Watch major faults, OOM-kill decisions, and slab churn as the kernel makes them. Find the service quietly thrashing the page cache before it takes a node down with it.

File-descriptor & I/O leaks

Correlate an open with no matching close, sockets wedged in CLOSE_WAIT, and growing fd tables back to the line of code and the request that opened them.
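The correlation described above can be sketched in a few lines of Python. Treat it as an illustration under assumptions rather than the product's logic: the event shape `(pid, op, fd, detail)` is hypothetical, and `detail` stands in for whatever context was captured at open time, such as a path or request ID.

```python
def find_fd_leaks(events):
    """Return opens that never saw a matching close.

    events: iterable of (pid, op, fd, detail) in time order, where op is
    "open" or "close". detail is only meaningful on "open".
    """
    live = {}                            # (pid, fd) -> detail recorded at open
    for pid, op, fd, detail in events:
        if op == "open":
            live[(pid, fd)] = detail     # fd numbers are reused after a close
        elif op == "close":
            live.pop((pid, fd), None)    # ignore closes whose open we never saw
    return live

# Hypothetical trace for one process: fd 6 is opened and closed twice,
# fd 5 is opened and never closed.
events = [
    (8830, "open", 5, "/tmp/upload-7f3a.part"),
    (8830, "open", 6, "/etc/ssl/certs/ca.pem"),
    (8830, "close", 6, None),
    (8830, "open", 6, "/var/lib/secrets/db.key"),
    (8830, "close", 6, None),
    (8830, "close", 9, None),
]
leaks = find_fd_leaks(events)
```

Keying on `(pid, fd)` rather than `fd` alone matters because descriptor numbers are per-process and get recycled as soon as they are closed.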

What 'production-safe' actually measures

87 ns
Overhead per syscall probe
1.4M/s
Events streamed per host, zero drops
5.4 → 6.x
Kernel range, one CO-RE binary
< 1s
Probe attach across the whole fleet
How it stays safe

Tracing the kernel used to mean risking it.

Kernel modules can panic a box. ptrace stops the world. Kerntrace runs inside the eBPF sandbox the kernel itself enforces — every program is proven bounded before it is ever allowed to execute.

The verifier is your safety net

Before a probe loads, the in-kernel eBPF verifier walks every instruction path and proves it terminates, never reads out-of-bounds memory, and respects a fixed stack. A program that can't be proven safe simply does not load. Kerntrace surfaces the verifier's verdict — instruction count, bounded loops, stack depth — so every probe you ship carries the kernel's own guarantee.
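If you want to gate probes in CI on that verdict, a check could look something like the Python sketch below. It assumes the summary text matches the sample terminal output on this page (`412 insns, 0 unbounded loops, 184B stack`); the real output format may differ. The 512-byte figure is the kernel's eBPF stack limit, while `max_insns` is a made-up team budget, not a kernel rule.

```python
import re

# Pattern modeled on the sample terminal output; the real format is an assumption.
VERDICT = re.compile(r"(\d+) insns, (\d+) unbounded loops, (\d+)B stack")
BPF_STACK_LIMIT = 512   # bytes; the kernel's per-program eBPF stack limit

def parse_verdict(line):
    """Extract instruction count, unbounded-loop count, and stack depth."""
    m = VERDICT.search(line)
    if m is None:
        raise ValueError(f"no verdict summary in: {line!r}")
    insns, unbounded, stack = map(int, m.groups())
    return {"insns": insns, "unbounded_loops": unbounded, "stack_bytes": stack}

def within_policy(verdict, max_insns=4096):
    """Apply a team policy on top of the verdict.

    max_insns is a hypothetical budget chosen for illustration.
    """
    return (verdict["unbounded_loops"] == 0
            and verdict["stack_bytes"] <= BPF_STACK_LIMIT
            and verdict["insns"] <= max_insns)

verdict = parse_verdict("412 insns, 0 unbounded loops, 184B stack")
ok = within_policy(verdict)
```

The verifier has already accepted anything that loads, so a gate like this is only useful for enforcing limits tighter than the kernel's own.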

CO-RE, not a module per kernel

Compile Once, Run Everywhere uses BTF type information to relocate field offsets at load time. One binary runs across kernels 5.4 through 6.x, with no headers to chase and no module to rebuild per host.
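The idea behind load-time relocation fits in a toy model. The Python sketch below is not libbpf and not Kerntrace code: it assumes a program that records field accesses by name, and a loader that looks each name up in the running kernel's type information. The struct offsets are invented for illustration; real ones come from each host's BTF.

```python
def relocate(accesses, target_btf):
    """Resolve symbolic (struct, field) accesses to byte offsets for one kernel.

    accesses: list of (struct_name, field_name) recorded at compile time.
    target_btf: {struct_name: {field_name: offset}} for the running kernel.
    """
    resolved = []
    for struct, field in accesses:
        try:
            resolved.append(target_btf[struct][field])
        except KeyError:
            # A field that does not exist on this kernel cannot be relocated.
            raise LookupError(f"{struct}.{field} not in target type info") from None
    return resolved

# Made-up offsets standing in for two different kernel builds.
kernel_a = {"task_struct": {"pid": 1256, "comm": 1680}}
kernel_b = {"task_struct": {"pid": 2384, "comm": 2960}}

# The same compiled program, expressed as the fields it reads.
accesses = [("task_struct", "pid"), ("task_struct", "comm")]
offsets_a = relocate(accesses, kernel_a)
offsets_b = relocate(accesses, kernel_b)
```

The program itself never changes between hosts; only the resolved offsets do, which is why one binary can cover many kernel versions.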

Ring buffers, never the wire

Events stream through the perf ring buffer in-kernel and are aggregated on the host before anything leaves it. Cardinality and volume are bounded at the source — not on your network bill.
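One way to picture bounding cardinality at the source is a per-host roll-up with a hard cap on distinct keys. The Python sketch below is an assumption about the general technique, not a description of Kerntrace internals; the class name, the cap, and the `(syscall, result)` key shape are hypothetical.

```python
from collections import Counter

class BoundedAggregator:
    """Count events per key, capping the number of distinct keys tracked."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.counts = Counter()
        self.overflow = 0        # events whose key arrived after the cap was hit

    def add(self, key):
        if key in self.counts or len(self.counts) < self.max_keys:
            self.counts[key] += 1
        else:
            self.overflow += 1   # keep the volume, drop the label

    def flush(self):
        """Return (counts, overflow) for this interval and reset."""
        out = (dict(self.counts), self.overflow)
        self.counts, self.overflow = Counter(), 0
        return out

# Hypothetical interval: seven raw events collapse to two keys plus an overflow.
agg = BoundedAggregator(max_keys=2)
for key in [("openat", "0"), ("openat", "ESTALE"), ("openat", "0"),
            ("read", "0"), ("openat", "ESTALE"), ("connect", "0"), ("openat", "0")]:
    agg.add(key)
counts, overflow = agg.flush()
```

What leaves the host is the flushed summary, so its size depends on `max_keys` and not on how many raw events fired.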

Read-only by default

Probes observe; they don't mutate. No kernel module to taint your kernel, no LD_PRELOAD shim, nothing that can wedge a running process.

Graceful under load

When a host gets hot, sampling backs off on its own and probes fall back to bumping a counter instead of emitting a full event, so observability never becomes the incident.
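A backoff policy of this kind can be sketched as follows. This Python example is a guess at the general shape, with a hypothetical per-interval budget and a doubling period; the actual thresholds and mechanism Kerntrace uses are not documented here.

```python
class BackoffSampler:
    """Emit 1 in `period` events in full and count the rest.

    The period doubles after a hot interval and halves after a quiet one.
    """

    def __init__(self, budget_per_tick, max_period=1024):
        self.budget = budget_per_tick   # full events allowed per interval
        self.max_period = max_period
        self.period = 1
        self.seen = 0                   # events observed this interval
        self.emitted = 0                # full events emitted this interval
        self.counted_only = 0           # events reduced to a counter, all time

    def observe(self):
        """Return True if this event should be emitted in full."""
        self.seen += 1
        if self.seen % self.period == 0 and self.emitted < self.budget:
            self.emitted += 1
            return True
        self.counted_only += 1
        return False

    def tick(self):
        """Call once per interval to adjust the period and reset counters."""
        if self.seen > self.budget * self.period:
            self.period = min(self.period * 2, self.max_period)
        elif self.period > 1 and self.seen < self.budget * self.period // 2:
            self.period //= 2
        self.seen = self.emitted = 0

# One hot interval: 1000 events against a budget of 100 full events.
sampler = BackoffSampler(budget_per_tick=100)
full = sum(sampler.observe() for _ in range(1000))
sampler.tick()
```

After that interval `full` is 100, the other 900 events survive only as a count, and the period has doubled to 2 for the next interval.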

From red graph to root cause

The walk from symptom to syscall.

Every Kerntrace investigation ends at a specific kernel event, a specific PID, and the exact argument that caused it. Here is the path it walks for you.

01 · symptom

A latency spike with no app-side cause

p99 on checkout jumps to 9ms. Your APM shows the request handler sitting idle the whole time — the latency is being spent somewhere your instrumentation can't see.

02 · attach

Drop below the syscall boundary

Kerntrace attaches a syscall probe scoped to that one service and immediately catches the threads blocked in a read() against a stale NFS mount — not in your code at all.

03 · prove

Off-CPU flamegraph confirms the stall

The off-CPU flamegraph shows 94% of wall-clock time asleep in the VFS layer, stacked under a single blocking read. The kernel was the bottleneck the whole time.

04 · verify

Fixed, then watched

Remount with a sane timeout and the same probe shows openat latency back at 14µs. The alert auto-resolves, and the probe stays pinned to catch the next regression before a human does.

From the on-call channel

The engineers who page at 3 a.m. run Kerntrace.

We spent two weeks blaming our own code for a latency spike. Kerntrace found it in twenty minutes — a stale NFS mount blocking in read(). It was never our code. It was always the kernel.

Priya Venkat
Staff SRE, Northgate Cloud

No agent, no restart, no recompile. We attached a uprobe to our running payment binary in production and watched the exact argument that was timing out. That used to be a deploy-and-pray exercise.

Marcus Hale
Principal Engineer, Halyard

The verifier output is what got it past our kernel team. They could read, line by line, that the program was bounded before it ever loaded. That is how you trace prod without flinching.

Dana Okwu
Platform Lead, Coreseam
Pricing

Priced per host. Probe as deep as you want.

We don't meter events or charge per gigabyte — that would punish you for tracing more. You pay for the hosts you run, and the kernel does the aggregation before anything costs you a cent.

Hacker

For homelabs, side projects, and learning eBPF.

$0/mo
  • Up to 5 hosts
  • Syscall, network & off-CPU probes
  • bpftrace-compatible one-liners
  • 7-day event retention
  • Community Discord
Most popular

Fleet

For teams running real production at scale.

$24/host/mo
  • Unlimited hosts & probes
  • uprobes, USDT & userspace tracing
  • Off-CPU flamegraphs & FD-leak tracking
  • 13-month retention
  • Grafana, PagerDuty & Slack
  • Business-hours support

Kernel

For regulated, air-gapped, and multi-region fleets.

Custom
  • Fully in-VPC or air-gapped deploy
  • Custom retention & data residency
  • SSO, SCIM & immutable audit log
  • Signed-probe allowlisting
  • Dedicated kernel engineer
  • 24/7 priority support

Straight answers, kernel-deep.

Will eBPF probes slow down or crash production?

No. Every program passes the in-kernel verifier, which proves it is bounded, terminating, and memory-safe before it loads — an unsafe program never runs. Probes are read-only, overhead is around 87 nanoseconds per syscall, and sampling backs off automatically under load. There is no kernel module to taint or panic your kernel.

Do I have to instrument my code or install an agent?

No. Kerntrace attaches directly to the running kernel via kprobes, tracepoints, and uprobes. There is nothing to add to your application, no library to import, no sidecar, and no restart. You attach a probe to a process that is already running and start seeing events immediately.

Which kernels and distros are supported?

Any Linux kernel from 5.4 onward with BTF enabled — which covers modern Ubuntu, Debian, RHEL/Rocky, Amazon Linux, and most managed Kubernetes nodes. Thanks to CO-RE, a single Kerntrace binary relocates against each host's own type information, so you never compile or ship a module per kernel version.

How is this different from an APM or a metrics tool?

APMs trace inside your process and stop at the syscall boundary. Kerntrace starts there. It sees the time spent off-CPU, the blocking I/O, the TCP retransmits, and the page faults your application timer is blind to — the parts of a slow request that aren't your code at all.

Can I use my existing bpftrace one-liners?

Yes. Kerntrace speaks bpftrace-style one-liners, so the probes you already write in an ad-hoc session become saved, fleet-wide probes with retention and alerting. You keep the language you know and get a control plane around it.

Where does my trace data go?

Events are aggregated in-kernel through the perf ring buffer and rolled up on each host before anything leaves it. You can run Kerntrace entirely inside your own VPC or fully air-gapped, pin data residency by region, and export raw events whenever you like.

The next 3 a.m. page is already in the kernel.

Attach your first probe in under a minute — no agent, no restart, no recompile. Watch the syscalls your dashboards have never shown you.