Tracebit — Distributed tracing & OpenTelemetry pipeline

OTLP in · OTLP out · zero lock-in

Tracebit is the OpenTelemetry pipeline that sits between your services and your backend. It tail-samples on the full trace, collapses runaway cardinality, redacts what should never leave the cluster, and routes the spans worth keeping to any backend you choose — dropping ingest volume 10-30x before a single byte lands on a per-gigabyte invoice.

Drop-in OTLP gateway — your collectors point at us, nothing else changes
Tail-based sampling on the whole trace, not one span at a time
Route the same span to two backends and migrate off either, anytime

Runs in front of the backends platform teams already pay for

Halyard · Northgate Cloud · Driftwave · Coreseam · Tidewall · Lattice Systems

The data plane

A pipeline, not another place to store traces.

Tracebit doesn't want to be your tracing backend. It wants to be the layer in front of it — the one that decides what's signal, strips what's dangerous, and hands the rest to whatever you already pay for.

Speaks pure OTLP, both ends

Your services and collectors export OpenTelemetry over OTLP exactly as they do today. Tracebit receives it, transforms it, and emits OTLP downstream. No proprietary SDK, no agent to swap, no instrumentation to rewrite.

Stateless gateway you can scale flat

The pipeline runs as a horizontal fleet of stateless gateways. Add replicas to absorb a traffic spike, drain them on deploy, run them in your own VPC. There is no database to babysit and nothing that pins a trace to one node.

Backpressure that never drops blind

When a downstream backend slows or a burst arrives, Tracebit buffers to disk and applies backpressure upstream. If it has to shed load, it sheds the low-value tail first and keeps every errored trace, so you never lose the spans you actually came for.
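One way to picture "shed the low-value tail first" is a bounded buffer that evicts healthy traces before it ever touches an errored one. The sketch below is illustrative only: the `SheddingBuffer` class, its two-level priority, and the in-memory heap are assumptions, and it stands in for the disk buffering described above rather than reproducing it.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class SheddingBuffer:
    """Bounded buffer that sheds healthy traces first and never sheds errors."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Min-heap ordered by (priority, arrival order): healthy = 0, error = 1.
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = itertools.count()

    def offer(self, trace_id: str, is_error: bool) -> Optional[str]:
        """Add a trace; return the id of the trace that was shed, if any."""
        priority = 1 if is_error else 0
        heapq.heappush(self._heap, (priority, next(self._seq), trace_id))
        if len(self._heap) <= self.capacity:
            return None
        if self._heap[0][0] == 1:
            # Only errored traces remain. This sketch lets the buffer grow;
            # a real gateway would spill to disk at this point.
            return None
        # Shed the oldest healthy trace.
        return heapq.heappop(self._heap)[2]

    def ids(self) -> List[str]:
        return sorted(item[2] for item in self._heap)


buf = SheddingBuffer(capacity=2)
buf.offer("healthy-1", is_error=False)
buf.offer("error-1", is_error=True)
shed = buf.offer("error-2", is_error=True)  # over capacity: healthy-1 goes first
```

The design choice worth noting is that eviction order is decided by value, not arrival time alone, so a burst of healthy traffic cannot push an errored trace out of the buffer.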

Config as code, diffed and shadowed

Every pipeline is a versioned config you can plan, diff, and shadow against live traffic before it ships. See exactly which spans a rule would drop, on real data, before you apply it to production.
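As a rough illustration of what a shadow run computes, the sketch below replays recorded traces through the current rule and a candidate rule and reports which traces would change fate. The trace dictionaries, field names, and both rule functions are hypothetical; Tracebit's actual config format and plan output are not shown on this page.

```python
from typing import Any, Callable, Dict, List

Trace = Dict[str, Any]
Rule = Callable[[Trace], bool]  # returns True when the trace is kept


def shadow_diff(traces: List[Trace], current: Rule, candidate: Rule) -> Dict[str, List[str]]:
    """Compare two keep/drop rules on the same traffic without applying either."""
    diff: Dict[str, List[str]] = {"newly_dropped": [], "newly_kept": []}
    for trace in traces:
        before, after = current(trace), candidate(trace)
        if before and not after:
            diff["newly_dropped"].append(trace["trace_id"])
        elif after and not before:
            diff["newly_kept"].append(trace["trace_id"])
    return diff


# Hypothetical rules: today everything is kept; the candidate keeps only
# errors and traces slower than 500 ms.
def keep_all(trace: Trace) -> bool:
    return True


def keep_errors_and_slow(trace: Trace) -> bool:
    return bool(trace["error"]) or trace["duration_ms"] > 500.0


recorded = [
    {"trace_id": "t1", "error": False, "duration_ms": 40.0},
    {"trace_id": "t2", "error": True, "duration_ms": 35.0},
    {"trace_id": "t3", "error": False, "duration_ms": 900.0},
]
report = shadow_diff(recorded, keep_all, keep_errors_and_slow)
print(report)
```

Because neither rule is applied, the recorded traffic is untouched; the report is purely a preview of the candidate's effect.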

What moving the decision upstream actually buys

10-30x

Less ingest volume to your backend

100%

Of error & slow traces kept whole

< 9ms

Added p99 latency through the gateway

0

Lines of app code changed to adopt

Inside the pipeline

Sampling is a decision. Make it on the whole trace.

Head-based sampling guesses at the first span and throws away the trace that turned out to matter. Tracebit holds the trace until it's complete, then decides with the full picture — error, latency, route, and tenant all in hand.

Tail-based sampling, done right

Tracebit assembles every span of a trace in an in-memory window, waits for it to finish, and only then chooses to keep or drop it. Errors are kept unconditionally. Anything past your latency threshold is kept. Healthy traffic is sampled to a rate you set per route. A slow checkout never gets discarded because its first span looked boring — the decision is made once, on the trace as a whole, with zero orphaned spans downstream.
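The keep-or-drop rules described above can be sketched as a single function that runs once per completed trace. This is a minimal illustration, not Tracebit's implementation: the `Span` fields, the 500 ms default threshold, the per-route rate table, and the hash-based sampling are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    trace_id: str
    route: str
    duration_ms: float
    is_error: bool = False


def keep_trace(
    spans: List[Span],
    latency_threshold_ms: float = 500.0,           # assumed threshold
    route_rates: Optional[Dict[str, float]] = None,
    default_rate: float = 0.01,                    # assumed healthy-traffic rate
) -> bool:
    """Decide once for a whole, completed trace."""
    if not spans:
        return False
    # 1. Errors are kept unconditionally.
    if any(s.is_error for s in spans):
        return True
    # 2. Anything past the latency threshold is kept. The longest span is
    #    used here as a stand-in for the duration of the whole trace.
    if max(s.duration_ms for s in spans) > latency_threshold_ms:
        return True
    # 3. Healthy traffic is sampled at a per-route rate. Assumption: the
    #    first span in the list is the root span.
    root = spans[0]
    rate = (route_rates or {}).get(root.route, default_rate)
    # Hash the trace id into [0, 1) so the same trace always gets the same answer.
    digest = hashlib.sha256(root.trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < rate
```

Hashing the trace id instead of drawing a random number makes the decision repeatable: every span of a trace gets the same verdict, which is what prevents orphaned spans downstream.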

Cardinality control

Template runaway attributes at the source — collapse 40,000 raw http.url values into 38 route patterns, bucket unbounded IDs, drop labels that only ever blew up your index.
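To make the templating idea concrete, here is a small sketch that collapses raw URL values into route patterns by replacing numeric and UUID path segments with placeholders. The `{id}` and `{uuid}` placeholder names and the two patterns are assumptions chosen for the example; a real deployment would tune them to its own routes.

```python
import re
from urllib.parse import urlsplit

_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def template_url(raw_url: str) -> str:
    """Collapse a raw URL into a low-cardinality route pattern."""
    # Drop scheme, host, query string, and fragment; keep only the path.
    path = urlsplit(raw_url).path
    segments = []
    for segment in path.split("/"):
        if _NUMERIC.match(segment):
            segments.append("{id}")
        elif _UUID.match(segment):
            segments.append("{uuid}")
        else:
            segments.append(segment)
    return "/".join(segments)


raw_values = [
    "/orders/1234/items/9?expand=1",
    "/orders/77/items/3",
    "https://shop.example/orders/500/items/12",
]
patterns = {template_url(u) for u in raw_values}
```

Three distinct raw values collapse into one pattern here, which is the same effect, at small scale, as the reduction from tens of thousands of URLs to a few dozen routes.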

Redaction before the wire

PII never leaves the cluster. Drop or hash emails, tokens, card numbers, and auth headers inside the pipeline, so sensitive fields never reach a third-party backend at all.
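A minimal sketch of drop-or-hash redaction over a span's attribute map follows. The attribute keys, the salt, the truncated SHA-256 digest, and the email pattern are all illustrative assumptions; they are not Tracebit's rule syntax.

```python
import hashlib
import re
from typing import Any, Dict

# Hypothetical policy: which attribute keys to drop and which to hash.
DROP_KEYS = {"http.request.header.authorization", "card.number"}
HASH_KEYS = {"user.email"}
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(attributes: Dict[str, Any], salt: str = "example-salt") -> Dict[str, Any]:
    """Return a copy of the attributes that is safe to emit downstream."""
    clean: Dict[str, Any] = {}
    for key, value in attributes.items():
        if key in DROP_KEYS:
            continue  # dropped entirely, never emitted
        if key in HASH_KEYS:
            # Salted hash: values stay joinable without being readable.
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            clean[key] = digest[:16]
        elif isinstance(value, str):
            # Scrub email addresses embedded in free-text values.
            clean[key] = EMAIL_PATTERN.sub("[redacted-email]", value)
        else:
            clean[key] = value
    return clean


safe = redact({
    "user.email": "priya@example.com",
    "http.request.header.authorization": "Bearer abc123",
    "log.message": "receipt sent to bob@example.com",
    "http.status_code": 200,
})
```

Hashing rather than dropping keeps a field usable for grouping (the same email always maps to the same digest) while the raw value stays inside the cluster.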

Span enrichment

Attach deploy SHA, region, tenant tier, and Kubernetes metadata to every span as it passes through — so the trace that lands downstream already carries the context you'd otherwise join after the fact.

Live tap, full fidelity

Need every span for one tenant while you debug? Open a live tap on the pipeline and stream 100% of matching traces to your terminal — without touching the sampling everyone else is running.

The walk from firehose to bill

Where the 98% goes.

Most teams send everything to an expensive backend and pay to store traces no one will ever open. Here's what the pipeline does to that firehose before it reaches a meter.

01 · firehose

Eight million spans a minute, all priced

Every service exports every span straight to a per-gigabyte backend. 97% of it is healthy, fast, identical traffic that will never be queried — and all of it is on the invoice.

02 · tail-sample

Hold the trace, then judge it

Tracebit buffers each trace until it's complete, keeps every error and every slow outlier whole, and samples the healthy remainder to 1%. The decision is made on the full trace, not a coin-flip at span one.
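The arithmetic behind this step is short enough to check by hand. The sketch below assumes the 97% healthy share quoted in step 01 and the 1% sampling rate from this step, and treats every non-healthy trace as kept whole; real traffic mixes will differ.

```python
def kept_fraction(healthy_share: float, healthy_sample_rate: float) -> float:
    """Fraction of traffic that survives tail sampling."""
    kept_whole = 1.0 - healthy_share            # errors and slow outliers
    kept_sampled = healthy_share * healthy_sample_rate
    return kept_whole + kept_sampled


fraction = kept_fraction(healthy_share=0.97, healthy_sample_rate=0.01)
reduction = 1.0 / fraction
print(f"kept: {fraction:.2%}, reduction: {reduction:.1f}x")
```

With those inputs roughly 4% of traffic is kept, a reduction of about 25x, which sits inside the 10-30x range quoted elsewhere on this page.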

03 · transform

Collapse the cardinality, strip the PII

Forty thousand raw URLs become a few dozen route templates, unbounded IDs get bucketed, and emails and tokens are dropped inside the cluster — so nothing dangerous or index-busting ever leaves.

04 · route

Route the keepers, archive the rest

Error traces go to your premium backend, a full unsampled copy lands in object storage for cents, and the bill drops by an order of magnitude — with every trace that mattered still intact and queryable.
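The routing step amounts to a fan-out decision per trace. In the sketch below the destination names are hypothetical, and sending slow and sampled-in traces to the premium backend alongside errors is an assumption; the text above only specifies where error traces and the unsampled copy go.

```python
from typing import List

PREMIUM_BACKEND = "premium-backend"   # hypothetical destination name
OBJECT_STORAGE = "object-storage"     # hypothetical destination name


def destinations(is_error: bool, is_slow: bool, sampled_in: bool) -> List[str]:
    """Return every destination a completed trace should be sent to."""
    # The full unsampled copy always lands in cheap object storage.
    routes = [OBJECT_STORAGE]
    # Only the traces worth keeping reach the per-gigabyte backend.
    if is_error or is_slow or sampled_in:
        routes.append(PREMIUM_BACKEND)
    return routes
```

For example, `destinations(is_error=True, is_slow=False, sampled_in=False)` returns both destinations, while a healthy trace that was not sampled goes to object storage only.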

From the platform channel

The teams who own the telemetry bill run Tracebit.

“Our observability invoice was growing faster than our traffic. We put Tracebit in front of the same backend, turned on tail sampling, and the bill dropped 91% in a week. We didn't lose a single error trace — we just stopped paying to store success.”

Priya Venkat

Platform Lead, Halyard

“We were locked into one vendor because re-instrumenting everything to leave was a six-month project. Tracebit speaks OTLP both ends, so migrating became one line in a route file. We ran the new backend in shadow for a day and cut over with zero downtime.”

Marcus Hale

Staff Engineer, Northgate Cloud

“Security stopped blocking our tracing rollout the day they saw the redaction diff. PII gets dropped inside our own cluster before anything reaches a third party, and the config review shows them exactly which fields go. That review passing is what shipped the whole project.”

Dana Okwu

Principal SRE, Coreseam

Pricing

Priced on what you keep, not what you fire.

Charging per ingested gigabyte punishes you for instrumenting more — that's the bill Tracebit exists to kill. You pay for the spans the pipeline decides to keep, after sampling. The 98% it drops is free, because you should never have paid to store it.

Solo

For side projects and proving out the pipeline.

$0/mo

Up to 5M sampled spans/mo
Tail-based sampling & cardinality control
1 downstream backend route
OTLP in, OTLP out
Community Discord

Platform

For teams that own a real telemetry bill.

$0.18/M kept spans

Unlimited ingest, pay only on kept spans
Fan-out routing to unlimited backends
Redaction, enrichment & live tap
Per-route budgets & cost guardrails
Shadow mode & config diffs
Business-hours support
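For a rough sense of scale on the Platform plan, the sketch below prices a month of kept spans at the $0.18 per million rate. The inputs are assumptions borrowed from the firehose walkthrough: eight million spans a minute, a 30-day month, and about 4% of traffic kept (the 3% of non-healthy traffic kept whole plus 1% of the healthy 97%).

```python
def monthly_cost_usd(
    spans_per_minute: int,
    kept_fraction: float,
    usd_per_million_kept: float = 0.18,  # Platform plan rate
    days: int = 30,                      # assumed month length
) -> float:
    """Estimate the monthly charge for spans kept after sampling."""
    total_spans = spans_per_minute * 60 * 24 * days
    kept_spans = total_spans * kept_fraction
    return kept_spans / 1_000_000 * usd_per_million_kept


estimate = monthly_cost_usd(spans_per_minute=8_000_000, kept_fraction=0.0397)
print(f"${estimate:,.2f} per month")
```

Under these assumptions the estimate comes to roughly $2,470 a month. The dropped spans contribute nothing to the total, since only kept spans are billed.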

Enterprise

For regulated, multi-region, high-volume fleets.

Custom

Fully in-VPC or air-gapped gateways
Cold-storage replay & data residency
SSO, SCIM & immutable audit log
Signed-pipeline approvals
Dedicated pipeline engineer
99.9% uptime SLA, 24/7 support

Straight answers about the pipeline.

Is Tracebit a replacement for my tracing backend?

No — it sits in front of it. Tracebit is the OpenTelemetry pipeline: your collectors export OTLP to it, it samples, transforms, and redacts, then it emits OTLP to whatever backend you already run. You keep Honeycomb, Tempo, Datadog, Jaeger, or ClickHouse and put a smart, vendor-neutral data plane ahead of it.

How is tail-based sampling different from what my collector already does?

Head-based sampling decides at the first span, before it knows whether the trace errored or ran slow — so it routinely throws away the traces you most needed. Tracebit buffers the whole trace until it completes, then decides with full context: every error and every latency outlier is kept intact, and only healthy traffic is sampled down. No orphaned spans, no coin-flip at span one.

Will I lose the traces that actually matter?

That's the part the pipeline is built to guarantee. Error traces and traces over your latency threshold are kept whole, head to tail, every time. Sampling only ever thins the healthy, fast, repetitive traffic that no one queries. You can also keep a full unsampled copy in cheap object storage and replay it later if you want belt and suspenders.

How much will this actually cut my bill?

Most teams cut ingest volume to their backend by a factor of 10 to 30, because the overwhelming majority of production traces are healthy and duplicative. Run the pipeline in shadow mode against your live traffic and the plan shows you the projected bill, computed on your real spans, before you apply a single rule.

Does adopting it mean re-instrumenting my services?

No. Tracebit speaks native OpenTelemetry over OTLP on both ends, so the only change is pointing your existing collectors or exporters at the Tracebit endpoint. Zero lines of application code change, no SDK swap, and if it isn't earning its keep you point your collectors back — your telemetry is never held hostage.

Can sensitive data be stripped before it leaves my environment?

Yes. Redaction runs inside the pipeline, in your own cluster. Drop or hash emails, tokens, card numbers, and auth headers before any span is emitted downstream, so PII never reaches a third-party backend. The config diff shows reviewers precisely which fields are dropped, which is usually what gets a tracing rollout past security.

Point one collector at us. Watch the bill fall.

Stand up a gateway, send it a copy of live traffic, and run it in shadow against your real spans. In an afternoon you'll see the exact projected invoice — before you change a single line of production config.

Your traces are 98% noise. Stop paying to store it.