grepline
ingest · index · grep

Grepline ships every log line off every host, container, and Lambda into one full-text index — then lets you grep all of it in real time. The regex that worked on one box now works across ten thousand. First result before you finish typing the pattern.

  • One agent, every source — syslog to stdout
  • Sub-second search over a trillion lines
  • Pay per host, never per ingested gigabyte
grepline — live search
$ grep '5\d\d' service=checkout last=15m | stats count by host
scanning  1.2T lines  across  9,840 hosts ............ 312ms

  host                  status   count   Δ vs 1h
  checkout-7f4 (ca-1)    503      1,204   ▲ 1,180
  checkout-2b9 (ca-1)    503        118   ▲   96
  checkout-d31 (us-3)    200          —   nominal

[pattern]  all 503s carry  trace_id  + same build  v4.18.2
[narrow]   add  | where build=v4.18.2  → 1,322 lines
[live]     streaming new matches ......... ⏵ tailing
✓ saved as alert  "checkout 5xx spike"  · notify #oncall

On-call engineers grep here when prod gets loud

Northwind · Hexbyte · Cobalt Systems · Driftwave · Orbit Cloud · Tidemark

the index

grep was never the problem. The other 9,999 boxes were.

You already know how to find a line. Grepline just makes the haystack the whole fleet instead of one ssh session — fully indexed, always live, and fast enough to iterate on the pattern instead of waiting for it.

Full-text index, not a sampled bucket

Every line is tokenized and indexed on arrival — no nightly rollup, no sampling that drops the one log you needed. A raw substring, a regex, a field filter, or a structured query all return the same way: in milliseconds, over the entire retention window. The pattern you'd run on a single file runs against a trillion lines and comes back before your cursor blinks.

LQL — grep with a stats engine

Pipe matches into count, percentile, group-by, and rate without leaving the search bar. `| stats p99(latency) by route` works mid-incident — no export, no notebook, no waiting on a query job.
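
For intuition about what a stats stage computes, here is a minimal Python sketch of the semantics behind `| stats p99(latency) by route`, run over a few already-parsed records. It is not Grepline's engine: the record shape, the `p99_by_route` helper, and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def p99_by_route(records):
    """Group latency samples by route, then take the 99th percentile of each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["route"]].append(rec["latency"])
    return {route: percentile(samples, 99) for route, samples in groups.items()}

# Hypothetical parsed log records (latency in milliseconds).
records = (
    [{"route": "/cart", "latency": ms} for ms in range(1, 101)]
    + [{"route": "/pay", "latency": ms} for ms in (20, 35, 900)]
)
result = p99_by_route(records)
print(result)
```

With only three samples, the `/pay` p99 is simply its slowest request, which is why a percentile is worth reading next to a count.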

Live tail across the fleet

`tail -f` every matching host at once. New lines stream into the same query you're already reading — no refresh, no re-run.

Patterns auto-cluster the noise

Grepline collapses millions of near-identical lines into the handful of distinct shapes behind them, so a flood reads as five patterns, not five million rows.
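
The page does not say how the clustering works, so the sketch below uses a deliberately simple stand-in: mask the tokens that vary (numbers, long hex ids) and count the templates that remain. The mask patterns, the placeholders, and the sample lines are assumptions, not Grepline's algorithm.

```python
import re
from collections import Counter

# Replace tokens that vary between otherwise-identical lines with placeholders.
# Order matters: long hex ids are masked before plain numbers.
_MASKS = [
    (re.compile(r"\b[0-9a-f]{8,}\b"), "<hex>"),   # long hex ids such as trace ids
    (re.compile(r"\b\d+(?:\.\d+)?\b"), "<num>"),  # integers and decimals
]

def template(line):
    """Reduce a log line to its shape by masking variable tokens."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line

def cluster(lines):
    """Count lines per template, most frequent shape first."""
    return Counter(template(line) for line in lines).most_common()

# Hypothetical log lines.
lines = [
    "timeout after 3000 ms calling payments trace=9f31c2aa01",
    "timeout after 3012 ms calling payments trace=77ab03e4c9",
    "null user_id in cart 4412",
    "null user_id in cart 9",
    "timeout after 2999 ms calling payments trace=0c44d1ffe2",
]
shapes = cluster(lines)
for shape, count in shapes:
    print(count, shape)
```

Five lines collapse into two shapes here. The same idea is what turns a flood into a short ranked list.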

Save a query, get an alert

Any search becomes a live alert. Match count crosses your threshold and it fires to Slack, PagerDuty, or a webhook — with the matching lines attached, not just a number.
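
A threshold alert of this kind can be sketched as a sliding-window counter that keeps the matching lines so they can be attached when it fires. The `ThresholdAlert` class, its parameters, and the timestamps below are hypothetical. Delivery to Slack, PagerDuty, or a webhook is left out.

```python
from collections import deque

class ThresholdAlert:
    """Fires when more than `threshold` matches fall inside the trailing `window_s` seconds."""

    def __init__(self, threshold, window_s):
        self.threshold = threshold
        self.window_s = window_s
        self.matches = deque()  # (timestamp, line), oldest first

    def observe(self, ts, line):
        """Record one matching line; return the attached lines if the alert fires, else None."""
        self.matches.append((ts, line))
        # Evict matches that have aged out of the window.
        while self.matches and self.matches[0][0] <= ts - self.window_s:
            self.matches.popleft()
        if len(self.matches) > self.threshold:
            return [text for _, text in self.matches]
        return None

alert = ThresholdAlert(threshold=2, window_s=60)
fired = [alert.observe(ts, f"503 at t={ts}") for ts in (0, 10, 20, 200)]
```

The third match crosses the threshold and returns all three lines. The match at `t=200` arrives after the earlier ones have aged out, so it does not fire.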

What a real incident query looks like

  • 312ms · Median search over 1T lines
  • 9.8M · Lines ingested per second
  • 94% · Raw bytes shed by compression
  • 30d · Hot, fully searchable retention

one agent in

Point it at stdout. Walk away.

A single binary tails everything a host emits and ships it structured, compressed, and back-pressured. No per-source config marathon, no log-shipping sidecar zoo — just one agent and one endpoint.

Tails everything by default

Files, journald, syslog, Docker, Kubernetes, and Lambda — auto-discovered. New containers start streaming the moment they boot, no manifest edits.

Parses as it ships

JSON, logfmt, and common access logs are field-extracted at the edge, so `status`, `latency`, and `trace_id` are queryable the instant they land.
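
To make field extraction concrete, here is a minimal logfmt parser in Python. It is a sketch, not the agent's parser: `parse_logfmt` and the sample line are hypothetical, and it leans on the standard library's `shlex.split` to handle quoted values, so a line with an unbalanced quote raises `ValueError`.

```python
import shlex

def parse_logfmt(line):
    """Split a logfmt line into a dict of fields; bare keys map to an empty string."""
    fields = {}
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields

# Hypothetical log line.
line = 'level=error status=503 latency=412ms trace_id=9f31c2 msg="upstream timed out"'
fields = parse_logfmt(line)
print(fields["status"], fields["msg"])
```

All values come back as strings. Typing them (for example turning `412ms` into a duration) would be a separate step.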

Never drops a line under load

Local disk buffering and adaptive back-pressure ride out network blips and ingest spikes. The agent degrades gracefully; it does not silently lose logs.
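
The buffering idea can be sketched as a bounded in-memory queue that overflows to a spill area instead of dropping lines, and reports pressure while the spill is in use. Everything here is an assumption for illustration: `SpillBuffer` is a hypothetical name, and the spill is an in-memory deque standing in for a real on-disk segment.

```python
from collections import deque

class SpillBuffer:
    """Bounded in-memory queue that overflows to a spill area instead of dropping lines."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit
        self.memory = deque()
        self.spill = deque()  # stands in for an on-disk segment file

    def push(self, line):
        # Once anything has spilled, keep spilling so arrival order is preserved.
        if self.spill or len(self.memory) >= self.memory_limit:
            self.spill.append(line)
        else:
            self.memory.append(line)

    def drain(self, batch_size):
        """Return up to batch_size lines in arrival order, refilling memory from the spill."""
        batch = []
        while len(batch) < batch_size and self.memory:
            batch.append(self.memory.popleft())
            if self.spill:
                self.memory.append(self.spill.popleft())
        return batch

    def pressure(self):
        """True while the spill is in use, a signal for producers to slow down."""
        return bool(self.spill)

buf = SpillBuffer(memory_limit=2)
for i in range(5):
    buf.push(f"line {i}")
under_pressure = buf.pressure()
first = buf.drain(3)
rest = buf.drain(10)
```

Every pushed line comes back out, in order, once the shipper catches up.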

Lives in your shell too

The whole query engine ships as a CLI. Pipe a live Grepline search into `grep`, `jq`, `awk`, or a dashboard — same results, no browser required.

queries that earn their keep

Stop ssh-ing. Start asking.

Real questions on-call asks at 3 a.m., answered in one Grepline line instead of a war room.

incident

Find the build behind the 5xx wave

`grep '5\d\d' last=30m | stats count by build` — the regression names the release before anyone opens the deploy log.

debug

Trace one request across every service

`field trace_id=9f31c2` returns every line that request touched, in order, across all hosts — one timeline, zero tab-switching.
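
What that query returns amounts to a filter plus a sort. The Python sketch below shows the idea over a few parsed records. The record fields (`ts`, `host`, `msg`) and the `trace_timeline` helper are assumptions, and a real fleet would also have to deal with clock skew between hosts.

```python
def trace_timeline(records, trace_id):
    """Keep records for one trace and order them by timestamp, regardless of host."""
    hits = [r for r in records if r.get("trace_id") == trace_id]
    return sorted(hits, key=lambda r: r["ts"])

# Hypothetical parsed records arriving out of order from different hosts.
records = [
    {"ts": 3, "host": "payments-1", "trace_id": "9f31c2", "msg": "charge declined"},
    {"ts": 1, "host": "gateway-4", "trace_id": "9f31c2", "msg": "POST /checkout"},
    {"ts": 2, "host": "gateway-4", "trace_id": "aa0000", "msg": "GET /health"},
    {"ts": 2, "host": "checkout-7f4", "trace_id": "9f31c2", "msg": "calling payments"},
]
timeline = [(r["host"], r["msg"]) for r in trace_timeline(records, "9f31c2")]
for host, msg in timeline:
    print(host, msg)
```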

perf

Catch the slow query before users do

`source=postgres | where duration>2s | stats count by query` surfaces the full-table scan the moment it starts hurting.

security

Audit who touched the secrets store

`source=vault action=read path~='secret/prod/*'` — a searchable, retained access trail, no SIEM project required.

release

Watch a deploy roll out, live

`tail build=v4.19.0` streams matching lines as the new version reaches each host — promote or roll back on evidence.

hygiene

Quantify the noise before you mute it

`level=warn last=24h | stats count by pattern` ranks the loudest log lines so you fix the worst offenders, not the random ones.

from the incident channel

The first query usually ends the page.

We used to fan out four engineers across four ssh sessions, each tailing a different box. Now one person greps the whole fleet from one bar and reads the answer out loud. Our 5xx incidents went from a war room to a one-liner.

Priya Anand
Staff SRE, Northwind

The pattern clustering is the feature nobody believes until a flood hits. Two million error lines collapsed into five shapes, and four of them were the same null check. We shipped the fix before the alert even re-fired.

Marcus Reid
Platform Lead, Hexbyte

Our old log bill scaled with traffic, so we sampled — and of course we sampled away the exact lines we needed during the worst outage of the year. Grepline charges per host, so we index everything now and just grep it.

Dana Okafor
Director of Infrastructure, Cobalt Systems

pricing

Priced per host. grep all you want.

Per-gigabyte log pricing makes you choose between visibility and your budget. We charge for the hosts you run, so you can index every line and never watch the meter mid-incident.

Hobby

For side projects and a handful of boxes.

$0/forever
  • Up to 5 hosts
  • Full-text search & live tail
  • 3-day retention
  • LQL + CLI
  • Community Discord

Team · Most popular

For teams that live on the page.

$22/host/mo
  • Unlimited ingest per host
  • 30-day hot retention
  • Pattern clustering
  • Unlimited saved-search alerts
  • Slack, PagerDuty & webhooks
  • Business-hours support

Enterprise

For regulated, multi-region, high-volume fleets.

Custom
  • In-VPC or single-tenant deploy
  • Custom retention & data residency
  • Bring-your-own S3 / R2 / GCS
  • SSO, SCIM & audit logs
  • Dedicated reliability engineer
  • 99.9% uptime SLA

Straight answers for the on-call.

Is this just hosted grep, or a real index?

Both, and that's the point. Every line is tokenized and indexed the moment it arrives, so a raw substring, a regex, or a structured field filter all return in milliseconds over the full retention window — not just over the last few minutes you happened to be tailing. You get grep's mental model with an index's speed across the entire fleet.

How do I get my logs in?

One agent. A single binary auto-discovers files, journald, syslog, Docker, Kubernetes, and Lambda, parses JSON, logfmt, and common access logs at the edge, and ships everything compressed and back-pressured to one endpoint. New containers start streaming the moment they boot, so there's no per-source config to maintain and no shipping sidecar to babysit.

Why per host instead of per gigabyte?

Because per-gigabyte pricing punishes you for the visibility you need most, exactly when you need it. Teams on usage-based plans sample logs to control cost and then lose the one line that mattered during an outage. We price per host so you can index every line from every box and your bill never spikes with your traffic.

What is LQL?

Grepline's query language — grep semantics with a stats engine bolted on. Start with a pattern or a field filter, then pipe matches into count, group-by, percentiles, and rates: `grep timeout last=1h | stats p99(latency) by route`. It reads like the shell pipeline you'd already write, runs live across the whole fleet, and any LQL query can be saved as an alert or run from the CLI.

How fast is search at real scale?

Median full-text queries return in roughly 300 milliseconds over a trillion indexed lines, and live tail streams new matches in real time. It's fast enough to iterate on the pattern interactively — narrow, widen, re-group — instead of submitting a query and waiting for a job to finish.

Can I keep my data in my own environment?

Yes. Run Grepline fully inside your VPC or in an isolated single-tenant cloud, point storage at your own S3, R2, or GCS bucket, and pin data residency by region. Logs are encrypted in transit and at rest, you can purge any stream on demand, and Enterprise adds SSO, SCIM, and full audit logs.

Your next outage is going to generate the logs anyway.

Drop one agent on one host and grep it in about five minutes. Free up to five boxes, no card, no sales call — keep it if the first query saves you a war room.