Lineage — Data catalog & column-level lineage

Lineage

The map of your data

Lineage reads your warehouse, your dbt models, and your BI layer, then draws the path every column travels — from raw source to the number on the dashboard. So the next time someone asks "where does this metric come from" or "what breaks if I drop this column," the answer is a graph you can click through, not a Slack thread that dies unanswered.

Connects in an afternoon, read-only
Column-level, not just table-level
Your data never leaves your cloud

Mapping the warehouses behind data teams that ship daily

Brightwater · Cobalt Logistics · Meridian Retail · Northwind Analytics · Halcyon Health · Feldspar

The platform

A catalog that knows how every column connects.

A data catalog that only lists tables is a phone book with no map. Lineage parses the SQL itself, so the catalog doesn't just tell you a column exists — it shows you where it was born, every transform it passed through, and everything downstream that depends on it.

Column-level lineage, parsed from your SQL

Lineage reads the actual queries behind your models and views and resolves the path of every individual column, including joins, CTEs, window functions, and CASE expressions. You see that revenue_usd is built from stripe.charges.amount through four transforms, not just that two tables are vaguely related.
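To make the idea concrete, here is a minimal sketch of what a resolved column path could look like as data. It is an illustration, not Lineage's implementation: only stripe.charges.amount and revenue_usd come from the text above, while the intermediate model names, the edge list, and the function names are assumptions. The sketch also assumes the graph has no cycles.

```python
from collections import defaultdict

# Hypothetical column-level edges: (upstream column, downstream column).
# The intermediate models and columns are invented for illustration.
EDGES = [
    ("stripe.charges.amount", "stg_charges.amount_cents"),
    ("stg_charges.amount_cents", "int_payments.amount_usd"),
    ("int_payments.amount_usd", "fct_orders.net_amount_usd"),
    ("fct_orders.net_amount_usd", "fct_revenue.revenue_usd"),
]


def build_parents(edges):
    """Index the edges so each column maps to the columns it is built from."""
    parents = defaultdict(list)
    for upstream, downstream in edges:
        parents[downstream].append(upstream)
    return parents


def trace_to_sources(column, parents):
    """Return every path from `column` back to a column with no upstream."""
    upstreams = parents.get(column, [])
    if not upstreams:
        return [[column]]
    paths = []
    for upstream in upstreams:
        for path in trace_to_sources(upstream, parents):
            paths.append([column] + path)
    return paths


parents = build_parents(EDGES)
paths = trace_to_sources("fct_revenue.revenue_usd", parents)
for path in paths:
    print(" <- ".join(path))
```

Each edge stands for one transform, so the single path printed here has four hops between the dashboard-facing column and the raw source column.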

Impact analysis before you merge

Before you drop a column or rename a model, Lineage shows you everything downstream that would break — every dbt model, dashboard, and reverse-ETL sync that touches it. Ship the change knowing the blast radius, instead of finding out when the CFO's report goes blank.

A search box for your whole warehouse

Type a metric, a column, or a half-remembered table name and find it across every schema, with its definition, owner, freshness, and lineage attached. The answer to "do we already have this number" takes seconds, so analysts stop rebuilding tables that already exist.

Docs that write themselves

Lineage harvests descriptions from dbt, column comments, and the queries themselves, then keeps them in sync as the warehouse changes. Documentation stops being a quarter-old wiki nobody trusts and becomes a live reflection of what's actually running.

Trust signals on every asset

Certify the tables your team should build on, flag the ones being deprecated, and surface freshness and test status right in the catalog. People learn which numbers to trust at a glance, instead of guessing between three tables that all claim to hold revenue.

Popularity that points you to truth

Lineage reads query history to show which tables and columns actually get used and by whom. The asset 200 queries depend on rises to the top; the abandoned copy nobody has touched in a year stops masquerading as a source of truth.
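As a rough sketch of the ranking idea, the snippet below counts queries and distinct users per table from a query log. The log records, table names, and the rank_by_usage function are all hypothetical, and real query history would need SQL parsing to find which tables each query reads.

```python
from collections import Counter, defaultdict

# Hypothetical catalog of known tables and query-history records.
KNOWN_TABLES = [
    "analytics.fct_orders",
    "analytics.fct_revenue",
    "scratch.orders_copy_2023",
]
QUERY_LOG = [  # (user, table read by the query)
    ("ana", "analytics.fct_orders"),
    ("ben", "analytics.fct_orders"),
    ("ana", "analytics.fct_orders"),
    ("cy", "analytics.fct_revenue"),
]


def rank_by_usage(tables, log):
    """Rank tables by query count, then by distinct users, then by name."""
    queries = Counter(table for _, table in log)
    users = defaultdict(set)
    for user, table in log:
        users[table].add(user)
    ranked = sorted(tables, key=lambda t: (-queries[t], -len(users[t]), t))
    return [(t, queries[t], len(users[t])) for t in ranked]


ranking = rank_by_usage(KNOWN_TABLES, QUERY_LOG)
for table, query_count, user_count in ranking:
    print(f"{table}: {query_count} queries, {user_count} users")
```

Tables that never appear in the log still show up in the ranking with zero queries, which is what lets an abandoned copy sink to the bottom instead of disappearing from view.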

What changes when lineage is a graph, not a guess

Column

Lineage resolved to the field, not just table to table

< 3s

To trace any column from dashboard back to its source

Every PR

Gets a downstream-impact check before it merges

~1 hr

From first read-only connection to a live graph

Sits beside your warehouse, not in front of it

Metadata in. Never your data.

Lineage is read-only by design. It connects to the systems you already run, parses query logs and schemas to build the map, and works entirely with metadata — your rows never leave your environment.

Connects to what you run

Native connectors for Snowflake, BigQuery, Databricks, Redshift, and Postgres, plus deep dbt, Looker, Tableau, and Fivetran integrations. Point Lineage at a warehouse and the graph builds itself from your existing query history.

Reads metadata, not rows

Lineage parses schemas, query logs, and model definitions to draw the map. It never reads, copies, or stores the contents of your tables — the actual data stays exactly where it lives.

Lives in your workflow

A GitHub check posts the downstream impact of every pull request, a Slack app answers lineage questions inline, and a typed API and webhooks wire the catalog into the rest of your platform.

Stays current on its own

The graph refreshes as your warehouse changes — new models, renamed columns, dropped tables — so the catalog reflects production today, not the day someone last ran the crawler.

Use cases

One graph, every question a data team dreads.

The hard questions in a data team rarely have a fast answer. Lineage turns each one into a click-through on the same map of how your data actually flows.

Impact analysis

"What breaks if I change this?"

See every downstream model, dashboard, and sync that depends on a column before you touch it, so a refactor stops being a leap of faith.

Root-cause tracing

"Where does this number come from?"

Click any metric and walk its lineage all the way back to the raw source column, resolving disputes about whose revenue figure is right in minutes.

Incident triage

"Why is this dashboard wrong?"

When a number looks off, trace upstream to the model or source that broke and find the failed load or bad transform without grepping a dozen logs.

Discovery

"Do we already have this?"

Search every schema for a metric before building it, so analysts reuse the certified table instead of spinning up the fourth copy of orders.

Governance

"Who owns this table?"

Every asset carries its owner, freshness, and certification status, so questions land with the right person instead of bouncing around #data-help.

Cleanup

"Is this safe to deprecate?"

Find the stale tables nothing queries and the ones half the company depends on, and retire dead assets with confidence instead of fear.

Customers

Data teams that stopped guessing how their data connects.

“A renamed column used to mean a week of broken dashboards we discovered one angry Slack message at a time. Now the GitHub check tells us exactly what a change breaks before we merge. Our analytics refactors went from terrifying to routine.”

Maya Lindqvist

Head of Data Platform, Brightwater

“Three teams each had a 'revenue' table and none of them agreed. We traced all three back through Lineage, found the one transform where they diverged, and finally have a single number the whole company trusts. That alone paid for it.”

Daniel Osei

Director of Analytics Engineering, Cobalt Logistics

“When a board metric looks wrong, I used to lose a day to it. Last week I clicked the number, walked the lineage upstream, found the failed Fivetran load in two minutes, and was back to work. It changed what a data incident costs us.”

Priya Raman

VP Data, Meridian Retail

Pricing

Priced by the warehouse, not per seat.

Connect one warehouse and map it free. Every plan includes full column-level lineage — the map is the product, never the upsell.

Map

For teams getting their warehouse in order.

$0/mo

Connect 1 warehouse
Column-level lineage graph
Full-text catalog search
Auto-generated docs
Up to 10 users · community support

Team

For data teams shipping changes every day.

$900/mo

Up to 5 warehouses & BI tools
Impact analysis on every pull request
Slack & GitHub integrations
Certification, ownership & freshness
Query-history popularity
Priority support & 99.9% SLA

Enterprise

For platform teams governing many warehouses at once.

Custom

Unlimited warehouses & sources
Single-tenant deployment in your VPC
SSO, SCIM & granular RBAC
Audit logs & data-access policies
Named data architect
Custom connectors & onboarding

The questions data teams ask before they connect.

Does Lineage read the data in my tables?

No. Lineage works entirely with metadata — schemas, query logs, and model definitions. It parses the SQL to build the lineage graph but never reads, copies, or stores the contents of your rows. Your actual data stays in your warehouse.

How is column-level lineage different from table-level?

Table-level lineage tells you table A feeds table B. Column-level lineage tells you that revenue_usd in B is built specifically from amount_cents in A, through the exact transforms in between. That precision is what lets impact analysis tell you which columns break, not just which tables are vaguely connected.
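The difference can be shown with a small sketch. The table names a and b and the columns amount_cents and revenue_usd follow the answer above; created_at, order_date, and the helper functions are assumptions added for illustration.

```python
# Hypothetical column-level edges: (upstream column, downstream column).
COLUMN_EDGES = [
    ("a.amount_cents", "b.revenue_usd"),
    ("a.created_at", "b.order_date"),
]


def table_of(column):
    """Strip the column name, leaving the table part of 'table.column'."""
    return column.rsplit(".", 1)[0]


def to_table_edges(column_edges):
    """Collapse column-level edges into table-level edges."""
    return {(table_of(up), table_of(down)) for up, down in column_edges}


def columns_fed_by(column, column_edges):
    """List the downstream columns built directly from `column`."""
    return [down for up, down in column_edges if up == column]


table_edges = to_table_edges(COLUMN_EDGES)
affected = columns_fed_by("a.created_at", COLUMN_EDGES)

print(table_edges)  # the table-level view: only that a feeds b
print(affected)     # the column-level view: which column in b is affected
```

Collapsed to tables, both edges become the same single fact, that a feeds b. The column-level view keeps enough detail to say that changing a.created_at touches b.order_date and leaves b.revenue_usd alone.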

How does impact analysis work in practice?

Lineage installs as a GitHub check. When you open a pull request that changes a model, it walks the downstream graph and comments with every dbt model, dashboard, and reverse-ETL sync that depends on what you changed — so you see the blast radius before you merge, not after a report goes blank.
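The downstream walk described above can be sketched as a breadth-first traversal of a dependency graph. This is an assumption about the general technique, not Lineage's actual code; the asset names, the edge list, and downstream_impact are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (asset, asset that reads from it).
DEPENDS = [
    ("stg_orders.amount_cents", "fct_orders.amount_usd"),
    ("fct_orders.amount_usd", "fct_revenue.revenue_usd"),
    ("fct_revenue.revenue_usd", "dashboard:finance_overview"),
    ("fct_revenue.revenue_usd", "sync:crm_account_revenue"),
    ("stg_orders.order_id", "fct_orders.order_id"),
]


def downstream_impact(changed, edges):
    """Breadth-first walk from the changed assets to everything downstream."""
    children = defaultdict(list)
    for source, dependent in edges:
        children[source].append(dependent)

    changed_set = set(changed)
    impacted = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # Skip anything already recorded so shared or cyclic paths end.
            if child not in impacted and child not in changed_set:
                impacted.add(child)
                queue.append(child)
    return sorted(impacted)


impacted = downstream_impact(["stg_orders.amount_cents"], DEPENDS)
for asset in impacted:
    print(asset)
```

In this toy graph, changing stg_orders.amount_cents reaches two models, one dashboard, and one sync, while fct_orders.order_id stays out of the result because nothing connects it to the changed column.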

Which tools does Lineage connect to?

Warehouses including Snowflake, BigQuery, Databricks, Redshift, and Postgres; transformation via dbt; and BI tools including Looker, Tableau, and Power BI, plus ingestion tools like Fivetran. Connection is read-only and the graph builds itself from your existing query history.

How long does it take to see value?

Most teams connect a warehouse and have a full column-level lineage graph within an hour. Because Lineage parses your existing query history, there's nothing to instrument and no migration — the map reflects how your data already flows from the first crawl.

What happens to lineage when our models change?

The graph re-parses as your warehouse changes — every new model, renamed column, and dropped table is picked up on the next crawl. You're never reading a stale map; the catalog reflects the SQL running in production today, not the day someone last documented it.

Is Lineage secure?

Lineage is SOC 2 Type II certified, connects with read-only, least-privilege credentials, and can run single-tenant inside your own VPC on the Enterprise plan. Because it only ever touches metadata, the sensitive contents of your warehouse never enter the platform.

Stop guessing where your data comes from.

Connect one warehouse, watch the lineage graph draw itself, and trace your first column to its source today. Read-only, metadata-only, and no sales call required to start.

Follow any column back to where it came from.