Helix
From FASTQ to finding

Helix is the analysis platform for modern genomics teams. Drop in raw reads, run validated pipelines on elastic compute, and query billions of variants in plain language — no cluster to babysit, no glue scripts to maintain.

  • CLIA / CAP-ready workflows
  • Reproducible to the commit
  • Your reads never leave your tenant
helix query
helix variants \
  --cohort breast-ca-2026 \
  --gene BRCA1 BRCA2 \
  --impact high \
  --af "< 0.001"

→ 1,284 carriers across 6,400 samples
→ 37 novel pathogenic candidates
→ cohort.report.html ready (4.2s)

Trusted in sequencing cores, biobanks, and translational labs

Helmsley Sequencing Core · Caldera Therapeutics · Northvale Biobank · Meridian Genomics · Lighthouse Oncology · Strand Diagnostics
The platform

One platform from sequencer to insight.

Stop gluing together aligners, callers, annotators, and a fragile cluster. Helix runs the whole arc — and keeps every step reproducible, versioned, and audit-ready.

Validated pipelines

Production-grade workflows for WGS, WES, RNA-seq, and panels — built on GATK and DeepVariant, benchmarked against GIAB truth sets, and pinned to a version so a run from last year reproduces byte-for-byte today.

Variant warehouse

Every called variant lands in a columnar store with genotypes, annotations, and sample metadata side by side — so a cross-cohort query that once meant a week of joins now returns in seconds.

Natural-language queries

Ask in plain English; Helix compiles it to a typed, reviewable query you can inspect, save, and rerun — no hand-written SQL, no ticket to bioinformatics.
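
To make "typed, reviewable query" concrete, here is a minimal Python sketch that models the filters from the `helix variants` example (genes, impact, allele-frequency cutoff) as a small typed object and applies it to in-memory records. The `Variant` and `VariantQuery` names, their fields, and the sample data are hypothetical illustrations, not Helix's actual query format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Variant:
    sample_id: str
    gene: str
    impact: str          # e.g. "high", "moderate", "low"
    allele_freq: float   # population allele frequency


@dataclass(frozen=True)
class VariantQuery:
    """Hypothetical typed form of a plain-language question."""
    genes: FrozenSet[str]
    impact: str
    max_allele_freq: float  # exclusive upper bound

    def matches(self, v: Variant) -> bool:
        return (
            v.gene in self.genes
            and v.impact == self.impact
            and v.allele_freq < self.max_allele_freq
        )


def run(query: VariantQuery, variants: Iterable[Variant]) -> List[Variant]:
    """Return only the variants that satisfy every filter in the query."""
    return [v for v in variants if query.matches(v)]


# "High-impact BRCA1/BRCA2 variants rarer than 0.1%" as a typed query.
query = VariantQuery(
    genes=frozenset({"BRCA1", "BRCA2"}),
    impact="high",
    max_allele_freq=0.001,
)

# Made-up records for illustration only.
variants = [
    Variant("S1", "BRCA1", "high", 0.0002),
    Variant("S2", "BRCA2", "high", 0.0100),      # too common
    Variant("S3", "TP53", "high", 0.0001),       # gene not in query
    Variant("S4", "BRCA2", "moderate", 0.0001),  # impact too low
]

hits = run(query, variants)
print([v.sample_id for v in hits])  # ['S1']
```

Because the query is plain data, it can be printed, diffed, saved, and rerun, which is the property the paragraph above describes.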

Elastic compute

Burst to thousands of cores for a batch, scale to zero when it's done. You pay per genome processed, not per node sitting idle overnight.

Auto-annotation

ClinVar, gnomAD, dbSNP, and Ensembl VEP applied on ingest and refreshed on a schedule — so the evidence behind a variant is current the day you query it, not the day you imported it.

What a run looks like at scale.

41 min
Median WGS turnaround
99.9%
SNV F1 vs. GIAB
12M+
Samples queryable
$3.90
Per-genome compute
Built for bioinformaticians

Reproducible by default, not by discipline.

Helix treats every analysis like code: versioned, containerized, and reviewable. The platform owns the infrastructure so your scientists spend their hours on biology, not on YAML and broken scheduler logs.

Workflows as code

Author in WDL or Nextflow, or fork a validated Helix pipeline. Every run is pinned to a container digest and a parameter set, so the same inputs always give the same outputs.
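
One way to picture pinning a run to a container digest and a parameter set is to hash everything that determines its output into one identifier. The Python sketch below does that. The `run_fingerprint` function, the parameter names, and the digests are placeholders invented for illustration, and treating input order as irrelevant is an assumption of this sketch, not a statement about how Helix identifies runs.

```python
import hashlib
import json
from typing import Dict, List


def run_fingerprint(
    container_digest: str,
    params: Dict[str, object],
    input_checksums: List[str],
) -> str:
    """Derive a stable ID from everything that determines a run's output."""
    payload = {
        "container": container_digest,
        "params": params,
        # Assumption: the set of inputs matters, not the order they were listed in.
        "inputs": sorted(input_checksums),
    }
    # sort_keys plus fixed separators give a canonical string, so the
    # order of keys in `params` does not change the fingerprint.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder values for illustration only.
digest_v1 = "sha256:" + "a" * 64
digest_v2 = "sha256:" + "b" * 64
inputs = ["md5:1111", "md5:2222"]

fp_a = run_fingerprint(digest_v1, {"min_mapq": 20, "ploidy": 2}, inputs)
fp_b = run_fingerprint(digest_v1, {"ploidy": 2, "min_mapq": 20}, list(reversed(inputs)))
fp_c = run_fingerprint(digest_v2, {"min_mapq": 20, "ploidy": 2}, inputs)

print(fp_a == fp_b)  # True: same container, parameters, and inputs
print(fp_a == fp_c)  # False: a different container digest is a different run
```

Two runs with the same fingerprint are expected to produce the same outputs. A changed digest or parameter shows up as a new fingerprint.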

Provenance on everything

Each result carries its full lineage — inputs, reference build, tool versions, and the exact command that produced it — ready to drop straight into a methods section.
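
As a sketch of what such a lineage record could hold, the Python below stores the four items named above (inputs, reference build, tool versions, command) and renders them as a methods-style sentence. The `Provenance` class, the tool names and versions, and the command string are hypothetical and are not taken from Helix's real output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Provenance:
    inputs: List[str]
    reference_build: str
    tool_versions: Dict[str, str]
    command: str

    def methods_sentence(self) -> str:
        """Render the record as one sentence for a methods section."""
        tools = ", ".join(
            f"{name} {version}"
            for name, version in sorted(self.tool_versions.items())
        )
        return (
            f"{len(self.inputs)} input file(s) were processed against "
            f"{self.reference_build} using {tools}."
        )


# Illustrative values only; tool names and versions are made up.
record = Provenance(
    inputs=["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"],
    reference_build="GRCh38",
    tool_versions={"example-caller": "2.3.1", "example-aligner": "1.0.0"},
    command="example-caller --reference GRCh38 --input sample1.bam",
)

sentence = record.methods_sentence()
print(sentence)
# 2 input file(s) were processed against GRCh38 using example-aligner 1.0.0, example-caller 2.3.1.
```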

Notebooks beside the data

Hosted JupyterLab and R sessions run in-tenant, right next to your variant warehouse: no exports, no copies, no data leaving the perimeter to land on someone's laptop.

API and CLI first

Everything in the UI is a typed REST endpoint and a single helix CLI command, so an analysis fits straight into your existing automation and CI.

Pipeline library

Open the catalog. Run it the same day.

Every workflow ships validated, version-pinned, and benchmarked — fork it, parameterize it, or run it untouched.

GATK · DeepVariant

Germline WGS

Align, call, and annotate a 30× whole genome against GRCh38 or T2T-CHM13, with SNV/indel F1 published per release.

Mutect2

Somatic tumor / normal

Paired calling with contamination and orientation-bias filtering, tuned for low-VAF variants in FFPE samples.

STAR · Salmon

RNA-seq expression

Quantify transcripts, call fusions, and land a normalized expression matrix straight in the warehouse.

Amplicon · hybrid

Targeted panels

UMI-aware deduplication and panel-specific QC for clinical and research targeted sequencing.

Cohort GVCF

Joint genotyping

Scale to population-level joint calls across tens of thousands of GVCFs without hand-managing shards.

PCA · relatedness

Cohort QC & ancestry

Sample contamination, sex checks, kinship, and ancestry inference in one pass before any downstream analysis.

Governance and trust

Compliance built in, not bolted on.

Genomic data is among the most sensitive data there is: it can identify a person and their relatives for life. Helix is engineered for the regulatory and privacy bar that clinical and research genomics demand.

Single-tenant by design

Your reads, variants, and notebooks live in an isolated tenant in the region you choose. Helix staff have no standing access to it.

End-to-end encryption

Encrypted in transit and at rest with customer-managed keys, so you hold the keys to your own genomes — not us.

Clinical-grade audit

Immutable, exportable audit logs for every access and every run — the paper trail an accreditor expects to see on day one.
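
The page does not say how the audit log is made immutable. One common technique for making a log tamper-evident is a hash chain, where each entry commits to the one before it. The Python sketch below shows that general idea under that assumption. The entry layout and event fields are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: Dict[str, str]) -> str:
    """Hash an event together with the hash of the entry before it."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: List[dict], event: Dict[str, str]) -> None:
    """Add an event, linking it to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)}
    )


def verify(log: List[dict]) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[dict] = []
append(log, {"actor": "analyst-1", "action": "run", "target": "germline-wgs"})
append(log, {"actor": "analyst-2", "action": "read", "target": "cohort-report"})

ok_before = verify(log)

# Simulate someone rewriting history in the first entry.
log[0]["event"]["actor"] = "someone-else"
ok_after = verify(log)

print(ok_before, ok_after)  # True False
```

A chain like this only detects tampering. Preventing it also needs write-once storage or an externally anchored head hash.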

Consent-aware access

Tie sample-level consent and IRB scope to access policy, so a query can only ever touch the data it's permitted to.
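
A minimal sketch of the idea in Python: each sample carries the uses its donor consented to, and a query declares its purpose before it sees any data. The `Sample` type, the purpose labels, and the `permitted` helper are assumptions made for illustration, not Helix's policy model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Sample:
    sample_id: str
    consented_uses: FrozenSet[str]  # purposes the donor agreed to


def permitted(samples: Iterable[Sample], purpose: str) -> List[Sample]:
    """Keep only samples whose consent covers the stated purpose."""
    return [s for s in samples if purpose in s.consented_uses]


# Made-up samples and purpose labels.
samples = [
    Sample("S1", frozenset({"cancer-research", "general-research"})),
    Sample("S2", frozenset({"cancer-research"})),
    Sample("S3", frozenset({"clinical-care"})),
]

cancer_ids = [s.sample_id for s in permitted(samples, "cancer-research")]
general_ids = [s.sample_id for s in permitted(samples, "general-research")]

print(cancer_ids)   # ['S1', 'S2']
print(general_ids)  # ['S1']
```

Filtering before the query runs, rather than on its results, is what keeps out-of-scope samples from ever being read.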

Residency you control

Pin storage and compute to a jurisdiction to help meet data-location obligations under frameworks such as GDPR, PHIPA, and HIPAA.

De-identification built in

Generate analysis-ready, de-identified cohorts in a click for sharing and secondary use, with the linkage held back in your tenant.
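
One way to picture "linkage held back in your tenant" is keyed pseudonymization: direct identifiers are replaced with an HMAC-derived ID, and both the key and the ID-to-sample table stay private. The Python sketch below illustrates that technique only. The function names, record fields, and key are hypothetical, and the page does not state which method Helix uses.

```python
import hashlib
import hmac
from typing import Dict, List, Tuple


def pseudonym(sample_id: str, tenant_key: bytes) -> str:
    """Derive a stable pseudonym that cannot be reversed without the key."""
    digest = hmac.new(tenant_key, sample_id.encode("utf-8"), hashlib.sha256)
    return "P-" + digest.hexdigest()[:16]


def deidentify(
    records: List[Dict[str, str]], tenant_key: bytes
) -> Tuple[List[Dict[str, str]], Dict[str, str]]:
    """Split records into a shareable cohort and a private linkage table."""
    shared: List[Dict[str, str]] = []
    linkage: Dict[str, str] = {}
    for rec in records:
        pid = pseudonym(rec["sample_id"], tenant_key)
        linkage[pid] = rec["sample_id"]  # stays inside the tenant
        shared.append({"participant": pid, "genotype": rec["genotype"]})
    return shared, linkage


# Made-up key and records for illustration only.
tenant_key = b"example-key-kept-inside-the-tenant"
records = [
    {"sample_id": "NV-0001", "name": "Jane Doe", "genotype": "0/1"},
    {"sample_id": "NV-0002", "name": "John Roe", "genotype": "0/0"},
]

shared, linkage = deidentify(records, tenant_key)
print(sorted(shared[0].keys()))  # ['genotype', 'participant']
```

This sketch only strips direct identifiers. Genotype data can itself be identifying, so real de-identification needs controls beyond it.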

From the labs

Teams ship discovery faster on Helix.

We retired a 14-step pipeline and a cluster nobody wanted to maintain. A whole-genome cohort that used to take our core three weeks now lands overnight, and every result is reproducible to the exact tool version.

Dr. Lena Vásquez
Director, Helmsley Sequencing Core

The natural-language query is the part our clinicians actually use. They ask for high-impact variants under a frequency cutoff and get a reviewable cohort back in seconds — no ticket to bioinformatics, no two-day wait.

Dr. Marcus Okonkwo
Head of Translational Genomics, Caldera Therapeutics

Single-tenant with our own keys was the only way our IRB would sign off. Helix gave us the audit trail and the residency controls out of the box, so the review took weeks instead of quarters.

Priya Raghunathan
Principal Investigator, Northvale Biobank
Pricing

Pay per genome, not per cluster.

Usage-based compute and storage with no idle-node tax. Start on a project, scale to a population.

Lab

For a single team running its first cohorts.

$0/mo
  • Up to 500 genomes / mo
  • Validated WGS, WES & panel pipelines
  • Variant warehouse + NL query
  • Hosted notebooks
  • Community support

Most popular

Core

For sequencing cores and translational programs.

$2,400/mo
  • Up to 25,000 genomes / mo
  • Custom WDL & Nextflow workflows
  • Customer-managed encryption keys
  • Choice of data residency
  • Priority support + onboarding

Population

For biobanks and clinical diagnostics at scale.

Custom
  • Unlimited genomes
  • Dedicated single-tenant region
  • CLIA / CAP validation support
  • SSO, SCIM & advanced audit
  • Named genomics solutions architect

Questions from the bench.

Which assays and references does Helix support?

Whole-genome, whole-exome, RNA-seq, and targeted panels out of the box, against GRCh38 and T2T-CHM13. Pipelines are built on GATK and DeepVariant and benchmarked against Genome in a Bottle truth sets, with published precision and recall for every release.

How does Helix keep results reproducible?

Every run is pinned to a container digest, a reference build, and an exact parameter set, and each result carries its full provenance. A pipeline version from a year ago reproduces the same output today — that's the contract, not a best effort.

Where does our genomic data actually live?

In a single-tenant environment in the cloud region you choose, encrypted at rest with keys you manage. Helix staff have no standing access to your reads, variants, or notebooks, and every access is written to an immutable audit log.

Is Helix suitable for clinical use?

Helix provides the controls a clinical lab needs — validated workflows, residency, consent-aware access, and accreditor-ready audit trails — and we support your CLIA and CAP validation. Helix is platform infrastructure; clinical interpretation and reporting remain the responsibility of your licensed laboratory.

Can we bring our own pipelines and tools?

Yes. Author in WDL or Nextflow, fork a validated Helix workflow, or run custom containers. Everything is reachable through a typed REST API and the helix CLI, so it drops into your existing automation.

Put your genomes to work.

Request access and run your first cohort on validated pipelines this week. Bring your reads — we'll handle the infrastructure.