Helix is the analysis platform for modern genomics teams. Drop in raw reads, run validated pipelines on elastic compute, and query billions of variants in plain language — no cluster to babysit, no glue scripts to maintain.
helix variants \
--cohort breast-ca-2026 \
--gene BRCA1 BRCA2 \
--impact high \
--af "< 0.001"
→ 1,284 carriers across 6,400 samples
→ 37 novel pathogenic candidates
→ cohort.report.html ready (4.2s)Trusted in sequencing cores, biobanks, and translational labs
Stop gluing together aligners, callers, annotators, and a fragile cluster. Helix runs the whole arc — and keeps every step reproducible, versioned, and audit-ready.
Production-grade workflows for WGS, WES, RNA-seq, and panels — built on GATK and DeepVariant, benchmarked against GIAB truth sets, and pinned to a version so a run from last year reproduces byte-for-byte today.
Every called variant lands in a columnar store with genotypes, annotations, and sample metadata side by side — so a cross-cohort query that once meant a week of joins now returns in seconds.
Ask in plain English; Helix compiles it to a typed, reviewable query you can inspect, save, and rerun — no hand-written SQL, no ticket to bioinformatics.
Burst to thousands of cores for a batch, scale to zero when it's done. You pay per genome processed, not per node sitting idle overnight.
ClinVar, gnomAD, dbSNP, and Ensembl VEP applied on ingest and refreshed on a schedule — so the evidence behind a variant is current the day you query it, not the day you imported it.
What a run looks like at scale.
Helix treats every analysis like code: versioned, containerized, and reviewable. The platform owns the infrastructure so your scientists spend their hours on biology, not on YAML and broken scheduler logs.
Author in WDL or Nextflow, or fork a validated Helix pipeline. Every run is pinned to a container digest and a parameter set, so the same inputs always give the same outputs.
Each result carries its full lineage — inputs, reference build, tool versions, and the exact command that produced it — ready to drop straight into a methods section.
Hosted JupyterLab and R sessions run in-tenant, right next to your variant warehouse — no exports, no copies, no data leaving the perimeter to land in someone's laptop.
Everything in the UI is a typed REST endpoint and a single helix CLI command, so an analysis fits straight into your existing automation and CI.
Every workflow ships validated, version-pinned, and benchmarked — fork it, parameterize it, or run it untouched.
Align, call, and annotate a 30× whole genome against GRCh38 or T2T-CHM13, with SNV/indel F1 published per release.
Paired calling with contamination and orientation-bias filtering, tuned for low-VAF variants in FFPE samples.
Quantify transcripts, call fusions, and land a normalized expression matrix straight in the warehouse.
UMI-aware deduplication and panel-specific QC for clinical and research targeted sequencing.
Scale to population-level joint calls across tens of thousands of GVCFs without hand-managing shards.
Sample contamination, sex checks, kinship, and ancestry inference in one pass before any downstream analysis.
Genomic data is the most sensitive data there is — it identifies a person and their relatives for life. Helix is engineered for the regulatory and privacy bar that clinical and research genomics demand.
Your reads, variants, and notebooks live in an isolated tenant in the region you choose. Helix staff have no standing access to it.
Encrypted in transit and at rest with customer-managed keys, so you hold the keys to your own genomes — not us.
Immutable, exportable audit logs for every access and every run — the paper trail an accreditor expects to see on day one.
Tie sample-level consent and IRB scope to access policy, so a query can only ever touch the data it's permitted to.
Pin storage and compute to a jurisdiction to meet GDPR, PHIPA, and HIPAA data-residency requirements.
Generate analysis-ready, de-identified cohorts in a click for sharing and secondary use, with the linkage held back in your tenant.
“We retired a 14-step pipeline and a cluster nobody wanted to maintain. A whole-genome cohort that used to take our core three weeks now lands overnight, and every result is reproducible to the exact tool version.”
“The natural-language query is the part our clinicians actually use. They ask for high-impact variants under a frequency cutoff and get a reviewable cohort back in seconds — no ticket to bioinformatics, no two-day wait.”
“Single-tenant with our own keys was the only way our IRB would sign off. Helix gave us the audit trail and the residency controls out of the box, so the review took weeks instead of quarters.”
Usage-based compute and storage with no idle-node tax. Start on a project, scale to a population.
For a single team running its first cohorts.
For sequencing cores and translational programs.
For biobanks and clinical diagnostics at scale.
Whole-genome, whole-exome, RNA-seq, and targeted panels out of the box, against GRCh38 and T2T-CHM13. Pipelines are built on GATK and DeepVariant and benchmarked against Genome in a Bottle truth sets, with published precision and recall for every release.
Every run is pinned to a container digest, a reference build, and an exact parameter set, and each result carries its full provenance. A pipeline version from a year ago reproduces the same output today — that's the contract, not a best effort.
In a single-tenant environment in the cloud region you choose, encrypted at rest with keys you manage. Helix staff have no standing access to your reads, variants, or notebooks, and every access is written to an immutable audit log.
Helix provides the controls a clinical lab needs — validated workflows, residency, consent-aware access, and accreditor-ready audit trails — and we support your CLIA and CAP validation. Helix is platform infrastructure; clinical interpretation and reporting remain the responsibility of your licensed laboratory.
Yes. Author in WDL or Nextflow, fork a validated Helix workflow, or run custom containers. Everything is reachable through a typed REST API and the helix CLI, so it drops into your existing automation.
Request access and run your first cohort on validated pipelines this week. Bring your reads — we'll handle the infrastructure.