fg-labs/divref-wf

Snakemake workflow to create a DivRef-style resource

This workflow is inspired by the DivRef repository, which generates a bundle of FASTA sequences and a corresponding DuckDB index of common human variation.

The original implementation consists of a set of standalone Python scripts and a Makefile.

This implementation:

  1. Wraps the Python scripts in a toolkit with added typing, improved parameterization, and added unit testing.
  2. Adds a Snakemake workflow and associated configuration to drive the resource generation process.

Set up Environment

The environment for this analysis is managed using pixi. Follow the developer instructions to install pixi.

Running pixi install for the first time creates the environment and installs its dependencies.

To enable access to Hail tables via the GCS Connector, run pixi run setup-gcs.

You will need to log in to GCS before running any of the Hail-dependent tools.

gcloud auth application-default login
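
Before launching a long Hail-dependent step, it can help to check that the login above left credentials behind. The following is a minimal sketch, not part of this toolkit: the helper names `adc_path` and `has_adc` are hypothetical, and it assumes the Hail-dependent tools rely on Application Default Credentials stored in the standard gcloud location (or in the file named by `GOOGLE_APPLICATION_CREDENTIALS`). It only checks that the file exists, not that the credentials are still valid.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ADC_FILENAME = "application_default_credentials.json"


def adc_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the path where Application Default Credentials are expected.

    Hypothetical helper for illustration; not part of the divref toolkit.
    """
    env = os.environ if env is None else env

    # An explicitly configured credentials file takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)

    # CLOUDSDK_CONFIG relocates the gcloud configuration directory when set.
    config_dir = env.get("CLOUDSDK_CONFIG")
    if config_dir:
        base = Path(config_dir)
    elif os.name == "nt":
        base = Path(env.get("APPDATA", "")) / "gcloud"
    else:
        base = Path.home() / ".config" / "gcloud"
    return base / ADC_FILENAME


def has_adc(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if a credentials file exists at the expected path."""
    return adc_path(env).is_file()


if __name__ == "__main__":
    if has_adc():
        print(f"Found credentials at {adc_path()}")
    else:
        print("No credentials found; run: gcloud auth application-default login")
```

Run as a script, it prints either the path of the credentials file it found or a reminder to run the login command.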

Source Data

gnomAD 3.1.2 HGDP+1KG individual-level genotypes and sample metadata

Data description
