r1cheu/imputation

imputation pipeline for low-coverage sequencing in hybrid rice

Overview

Latest release: None, Last update: 2026-05-20

Share link: https://snakemake.github.io/snakemake-workflow-catalog?wf=r1cheu/imputation

Quality control: linting: failed formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Conda package manager. It is recommended to install conda via Miniforge. Run

conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via

conda activate snakemake

For other installation methods, refer to the Snakemake and Snakedeploy documentation.

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

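In all following steps, we will assume that you are inside of that directory. The workflow has no tagged release yet (see "Latest release" above), so the --tag value in the command below is a placeholder: replace it with an existing tag, or use the --branch option with the branch you want to deploy. Then run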

snakedeploy deploy-workflow https://github.com/r1cheu/imputation . --tag None

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.

Step 3: Configure workflow

To configure the workflow, adapt config/config.yml to your needs following the instructions below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow using apptainer/singularity, use

snakemake --cores all --sdm apptainer

To run the workflow using a combination of conda and apptainer/singularity for software deployment, use

snakemake --cores all --sdm conda apptainer

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
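
The three invocations above differ only in the values passed to --sdm. As a minimal sketch, the Python helper below assembles the argument list for a chosen combination. The function name build_snakemake_command is hypothetical and not part of Snakemake; it only builds the command and does not run it, and it accepts only the two deployment methods shown in this section.

```python
from typing import Sequence

# Deployment methods shown in this section; other values are rejected here
# purely as an assumption of this sketch.
ALLOWED_METHODS = {"conda", "apptainer"}


def build_snakemake_command(sdm: Sequence[str], cores: str = "all") -> list[str]:
    """Assemble a snakemake argument list for the given deployment methods."""
    unknown = [method for method in sdm if method not in ALLOWED_METHODS]
    if unknown:
        raise ValueError(f"unsupported deployment method(s): {unknown}")
    if not sdm:
        raise ValueError("at least one deployment method is required")
    # --sdm takes one or more values, e.g. "--sdm conda apptainer".
    return ["snakemake", "--cores", cores, "--sdm", *sdm]


if __name__ == "__main__":
    for methods in (["apptainer"], ["conda", "apptainer"], ["conda"]):
        print(" ".join(build_snakemake_command(methods)))
```

The resulting list can be passed to subprocess.run from inside the project working directory if you prefer to launch the workflow from a script.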

For further options such as cluster and cloud execution, see the Snakemake documentation.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

config.yaml

| Key | Meaning |
| --- | --- |
| sample_sheet | TSV listing samples (see below) |
| reference.fasta | Reference genome FASTA (e.g. IRGSP-1.0). Indices are built by the workflow. |
| panel.vcf | Whole-genome phased reference panel VCF, bgzipped. Will be split per chromosome. |
| chromosomes | List of chromosome names. Must match both reference and panel. |
| genetic_map.template | Per-chromosome genetic map path template, e.g. resources/maps/{chrom}.gmap. Format expected by GLIMPSE2: pos chr cM. |
| glimpse2_chunk.window_mb | GLIMPSE2_chunk --window-mb (default 4.0) |
| glimpse2_chunk.buffer_mb | GLIMPSE2_chunk --buffer-mb (default 0.5) |
| glimpse2_chunk.extra | Extra flags passed verbatim to GLIMPSE2_chunk |
| fastp.extra | Extra flags passed verbatim to fastp (e.g. quality cutoffs) |

Threads and memory per rule are hard-coded in workflow/rules/*.smk (tuned for ~600 cores / ~5000 samples). To override them, edit the resources: blocks in those files.
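
To illustrate how a {chrom} path template such as genetic_map.template relates to the chromosomes list, here is a minimal Python sketch that expands the template once per chromosome. The helper name expand_genetic_maps and the chromosome names chr01 and chr02 are assumptions for illustration; the workflow itself may resolve the template differently (for example through Snakemake wildcards).

```python
def expand_genetic_maps(template: str, chromosomes: list[str]) -> dict[str, str]:
    """Map each chromosome name to the genetic map path derived from the template."""
    if "{chrom}" not in template:
        raise ValueError("template must contain the {chrom} placeholder")
    # str.replace is used instead of str.format so that any other braces
    # in the path are left untouched.
    return {chrom: template.replace("{chrom}", chrom) for chrom in chromosomes}


if __name__ == "__main__":
    # Hypothetical chromosome names; use the names from your reference and panel.
    maps = expand_genetic_maps("resources/maps/{chrom}.gmap", ["chr01", "chr02"])
    for chrom, path in maps.items():
        print(chrom, path)
```

Checking that every expanded path exists before starting a run is a cheap way to catch a chromosome name that does not match the map files.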

samples.tsv

Tab-separated, one sample per row.

| Column | Required | Meaning |
| --- | --- | --- |
| sample | yes | unique sample id (used as RG ID/SM) |
| platform | yes | sequencing platform string (RG PL), e.g. ILLUMINA |
| fq1 | yes | path to read 1 fastq.gz |
| fq2 | yes | path to read 2 fastq.gz |
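
A sample sheet with the four columns above can be checked before a run. The following Python sketch reads a tab-separated sheet, verifies that all required columns are present and non-empty, and rejects duplicate sample ids. The function name validate_sample_sheet and the example paths are hypothetical; this check is not part of the workflow.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "platform", "fq1", "fq2")


def validate_sample_sheet(handle) -> list[dict[str, str]]:
    """Return the rows of a tab-separated sample sheet, or raise ValueError."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")

    rows = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED_COLUMNS:
            # DictReader fills short rows with None, so guard against that too.
            if not (row[col] or "").strip():
                raise ValueError(f"line {line_no}: empty value in column {col!r}")
        if row["sample"] in seen:
            raise ValueError(f"line {line_no}: duplicate sample id {row['sample']!r}")
        seen.add(row["sample"])
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical sheet content for illustration only.
    example = io.StringIO(
        "sample\tplatform\tfq1\tfq2\n"
        "S1\tILLUMINA\tdata/S1_R1.fastq.gz\tdata/S1_R2.fastq.gz\n"
    )
    print(validate_sample_sheet(example))
```

To check a real file, open it with open("config/samples.tsv", newline="") and pass the handle to the function.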

Workflow parameters

The following table is automatically parsed from the workflow’s config.schema.y(a)ml file.

| Parameter | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| sample_sheet | string | | yes | config/samples.tsv |
| reference | | | yes | |
| . fasta | string | path to the reference genome FASTA | yes | |
| panel | | | yes | |
| . full_template | string | per-chrom full-GT panel BCF template, e.g. “reference/panel_{chrom}.bcf” | yes | |
| . sites_tsv_template | string | per-chrom sites TSV (CHROM\tPOS\tREF,ALT) bgzipped with .tbi alongside | yes | |
| chromosomes | array | chromosome names matching the reference and panel | yes | |
| genetic_map | | | yes | |
| . template | string | per-chromosome genetic map path template, e.g. “maps/{chrom}.gmap” | yes | |
| glimpse_chunk | | | yes | |
| . window_size | integer | | yes | 2000000 |
| . buffer_size | integer | | yes | 200000 |
| . extra | string | | | |
| fastp | | | | |
| . extra | string | | | |
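
The required keys and defaults in the schema table can be mirrored in a few lines of Python, which may help when preparing a config by hand. This is only a sketch under the assumption that the table above is complete; the names with_defaults and missing_keys are hypothetical, and in a real run the workflow's own schema validation is the authority.

```python
import copy

# Defaults taken from the "Default" column of the table above.
DEFAULTS = {
    "sample_sheet": "config/samples.tsv",
    "glimpse_chunk": {"window_size": 2000000, "buffer_size": 200000},
}

# Key paths marked "yes" in the "Required" column.
REQUIRED = [
    ("sample_sheet",),
    ("reference", "fasta"),
    ("panel", "full_template"),
    ("panel", "sites_tsv_template"),
    ("chromosomes",),
    ("genetic_map", "template"),
    ("glimpse_chunk", "window_size"),
    ("glimpse_chunk", "buffer_size"),
]


def with_defaults(config: dict) -> dict:
    """Return a copy of config with the documented defaults filled in."""
    merged = copy.deepcopy(config)
    merged.setdefault("sample_sheet", DEFAULTS["sample_sheet"])
    chunk = merged.setdefault("glimpse_chunk", {})
    for key, value in DEFAULTS["glimpse_chunk"].items():
        chunk.setdefault(key, value)
    return merged


def missing_keys(config: dict) -> list[str]:
    """List required key paths (dotted) that are absent from config."""
    missing = []
    for path in REQUIRED:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


if __name__ == "__main__":
    # Hypothetical partial config for illustration.
    cfg = with_defaults({"reference": {"fasta": "ref.fa"}})
    print(missing_keys(cfg))
```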

Linting and formatting

Linting results
Workflow defines that rule bwa_mem2_index is eligible for caching between workflows (use the --cache argument to enable this).
Workflow defines that rule samtools_faidx is eligible for caching between workflows (use the --cache argument to enable this).
Lints for rule bwa_mem2_index (line 1, /tmp/tmpsxbngvh8/workflow/rules/reference.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule samtools_faidx (line 23, /tmp/tmpsxbngvh8/workflow/rules/reference.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule fastp_trim (line 1, /tmp/tmpsxbngvh8/workflow/rules/trim.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule bwa_align_dedup (line 1, /tmp/tmpsxbngvh8/workflow/rules/align.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule compute_gl (line 1, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule concat_gl (line 26, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule merge_gl (line 43, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule glimpse_chunk (line 61, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule glimpse_phase (line 83, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule glimpse_ligate (line 108, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers

Lints for rule concat_imputed (line 130, /tmp/tmpsxbngvh8/workflow/rules/imputation.smk):
    * Specify a conda environment or container for each rule.:
      This way, the used software for each specific step is documented, and the
      workflow can be executed on any machine without prerequisites.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#integrated-package-management
      https://snakemake.readthedocs.io/en/latest/snakefiles/deployment.html#running-jobs-in-containers
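
Because the same lint is reported once per rule, it can be easier to read the output grouped by rule file. The following Python sketch extracts the "Lints for rule ..." headers from lint output like the listing above and groups the flagged rule names by file. The function name rules_by_file is hypothetical, and the regular expression assumes the header format shown here.

```python
import re
from collections import defaultdict

# Matches headers such as:
#   Lints for rule fastp_trim (line 1, /tmp/.../workflow/rules/trim.smk):
LINT_HEADER = re.compile(r"Lints for rule (\S+) \(line (\d+), (.+)\):\s*$")


def rules_by_file(lint_output: str) -> dict[str, list[str]]:
    """Group the names of linted rules by the file that defines them."""
    grouped = defaultdict(list)
    for line in lint_output.splitlines():
        match = LINT_HEADER.search(line)
        if match:
            rule, _line_no, path = match.groups()
            # Keep only the file name; the temporary directory prefix varies per run.
            grouped[path.rsplit("/", 1)[-1]].append(rule)
    return dict(grouped)


if __name__ == "__main__":
    sample = (
        "Lints for rule bwa_mem2_index (line 1, /tmp/x/workflow/rules/reference.smk):\n"
        "    * Specify a conda environment or container for each rule.:\n"
        "Lints for rule samtools_faidx (line 23, /tmp/x/workflow/rules/reference.smk):\n"
    )
    print(rules_by_file(sample))
```

For the listing above, every rule is flagged for the same reason: no conda environment or container is specified.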
Formatting results
[DEBUG] 
[DEBUG] In file "/tmp/tmpsxbngvh8/workflow/rules/common.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] 
[DEBUG] 
[DEBUG] 
[DEBUG] 
[INFO] 1 file(s) would be changed 😬
[INFO] 5 file(s) would be left unchanged 🎉

snakefmt version: 0.11.5