scholl-lab/sm-calling

Overview

Latest release: None, Last update: 2026-02-13

Linting: failed, Formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via Conda. It is recommended to install Conda via Miniforge. Run

conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via

conda activate snakemake

For other installation methods, refer to the Snakemake and Snakedeploy documentation.

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside of that directory. Then run

snakedeploy deploy-workflow https://github.com/scholl-lab/sm-calling . --tag None

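Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains configuration files, which you will modify in the next step to configure the workflow to your needs. Because this workflow has no published release yet (see Overview), the value None passed to --tag above is a placeholder: replace it with an existing tag, or use the --branch option with a branch name instead.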

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs, following the instructions in the Configuration section below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the docs.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

config.yaml

Unified pipeline configuration. All paths are relative to the repository root.

Required Fields

| Field | Type | Description |
| --- | --- | --- |
| caller | string | "mutect2", "freebayes", or "all" |
| ref.genome | string | Path to reference genome FASTA |
| ref.build | string | "GRCh37" or "GRCh38" |
| paths.samples | string | Path to samples TSV |
| paths.bam_folder | string | Directory with input BAMs |
| paths.output_folder | string | Root output directory |
| bam.file_extension | string | BAM file suffix (e.g., .merged.dedup.bqsr.bam) |
| scatter.mode | string | "chromosome", "interval", or "none" |

Required for Mutect2

| Field | Description |
| --- | --- |
| gatk_resources.panel_of_normals | Path to Panel of Normals VCF |
| gatk_resources.af_only_gnomad | Path to gnomAD allele frequency VCF |
| gatk_resources.common_biallelic_gnomad | Path to gnomAD common biallelic VCF |
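As a rough illustration of the required fields listed above, the following sketch checks an already-parsed configuration mapping. It is not part of the workflow: `check_required` and `lookup` are hypothetical helpers, the example paths are placeholders, and it assumes that dotted names such as ref.genome correspond to nested keys in config.yaml and that the gatk_resources entries are needed when caller is "mutect2" or "all".

```python
# Hypothetical helper, not part of sm-calling: checks the required fields
# from the tables above on an already-parsed config mapping.

REQUIRED = {
    "caller": {"mutect2", "freebayes", "all"},
    "ref.genome": None,  # None means "any non-empty string"
    "ref.build": {"GRCh37", "GRCh38"},
    "paths.samples": None,
    "paths.bam_folder": None,
    "paths.output_folder": None,
    "bam.file_extension": None,
    "scatter.mode": {"chromosome", "interval", "none"},
}

MUTECT2_RESOURCES = [
    "gatk_resources.panel_of_normals",
    "gatk_resources.af_only_gnomad",
    "gatk_resources.common_biallelic_gnomad",
]


def lookup(config, dotted_key):
    """Follow a dotted key such as 'ref.genome' through nested dicts."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_required(config):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key, allowed in REQUIRED.items():
        value = lookup(config, key)
        if not isinstance(value, str) or value == "":
            problems.append(f"{key}: missing or not a non-empty string")
        elif allowed is not None and value not in allowed:
            problems.append(f"{key}: {value!r} not in {sorted(allowed)}")
    # Assumption: Mutect2 resources are needed whenever Mutect2 will run.
    if lookup(config, "caller") in {"mutect2", "all"}:
        for key in MUTECT2_RESOURCES:
            value = lookup(config, key)
            if not isinstance(value, str) or value == "":
                problems.append(f"{key}: required for Mutect2")
    return problems


# Placeholder values for illustration only.
example_config = {
    "caller": "mutect2",
    "ref": {"genome": "resources/genome.fa", "build": "GRCh38"},
    "paths": {
        "samples": "config/samples.tsv",
        "bam_folder": "results/bams",
        "output_folder": "results",
    },
    "bam": {"file_extension": ".merged.dedup.bqsr.bam"},
    "scatter": {"mode": "chromosome"},
    "gatk_resources": {
        "panel_of_normals": "resources/pon.vcf.gz",
        "af_only_gnomad": "resources/af-only-gnomad.vcf.gz",
        "common_biallelic_gnomad": "resources/common-biallelic.vcf.gz",
    },
}

for problem in check_required(example_config):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets all configuration mistakes be reported in a single pass.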

Mutect2 Parameters (params.mutect2)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| genotype_germline_sites | boolean | true | Emit germline sites (required for PureCN) |
| genotype_pon_sites | boolean | true | Emit PoN sites (required for PureCN) |
| annotations | array of strings | [] | Extra --annotation flags for Mutect2 |
| annotation_groups | array of strings | [] | Extra --annotation-group flags for Mutect2 |
| extra | string | "" | Passthrough for any other Mutect2 flags |
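To make the mapping from these parameters to command-line arguments concrete, here is a minimal sketch of how a params.mutect2 mapping could be expanded into Mutect2 flags. It follows the table rather than the workflow's actual rule code, which may assemble the command differently; `mutect2_args` is a hypothetical name and the annotation used in the example call is only illustrative.

```python
import shlex

# Defaults copied from the params.mutect2 table above.
MUTECT2_DEFAULTS = {
    "genotype_germline_sites": True,
    "genotype_pon_sites": True,
    "annotations": [],
    "annotation_groups": [],
    "extra": "",
}


def mutect2_args(params=None):
    """Expand a params.mutect2 mapping into a list of Mutect2 arguments.

    Assumption: this mirrors the table, not the workflow's rule code.
    """
    merged = {**MUTECT2_DEFAULTS, **(params or {})}
    args = []
    if merged["genotype_germline_sites"]:
        args += ["--genotype-germline-sites", "true"]
    if merged["genotype_pon_sites"]:
        args += ["--genotype-pon-sites", "true"]
    # One --annotation flag per list entry.
    for name in merged["annotations"]:
        args += ["--annotation", name]
    # One --annotation-group flag per list entry.
    for group in merged["annotation_groups"]:
        args += ["--annotation-group", group]
    # 'extra' is a free-form string; split it the way a shell would.
    args += shlex.split(merged["extra"])
    return args


print(" ".join(mutect2_args({"annotations": ["StrandBiasBySample"]})))
```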

QC Parameters (params.bcftools_stats)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| extra | string | "" | Extra flags for bcftools stats |

PureCN Settings (purecn)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable PureCN copy number analysis |
| genome | string | "hg38" | PureCN genome identifier ("hg19" or "hg38") |
| intervals_bed | string | "" | BED file with capture bait coordinates |
| normaldb | string | "" | Pre-built normalDB.rds (skip NormalDB.R if provided) |
| mapping_bias | string | "" | Pre-built mapping_bias.rds |
| snp_blacklist | string | "" | Optional simple repeats BED for SNP filtering |
| extra | string | "" | Extra PureCN.R arguments |
| seed | integer | 123 | Random seed for PureCN |
| postoptimize | boolean | true | Run PureCN post-optimization |
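The PureCN settings interact with two other parts of the configuration: purecn.genome should agree with ref.build, and the Mutect2 options marked "required for PureCN" must stay enabled. The sketch below shows one way such a cross-check might look. `purecn_problems` is a hypothetical helper, and whether the workflow performs a comparable check itself is not stated in the source; the GRCh37/hg19 and GRCh38/hg38 pairing is the conventional naming correspondence.

```python
# Conventional correspondence between assembly names and UCSC-style identifiers.
BUILD_TO_PURECN_GENOME = {"GRCh37": "hg19", "GRCh38": "hg38"}


def purecn_problems(config):
    """Return a list of PureCN-related inconsistencies in a parsed config."""
    purecn = config.get("purecn", {})
    if not purecn.get("enabled", False):  # default from the table: false
        return []

    problems = []
    build = config.get("ref", {}).get("build")
    genome = purecn.get("genome", "hg38")  # default from the table
    expected = BUILD_TO_PURECN_GENOME.get(build)
    if expected is None:
        problems.append(f"ref.build {build!r} is not GRCh37 or GRCh38")
    elif genome != expected:
        problems.append(
            f"purecn.genome is {genome!r} but ref.build {build!r} implies {expected!r}"
        )

    # Both options default to true and are described as required for PureCN.
    mutect2 = config.get("params", {}).get("mutect2", {})
    for flag in ("genotype_germline_sites", "genotype_pon_sites"):
        if not mutect2.get(flag, True):
            problems.append(f"params.mutect2.{flag} must be true for PureCN")
    return problems


# GRCh37 reference with the default purecn.genome ("hg38") is flagged.
print(purecn_problems({"ref": {"build": "GRCh37"}, "purecn": {"enabled": True}}))
```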

samples.tsv

Tab-separated sample metadata. Required columns:

| Column | Description |
| --- | --- |
| sample | Unique sample/analysis identifier |
| tumor_bam | BAM file basename (without extension) |
| normal_bam | Matched normal BAM basename, or . if none |
| analysis_type | tumor_only, tumor_normal, or germline |
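For illustration, a sample sheet with these columns could be read and checked as sketched below. The rows are made up, `read_samples` is a hypothetical helper rather than the workflow's own loader, and the way BAM paths are composed (paths.bam_folder, then the basename, then bam.file_extension) is an assumption based on the field descriptions above.

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "tumor_bam", "normal_bam", "analysis_type"]
ANALYSIS_TYPES = {"tumor_only", "tumor_normal", "germline"}

# Made-up rows for illustration only.
EXAMPLE_TSV = (
    "sample\ttumor_bam\tnormal_bam\tanalysis_type\n"
    "P1_paired\tP1_tumor\tP1_normal\ttumor_normal\n"
    "P2_tumor\tP2_tumor\t.\ttumor_only\n"
)


def read_samples(handle, bam_folder, file_extension):
    """Parse a samples.tsv handle into a dict keyed by sample identifier."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    samples = {}
    for row in reader:
        name = row["sample"]
        if name in samples:
            raise ValueError(f"duplicate sample identifier: {name}")
        if row["analysis_type"] not in ANALYSIS_TYPES:
            raise ValueError(f"{name}: unknown analysis_type {row['analysis_type']!r}")

        # A single dot means "no matched normal".
        normal_path = None
        if row["normal_bam"] != ".":
            normal_path = f"{bam_folder}/{row['normal_bam']}{file_extension}"

        samples[name] = {
            # Assumed path layout: <bam_folder>/<basename><file_extension>
            "tumor_bam": f"{bam_folder}/{row['tumor_bam']}{file_extension}",
            "normal_bam": normal_path,
            "analysis_type": row["analysis_type"],
        }
    return samples


samples = read_samples(io.StringIO(EXAMPLE_TSV), "results/bams", ".merged.dedup.bqsr.bam")
for name, info in samples.items():
    print(name, info["analysis_type"], info["tumor_bam"], info["normal_bam"])
```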

Linting and formatting

Linting results

Using workflow specific profile profiles/default for setting default command line arguments.
Lints for snakefile /tmp/tmpyoswbyd0/workflow/rules/freebayes.smk:
    * Mixed rules and functions in same snakefile.:
      Small one-liner functions used only once should be defined as lambda
      expressions. Other functions should be collected in a common module, e.g.
      'rules/common.smk'. This makes the workflow steps more readable.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/modularization.html#includes

Lints for snakefile /tmp/tmpyoswbyd0/workflow/rules/purecn.smk:
    * Mixed rules and functions in same snakefile.:
      Small one-liner functions used only once should be defined as lambda
      expressions. Other functions should be collected in a common module, e.g.
      'rules/common.smk'. This makes the workflow steps more readable.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/modularization.html#includes

Lints for rule merge_freebayes_vcfs (line 57, /tmp/tmpyoswbyd0/workflow/rules/freebayes.smk):
    * Shell command directly uses variable REF from outside of the rule:
      It is recommended to pass all files as input and output, and non-file
      parameters via the params directive. Otherwise, provenance tracking is
      less accurate.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules

Formatting results

[DEBUG]
[DEBUG]
[DEBUG]
[DEBUG]
[DEBUG]
[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/common.smk":  Formatted content is different from original
[DEBUG]
[ERROR] In file "/tmp/tmpyoswbyd0/workflow/rules/mutect2.smk":  IndexError: pop from empty list
[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/mutect2.smk":
[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/purecn.smk":  Formatted content is different from original
[INFO] 1 file(s) raised parsing errors 🤕
[INFO] 2 file(s) would be changed 😬
[INFO] 4 file(s) would be left unchanged 🎉

snakefmt version: 0.11.4