scholl-lab/sm-calling


Overview

Latest release: None, Last update: 2026-02-13

Share link: https://snakemake.github.io/snakemake-workflow-catalog?wf=scholl-lab/sm-calling

Quality control: linting: failed, formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Conda package manager. It is recommended to install conda via Miniforge. Run

conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands, ensure that this environment is activated via

conda activate snakemake

For other installation methods, refer to the Snakemake and Snakedeploy documentation.

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

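In all following steps, we assume that you are inside that directory. Note that this workflow has no published release (Latest release: None), so the `None` in the command below is a placeholder rather than an existing tag; substitute a real tag, or use `--branch` with a branch name instead of `--tag`. Then run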

snakedeploy deploy-workflow https://github.com/scholl-lab/sm-calling . --tag None

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.

Step 3: Configure workflow

Edit the configuration files in the config folder to match your data, as described in the Configuration section below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the Snakemake documentation.

Step 5: Generate report

Once the analysis has finished, you can generate an interactive HTML report that presents the results together with the parameters and code used to produce them, for inspection in the browser, using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.


`config.yaml`

Unified pipeline configuration. All paths are relative to the repository root.

Required Fields

Field	Type	Description
`caller`	string	`"mutect2"`, `"freebayes"`, or `"all"`
`ref.genome`	string	Path to reference genome FASTA
`ref.build`	string	`"GRCh37"` or `"GRCh38"`
`paths.samples`	string	Path to samples TSV
`paths.bam_folder`	string	Directory with input BAMs
`paths.output_folder`	string	Root output directory
`bam.file_extension`	string	BAM file suffix (e.g., `.merged.dedup.bqsr.bam`)
`scatter.mode`	string	`"chromosome"`, `"interval"`, or `"none"`
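
As an illustration of the required fields above, the following sketch writes a minimal example configuration. It assumes that dotted field names such as `ref.genome` correspond to nested YAML keys, which the table suggests but does not state. The file name `config/config.example.yaml` and all paths inside it are hypothetical placeholders; only the field names and allowed values come from the table.

```shell
# Sketch: write a minimal example config covering the required fields.
# ASSUMPTION: dotted names (e.g. ref.genome) map to nested YAML keys.
# The output file name and every path below are hypothetical placeholders.
mkdir -p config
cat > config/config.example.yaml <<'EOF'
caller: "mutect2"                  # "mutect2", "freebayes", or "all"

ref:
  genome: "resources/genome.fa"    # hypothetical reference FASTA path
  build: "GRCh38"                  # "GRCh37" or "GRCh38"

paths:
  samples: "config/samples.tsv"    # hypothetical samples TSV path
  bam_folder: "data/bam"           # hypothetical input BAM directory
  output_folder: "results"         # hypothetical root output directory

bam:
  file_extension: ".merged.dedup.bqsr.bam"

scatter:
  mode: "chromosome"               # "chromosome", "interval", or "none"
EOF
```

Writing to a separate example file avoids overwriting the `config.yaml` created by Snakedeploy; copy the values you need into the real file.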

Required for Mutect2

Field	Description
`gatk_resources.panel_of_normals`	Path to Panel of Normals VCF
`gatk_resources.af_only_gnomad`	Path to gnomAD allele frequency VCF
`gatk_resources.common_biallelic_gnomad`	Path to gnomAD common biallelic VCF

Mutect2 Parameters (`params.mutect2`)

Field	Type	Default	Description
`genotype_germline_sites`	boolean	`true`	Emit germline sites (required for PureCN)
`genotype_pon_sites`	boolean	`true`	Emit PoN sites (required for PureCN)
`annotations`	array of strings	`[]`	Extra `--annotation` flags for Mutect2
`annotation_groups`	array of strings	`[]`	Extra `--annotation-group` flags for Mutect2
`extra`	string	`""`	Passthrough for any other Mutect2 flags
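
The Mutect2 resources and parameters above could be expressed as the fragment below. This is a sketch under the same assumption that dotted names map to nested YAML keys. The parameter values are the defaults from the table; the file name `config/mutect2.example.yaml` and the three VCF paths are hypothetical.

```shell
# Sketch: Mutect2-related fragment to merge into config.yaml.
# ASSUMPTION: gatk_resources.* and params.mutect2.* are nested YAML keys.
# The output file name and the VCF paths are hypothetical placeholders.
mkdir -p config
cat > config/mutect2.example.yaml <<'EOF'
gatk_resources:
  panel_of_normals: "resources/pon.vcf.gz"                        # hypothetical
  af_only_gnomad: "resources/af-only-gnomad.vcf.gz"               # hypothetical
  common_biallelic_gnomad: "resources/common-biallelic.vcf.gz"    # hypothetical

params:
  mutect2:
    genotype_germline_sites: true   # default; required for PureCN
    genotype_pon_sites: true        # default; required for PureCN
    annotations: []                 # extra --annotation flags
    annotation_groups: []           # extra --annotation-group flags
    extra: ""                       # passthrough for other Mutect2 flags
EOF
```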

QC Parameters (`params.bcftools_stats`)

Field	Type	Default	Description
`extra`	string	`""`	Extra flags for `bcftools stats`

PureCN Settings (`purecn`)

Field	Type	Default	Description
`enabled`	boolean	`false`	Enable PureCN copy number analysis
`genome`	string	`"hg38"`	PureCN genome identifier (`"hg19"` or `"hg38"`)
`intervals_bed`	string	`""`	BED file with capture bait coordinates
`normaldb`	string	`""`	Pre-built `normalDB.rds` (skip NormalDB.R if provided)
`mapping_bias`	string	`""`	Pre-built `mapping_bias.rds`
`snp_blacklist`	string	`""`	Optional simple repeats BED for SNP filtering
`extra`	string	`""`	Extra PureCN.R arguments
`seed`	integer	`123`	Random seed for PureCN
`postoptimize`	boolean	`true`	Run PureCN post-optimization
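
A `purecn` block with the analysis switched on might look like the sketch below. All values except `enabled` and `intervals_bed` are the defaults from the table. The file name `config/purecn.example.yaml` and the BED path are hypothetical, and whether `genome` must agree with `ref.build` (for example `hg38` with `GRCh38`) is an assumption not confirmed by the README.

```shell
# Sketch: PureCN fragment to merge into config.yaml.
# The output file name and the BED path are hypothetical placeholders.
mkdir -p config
cat > config/purecn.example.yaml <<'EOF'
purecn:
  enabled: true                          # default is false
  genome: "hg38"                         # "hg19" or "hg38"
  intervals_bed: "resources/baits.bed"   # hypothetical capture bait BED
  normaldb: ""                           # set to a pre-built normalDB.rds to skip NormalDB.R
  mapping_bias: ""                       # optional pre-built mapping_bias.rds
  snp_blacklist: ""                      # optional simple repeats BED
  extra: ""                              # extra PureCN.R arguments
  seed: 123
  postoptimize: true
EOF
```

Per the Mutect2 parameter table, `genotype_germline_sites` and `genotype_pon_sites` need to stay `true` when PureCN is enabled.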

`samples.tsv`

Tab-separated sample metadata. Required columns:

Column	Description
`sample`	Unique sample/analysis identifier
`tumor_bam`	BAM file basename (without extension)
`normal_bam`	Matched normal BAM basename, or `.` if none
`analysis_type`	`tumor_only`, `tumor_normal`, or `germline`
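
To make the column layout concrete, the sketch below writes a small example sheet and runs a basic consistency check on it. The file name `config/samples.example.tsv` and all sample and BAM names are hypothetical. The check only covers what the table states (four required columns and the three allowed `analysis_type` values); it is not the workflow's own validation.

```shell
# Sketch: write an example sample sheet with hypothetical names.
mkdir -p config
printf 'sample\ttumor_bam\tnormal_bam\tanalysis_type\n'  > config/samples.example.tsv
printf 'P01_TN\tP01_tumor\tP01_normal\ttumor_normal\n'  >> config/samples.example.tsv
printf 'P02_TO\tP02_tumor\t.\ttumor_only\n'             >> config/samples.example.tsv

# Check that each data row has at least the four required columns
# and that analysis_type is one of the documented values.
awk -F'\t' '
  NR == 1 { next }   # skip the header row
  NF < 4  { printf "line %d: expected at least 4 columns, got %d\n", NR, NF; bad = 1 }
  $4 != "tumor_only" && $4 != "tumor_normal" && $4 != "germline" {
      printf "line %d: unknown analysis_type \"%s\"\n", NR, $4; bad = 1
  }
  END { exit bad }   # non-zero exit status if any row failed
' config/samples.example.tsv && echo "samples.example.tsv looks consistent"
```

The second row uses `.` in `normal_bam` to mark a tumor-only sample without a matched normal, as described in the table.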

Linting and formatting

Linting results

Using workflow specific profile profiles/default for setting default command line arguments.
Lints for snakefile /tmp/tmpyoswbyd0/workflow/rules/freebayes.smk:
    * Mixed rules and functions in same snakefile.:
      Small one-liner functions used only once should be defined as lambda
      expressions. Other functions should be collected in a common module, e.g.
      'rules/common.smk'. This makes the workflow steps more readable.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/modularization.html#includes

Lints for snakefile /tmp/tmpyoswbyd0/workflow/rules/purecn.smk:
    * Mixed rules and functions in same snakefile.:
      Small one-liner functions used only once should be defined as lambda
      expressions. Other functions should be collected in a common module, e.g.
      'rules/common.smk'. This makes the workflow steps more readable.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/modularization.html#includes

Lints for rule merge_freebayes_vcfs (line 57, /tmp/tmpyoswbyd0/workflow/rules/freebayes.smk):
    * Shell command directly uses variable REF from outside of the rule:
      It is recommended to pass all files as input and output, and non-file
      parameters via the params directive. Otherwise, provenance tracking is
      less accurate.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules

Formatting results

[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/common.smk":  Formatted content is different from original
[DEBUG] 
[ERROR] In file "/tmp/tmpyoswbyd0/workflow/rules/mutect2.smk":  IndexError: pop from empty list
[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/mutect2.smk":  
[DEBUG] In file "/tmp/tmpyoswbyd0/workflow/rules/purecn.smk":  Formatted content is different from original
[INFO] 1 file(s) raised parsing errors 🤕
[INFO] 2 file(s) would be changed 😬
[INFO] 4 file(s) would be left unchanged 🎉

snakefmt version: 0.11.4
