merfre/Adaptive_Sequencing_Analysis_Workflow

Snakemake workflow for ONT adaptive sequencing data integrating QC, taxonomy (Kraken2), rejected read analysis, and control vs adaptive comparison.

Overview

Latest release: v.1.1.0, Last update: 2026-04-18

Share link: https://snakemake.github.io/snakemake-workflow-catalog?wf=merfre/Adaptive_Sequencing_Analysis_Workflow

Quality control: linting failed, formatting failed

Topics: microbiome nanopore snakemake adaptive-sequencing bioinformatics kraken2 metagenomics reproducibility taxonomy workflow

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Conda package manager. It is recommended to install conda via Miniforge. Run

conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands, ensure that this environment is activated via

conda activate snakemake

For other installation methods, refer to the Snakemake and Snakedeploy documentation.

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

All following steps assume that you are inside that directory. Then run

snakedeploy deploy-workflow https://github.com/merfre/Adaptive_Sequencing_Analysis_Workflow . --tag v.1.1.0

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains configuration files that you will modify in the next step to configure the workflow to your needs.

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs following the instructions below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the docs.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive HTML report for inspecting results, together with parameters and code, in the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

This document describes the structure and parameters of config/config.yaml for ASAW (Adaptive_Sequencing_Analysis_Workflow).

Overview of config.yaml

The configuration file defines:

  • Input metadata

  • Workflow behaviour (analysis inclusion/exclusion)

  • Database locations

  • Tool-specific parameters

Required metadata

Sample/run metadata table

Specified by: metadata_file: "config/ONT_skin_runs.txt"

Must include:

  • Sample identifiers

  • Run identifiers

  • Paths to FASTQ files

  • Paths to sequence summary files

  • Paths to rejected read ID lists (if used)
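The required fields listed above can be checked before launching the workflow. The following is a minimal sketch, assuming the metadata file is tab-delimited with a header row. The column names (sample, run, fastq, seq_summary) and the example rows are hypothetical, since the source does not show the header of config/ONT_skin_runs.txt.

```python
import csv
import io

# Hypothetical column names; the real header of the metadata file may differ.
REQUIRED_COLUMNS = ("sample", "run", "fastq", "seq_summary")


def check_metadata(handle, delimiter="\t"):
    """Return a list of problems found in a delimited metadata table."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems
    for row in reader:
        # reader.line_num is the line of the source file the row came from.
        for column in REQUIRED_COLUMNS:
            if not (row[column] or "").strip():
                problems.append(f"line {reader.line_num}: empty value in '{column}'")
    return problems


# Made-up table: the second data row has no FASTQ path.
example = io.StringIO(
    "sample\trun\tfastq\tseq_summary\n"
    "LF1A\tONT_skin1_adap\tresources/ONT_skin1_adap/LF1A_adap.fastq\tsummary.txt\n"
    "LF1B\tONT_skin1_adap\t\tsummary.txt\n"
)
print(check_metadata(example))  # ["line 3: empty value in 'fastq'"]
```

An empty result means every required column is present and filled. Whether the listed paths exist on disk is a separate check.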

Analysis options (workflow toggles)

Control comparison

Parameter        Description
include_control  Enable comparison of adaptive vs control runs
control_run      List of control runs (comma-separated, matching metadata)
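Because control_run is a comma-separated list that must match the metadata, a typo in a run name is easy to miss. The sketch below shows one way to split the value and compare it with the run identifiers from the metadata table. The function name and the run name ONT_skin1_con are hypothetical; this is not the workflow's own parsing code.

```python
def parse_control_runs(control_run, known_runs):
    """Split a comma-separated control_run value and check it against the metadata."""
    runs = [name.strip() for name in control_run.split(",") if name.strip()]
    unknown = [name for name in runs if name not in known_runs]
    if unknown:
        raise ValueError(f"control runs not found in metadata: {', '.join(unknown)}")
    return runs


# Run identifiers as they might appear in the metadata table (illustrative).
known = {"ONT_skin1_adap", "ONT_skin1_con"}
print(parse_control_runs(" ONT_skin1_con ,", known))  # ['ONT_skin1_con']
```

Stripping whitespace and dropping empty items makes the check tolerant of trailing commas and spaces after commas.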

Standard human microbiome analysis

Parameter        Description
include_samples  Enable full microbiome workflow (QC → host filtering → assembly → taxonomy → downstream analysis)

Alternative analysis modes

Parameter        Description
include_huminc   Include analysis without host filtering
include_unassem  Include analysis without assembly

Database locations

kraken_db: "~/Kraken2_Simple_Workflow/resources/databases/krakenstd_06_2023/kraken2_std_database"
human_reference: "resources/databases/filtering_reference/GCF_000001405.40_GRCh38.p14_genomic.fna"
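The kraken_db value above starts with "~", which YAML treats as ordinary text, while human_reference is relative to the working directory. Whether the workflow expands these itself is not stated in the source, so the following sketch shows how the two settings could be resolved and checked by hand before a run. The function names are hypothetical.

```python
import os
from pathlib import Path


def resolve_database_paths(config, workdir="."):
    """Expand '~' and anchor relative paths for the two database settings."""
    resolved = {}
    for key in ("kraken_db", "human_reference"):
        # os.path.expanduser leaves the value unchanged if no home is known.
        path = Path(os.path.expanduser(config[key]))
        if not path.is_absolute():
            path = Path(workdir).resolve() / path
        resolved[key] = path
    return resolved


def missing_databases(resolved):
    """Return the setting names whose path does not exist on disk."""
    return [key for key, path in resolved.items() if not path.exists()]


config = {
    "kraken_db": "~/Kraken2_Simple_Workflow/resources/databases/krakenstd_06_2023/kraken2_std_database",
    "human_reference": "resources/databases/filtering_reference/GCF_000001405.40_GRCh38.p14_genomic.fna",
}
paths = resolve_database_paths(config)
print(missing_databases(paths))
```

The printed list names any setting that still needs to be downloaded or re-pointed on the current machine.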

Preprocessing parameters

fastp (unclassified / rejected reads)

Parameter                          Description
qualified_quality_phred_unclass    Minimum base quality
unqualified_percent_limit_unclass  Max % low-quality bases
average_qual_unclass               Minimum average read quality
min_length_unclass                 Minimum read length
front_trim_unclass                 Trim bases from 5’ end
tail_trim_unclass                  Trim bases from 3’ end

fastp (all samples)

Parameter                  Description
qualified_quality_phred    Minimum base quality
unqualified_percent_limit  Max % low-quality bases
average_qual               Minimum average read quality
min_length                 Minimum read length
front_trim                 Trim bases from 5’ end
tail_trim                  Trim bases from 3’ end
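To make the two fastp parameter sets concrete, the sketch below turns config values into fastp command-line options. The pairing of config keys with fastp options is an assumption based on matching names (in particular, min_length is assumed to correspond to --length_required). The workflow's rules may assemble the command differently, and the numeric values are illustrative only.

```python
# Config key stem -> fastp long option. This pairing is an assumption based on
# matching names; it is not taken from the workflow's rule files.
FASTP_OPTIONS = {
    "qualified_quality_phred": "--qualified_quality_phred",
    "unqualified_percent_limit": "--unqualified_percent_limit",
    "average_qual": "--average_qual",
    "min_length": "--length_required",
    "front_trim": "--trim_front1",
    "tail_trim": "--trim_tail1",
}


def fastp_args(config, suffix=""):
    """Build fastp filter options; suffix='_unclass' selects the rejected-read set."""
    args = []
    for stem, option in FASTP_OPTIONS.items():
        key = stem + suffix
        if key in config:
            args += [option, str(config[key])]
    return args


# Illustrative values only; they are not recommendations.
config = {"qualified_quality_phred": 10, "min_length": 200, "min_length_unclass": 100}
print(fastp_args(config))
# ['--qualified_quality_phred', '10', '--length_required', '200']
print(fastp_args(config, suffix="_unclass"))
# ['--length_required', '100']
```

Keeping a single option table and switching on the suffix mirrors how the two parameter groups share names apart from the _unclass ending.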

Taxonomic assignment (Kraken2)

Parameter          Description
kraken_confidence  Confidence threshold for classification
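Kraken2 accepts a confidence threshold between 0 and 1 through its --confidence option; higher values require a larger share of a read's k-mers to support a taxon before the read is assigned to it. The sketch below shows how kraken_confidence could be validated and passed on. The function name, file names, and argument order are hypothetical, not the workflow's actual rule.

```python
def kraken2_command(db, reads, report, output, confidence=0.0, threads=1):
    """Assemble a Kraken2 call as an argument list, validating the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("kraken_confidence must lie between 0 and 1")
    return [
        "kraken2",
        "--db", str(db),
        "--confidence", str(confidence),
        "--threads", str(threads),
        "--report", str(report),
        "--output", str(output),
        str(reads),
    ]


# Illustrative file names and threshold.
cmd = kraken2_command("kraken2_std_database", "LF1A.fastq", "LF1A.report", "LF1A.out", confidence=0.1)
print(" ".join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact if the command is later passed to subprocess.run.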

Recentrifuge parameters

Parameter       Description
scoring_scheme  Scoring method (KRAKEN or SHEL)
minscore        Minimum score threshold

Assembly parameters (Flye)

Parameter        Description
read_type        ONT read type (--nano-raw)
minimum_overlap  Minimum read overlap
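As an illustration of how these two settings reach Flye, the sketch below builds a Flye argument list from read_type and minimum_overlap. The --meta switch is included on the assumption that the assembly step runs in metagenome mode (the rule file is named samp_metaflye.smk); the source does not confirm this, and the function name and example values are hypothetical.

```python
# Read-type options accepted by Flye.
FLYE_READ_TYPES = {
    "--nano-raw", "--nano-corr", "--nano-hq",
    "--pacbio-raw", "--pacbio-corr", "--pacbio-hq",
}


def flye_command(reads, out_dir, read_type="--nano-raw", minimum_overlap=None,
                 meta=True, threads=1):
    """Assemble a Flye call as an argument list."""
    if read_type not in FLYE_READ_TYPES:
        raise ValueError(f"unsupported read_type: {read_type}")
    command = ["flye", read_type, str(reads), "--out-dir", str(out_dir),
               "--threads", str(threads)]
    if minimum_overlap is not None:
        command += ["--min-overlap", str(minimum_overlap)]
    if meta:
        # Assumed: metagenome mode, suggested by the rule name only.
        command.append("--meta")
    return command


print(" ".join(flye_command("LF1A.filtered.fastq", "assembly/LF1A", minimum_overlap=1000)))
```

Leaving minimum_overlap unset lets Flye choose the overlap automatically, which is its default behaviour.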

Rejected read analysis

Rejected reads:

  • Extracted from adaptive sequencing outputs

  • Processed through QC and taxonomy steps

  • Analysed alongside retained reads for comparison

Control vs adaptive comparison

When enabled:

  • Compares sequence statistics between runs

  • Compares taxonomic abundance

  • Calculates diversity metrics (alpha, beta)
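The source does not say which alpha and beta diversity metrics the workflow computes. As a worked example of the two concepts, the sketch below uses the Shannon index for alpha diversity (within one run) and Bray-Curtis dissimilarity for beta diversity (between two runs), both common choices. The taxa and read counts are made up.

```python
import math


def shannon_index(counts):
    """Alpha diversity: H = sum over taxa of -p * ln(p)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Beta diversity: 1 - 2 * (sum of per-taxon minima) / (total counts in both)."""
    shared = sum(min(a.get(taxon, 0), b.get(taxon, 0)) for taxon in set(a) | set(b))
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    return 1.0 - 2.0 * shared / total


# Made-up read counts per taxon for a control run and an adaptive run.
control = {"Cutibacterium acnes": 50, "Staphylococcus epidermidis": 50}
adaptive = {"Cutibacterium acnes": 100}

print(round(shannon_index(control), 4))  # 0.6931, i.e. ln(2) for two equal taxa
print(shannon_index(adaptive))           # 0.0, a single taxon has no diversity
print(bray_curtis(control, adaptive))    # 0.5
```

Bray-Curtis on raw counts is sensitive to sequencing depth, so runs of different sizes are usually normalised to relative abundances first.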

Benchmarking and resources

  • Workflow executed using Snakemake

  • Resource usage tracked via built-in benchmarking

How to modify safely

Users should:

  • Modify paths to match local environment

  • Adjust parameters cautiously

  • Ensure metadata formatting is consistent

Minimal example config

conda_envs: "workflow/envs/asaw_environment.yaml"
metadata_file: "config/ONT_skin_runs.txt"

include_control: True
include_samples: True

kraken_db: "~/Kraken2_Simple_Workflow/resources/databases/krakenstd_06_2023/kraken2_std_database"
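A parsed configuration like the one above can be sanity-checked before a run. The sketch below works on a plain dictionary, which is how Snakemake exposes the parsed YAML. The rules it applies are assumptions rather than the workflow's own validation: which keys are mandatory, that the include_* toggles must be booleans, and that control_run is needed whenever include_control is on.

```python
# Assumed rules; the workflow itself may enforce different ones.
REQUIRED_KEYS = ("metadata_file", "kraken_db")
TOGGLES = ("include_control", "include_samples", "include_huminc", "include_unassem")


def check_config(config):
    """Return a list of problems in a parsed config mapping."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    for key in TOGGLES:
        if key in config and not isinstance(config[key], bool):
            problems.append(f"{key} should be a boolean, got {config[key]!r}")
    if config.get("include_control") is True and not config.get("control_run"):
        problems.append("include_control is enabled but control_run is not set")
    return problems


# The minimal example config, written out as the dictionary a YAML parser yields.
minimal = {
    "conda_envs": "workflow/envs/asaw_environment.yaml",
    "metadata_file": "config/ONT_skin_runs.txt",
    "include_control": True,
    "include_samples": True,
    "kraken_db": "~/Kraken2_Simple_Workflow/resources/databases/krakenstd_06_2023/kraken2_std_database",
}
print(check_config(minimal))
# ['include_control is enabled but control_run is not set']
```

Under the assumed rule, the minimal example is flagged because it enables include_control without listing any control_run; adding that key, or setting include_control to False, clears the message.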

Linting and formatting

Linting results
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:92: SyntaxWarning: invalid escape sequence '\/'
  CON_UNCLASS = ".*\/.*",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:93: SyntaxWarning: invalid escape sequence '\/'
  ADAP_PATHS = ".*\/.*",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:94: SyntaxWarning: invalid escape sequence '\/'
  CON_PATHS = ".*\/.*",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:95: SyntaxWarning: invalid escape sequence '\/'
  ALL_PATHS = ".*\/.*",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:96: SyntaxWarning: invalid escape sequence '\/'
  CON_RUNS = "[^\/]+",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:97: SyntaxWarning: invalid escape sequence '\/'
  ADAP_RUNS = "[^\/]+",
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:98: SyntaxWarning: invalid escape sequence '\/'
  ALL_RUNS = "[^\/]+"
/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:99: SyntaxWarning: invalid escape sequence '\/'

/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile:100: SyntaxWarning: invalid escape sequence '\/'
  ### Concatenate fastq files from barcodes ###
FileNotFoundError in file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile", line 120:
[Errno 2] No such file or directory: './resources/ONT_skin1_adap/LF1A_adap.fastq'
  File "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile", line 120, in <module>
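The SyntaxWarning entries above come from patterns such as "[^\/]+" written as ordinary Python string literals: "\/" is not a recognised escape sequence. These patterns appear to be regular expressions, where a forward slash needs no escaping, so one possible fix is to drop the backslash and use raw strings. The sketch below illustrates the idea; it is not a patch taken from the repository, and the helper name is hypothetical.

```python
import re
import warnings

# A forward slash has no special meaning in a regular expression, so the
# backslash can be dropped; the r"" prefix guards against other escapes.
RUN_PATTERN = r"[^/]+"   # one path component: no slash allowed
PATH_PATTERN = r".*/.*"  # any string containing at least one slash


def count_compile_warnings(source):
    """Compile Python source text and count the warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return len(caught)


# The original spelling is built with a doubled backslash here so that this
# example itself stays free of the warning it demonstrates.
old_source = 'ALL_RUNS = "[^\\/]+"'
new_source = 'ALL_RUNS = r"[^/]+"'
print(count_compile_warnings(old_source) > 0)
print(count_compile_warnings(new_source))

# Both spellings match the same strings.
for text in ("ONT_skin1_adap", "resources/ONT_skin1_adap"):
    assert bool(re.fullmatch("[^\\/]+", text)) == bool(re.fullmatch(RUN_PATTERN, text))
```

The FileNotFoundError at the end of the lint log is unrelated to escaping: it indicates that the Snakefile reads ./resources/ONT_skin1_adap/LF1A_adap.fastq while it is being parsed, and that file is not present in the linting environment.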
Formatting results
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/control_compare.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/taxdump.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/community.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/control_fastp.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_metaflye.smk":  Formatted content is different from original
[DEBUG]
<unknown>:5: SyntaxWarning: invalid escape sequence '\#'
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_human_inc.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_host_removal.smk":  Formatted content is different from original
[DEBUG]
<unknown>:5: SyntaxWarning: invalid escape sequence '\#'
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/rejected_kraken.smk":  Formatted content is different from original
[DEBUG]
<unknown>:5: SyntaxWarning: invalid escape sequence '\#'
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_kraken.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_unassembled.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/rejected_separation.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/samp_fastp.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/rejected_seq_summary.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/rejected_fastp.smk":  Formatted content is different from original
[DEBUG]
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/control_seq_summary.smk":  Formatted content is different from original
[DEBUG]
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
<unknown>:1: SyntaxWarning: invalid escape sequence '\/'
[ERROR] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile":  InvalidPython: Black error:

Cannot parse for target version Python 3.13: 6:0:

(Note reported line number may be incorrect, as snakefmt could not determine the true line number)


[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/Snakefile":  
[DEBUG] In file "/tmp/tmpxj9es4rc/merfre-Adaptive_Sequencing_Analysis_Workflow-634527c/workflow/rules/control_kraken.smk":  Formatted content is different from original
[INFO] 1 file(s) raised parsing errors 🤕
[INFO] 16 file(s) would be changed 😬

snakefmt version: 0.11.5