rodolfobrandao8/Snakemake-Mag-Annotation
Overview
Latest release: None, Last update: 2026-06-14
Share link: https://snakemake.github.io/snakemake-workflow-catalog?wf=rodolfobrandao8/Snakemake-Mag-Annotation
Quality control: linting: failed, formatting: failed
Deployment
Step 1: Install Snakemake and Snakedeploy
Snakemake and Snakedeploy are best installed via the Conda package manager. It is recommended to install conda via Miniforge. Run
conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy
to install both Snakemake and Snakedeploy in an isolated environment. For all following commands, make sure this environment is activated via
conda activate snakemake
For other installation methods, refer to the Snakemake and Snakedeploy documentation.
Step 2: Deploy workflow
With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:
mkdir -p path/to/project-workdir
cd path/to/project-workdir
All following steps assume that you are inside that directory. Then run
snakedeploy deploy-workflow https://github.com/rodolfobrandao8/Snakemake-Mag-Annotation . --tag None
Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains the configuration files that you will modify in the next step to adapt the workflow to your needs.
Step 3: Configure workflow
To configure the workflow, adapt config/config.yaml to your needs following the instructions below.
Step 4: Run workflow
The deployment method is controlled using the --software-deployment-method (short --sdm) argument.
To run the workflow using apptainer/singularity, use
snakemake --cores all --sdm apptainer
To run the workflow using a combination of conda and apptainer/singularity for software deployment, use
snakemake --cores all --sdm conda apptainer
To run the workflow with automatic deployment of all required software via conda/mamba, use
snakemake --cores all --sdm conda
Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
For further options such as cluster and cloud execution, see the docs.
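The three invocations above differ only in the values passed to --sdm. As a minimal sketch (the helper name `snakemake_command` is hypothetical, and the list of accepted methods is deliberately restricted to the two shown in this section), the command line could be assembled like this:

```python
import shlex

# Only the deployment methods mentioned in this section; Snakemake may accept others.
KNOWN_METHODS = ("conda", "apptainer")


def snakemake_command(methods, cores="all"):
    """Build the snakemake argument list for the given deployment methods."""
    unknown = [m for m in methods if m not in KNOWN_METHODS]
    if unknown:
        raise ValueError(f"unsupported deployment method(s): {', '.join(unknown)}")
    command = ["snakemake", "--cores", str(cores)]
    if methods:
        # --sdm takes one or more space-separated method names.
        command += ["--sdm", *methods]
    return command


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(snakemake_command(["conda", "apptainer"])))
```

The resulting list can be handed to `subprocess.run` from inside the project working directory once the printed command looks right.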
Step 5: Generate report
After finalizing your data analysis, you can automatically generate an interactive HTML report that lets you inspect the results together with parameters and code in the browser, using
snakemake --report report.zip
Configuration
The following section is imported from the workflow’s config/README.md.
Workflow configuration
The workflow processes one or more Metagenome-Assembled Genomes (MAGs) per run.
Set these fields in config/config.yaml:
- sample_sheet: path to a TSV file containing the sample names and paths.
- prodigal.extra: optional extra options string passed to the Prodigal wrapper (e.g., -p meta -f gff).
- bakta.db: path to the Bakta database directory.
- bakta.extra: optional extra options string passed to the Bakta wrapper.
- gtdbtk.data_dir: path to the GTDB-Tk reference database directory.
- gtdbtk.extra: optional extra options string passed to the GTDB-Tk wrapper.
- metaeuk.db: path to the MetaEuk reference database (UniProt database).
- metaeuk.extra: optional extra options string passed to the MetaEuk wrapper.
- recognizer.resources_dir: path to the reCOGnizer resources database directory.
- recognizer.extra: optional extra options string passed to the reCOGnizer wrapper.
- upimapi.extra: optional extra options string passed to the UPIMAPI wrapper.
- threads: dictionary containing computational resource presets (high, medium, low).
Sample sheet format (TSV)
Required columns:
- sample: unique identifier/name for the MAG or isolate.
- path: path to the input genome file in FASTA format (.fasta, .fna, .fa).
The workflow will dynamically process all rows defined in this sheet.
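Before launching a run, it can help to check the sheet against the two required columns. The following is a minimal sketch using only the Python standard library; the function name `validate_sample_sheet`, the sample names, and the file paths are hypothetical, and the duplicate and file-extension checks are assumptions derived from the column descriptions above, not the workflow's own validation logic.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "path")
FASTA_EXTENSIONS = (".fasta", ".fna", ".fa")


def validate_sample_sheet(handle):
    """Return a list of (sample, path) tuples, or raise ValueError on a bad sheet."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")

    rows = []
    seen = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Short rows yield None for absent fields, so fall back to "".
        sample = (row["sample"] or "").strip()
        path = (row["path"] or "").strip()
        if not sample or not path:
            raise ValueError(f"line {line_no}: empty sample or path")
        if sample in seen:
            raise ValueError(f"line {line_no}: duplicate sample '{sample}'")
        if not path.endswith(FASTA_EXTENSIONS):
            raise ValueError(f"line {line_no}: '{path}' does not look like a FASTA file")
        seen.add(sample)
        rows.append((sample, path))
    return rows


# Hypothetical sheet contents, for illustration only.
example = "sample\tpath\nMAG_001\tdata/MAG_001.fna\nMAG_002\tdata/MAG_002.fasta\n"
samples = validate_sample_sheet(io.StringIO(example))
print(samples)
```

To check a real sheet, open the file named by sample_sheet (with `newline=""`, as the `csv` module recommends) and pass the handle to the function.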
Example files
config/config.yaml:
sample_sheet: config/samples.tsv
prodigal:
  extra: "-p meta -f gff"
bakta:
  db: "resources/bakta_db"
  extra: ""
gtdbtk:
  data_dir: "resources/gtdbtk_db"
  extra: ""
metaeuk:
  db: "resources/metaeuk_db/uniprot_db"
  extra: "--e 0.0001"
recognizer:
  resources_dir: "resources/recognizer_db"
  extra: "--evalue 0.001"
upimapi:
  extra: "--evalue 1e-5"
threads:
  high: 16
  medium: 8
  low: 1
## Workflow parameters
_The following table is automatically parsed from the workflow's `config.schema.y(a)ml` file_.
| Parameter | Type | Description | Required | Default |
| ---------------- | ------- | ---------------------------------------------- | -------- | ------------------ |
| **sample_sheet** | string | path to sample sheet, mandatory | yes | config/samples.tsv |
| **prodigal** | | parameters for Prodigal gene prediction | yes | |
| . extra | string | extra CLI options passed to Prodigal wrapper | | |
| **bakta** | | parameters for Bakta annotation | yes | |
| . db | string | path to Bakta database directory | yes | |
| . extra | string | extra CLI options passed to Bakta wrapper | | |
| **gtdbtk** | | parameters for GTDB-Tk classification | yes | |
| . data_dir | string | path to GTDB-Tk database directory | yes | |
| . extra | string | extra CLI options passed to GTDB-Tk wrapper | | |
| **metaeuk** | | parameters for MetaEuk gene prediction | yes | |
| . db | string | path to MetaEuk reference database | yes | |
| . extra | string | extra CLI options passed to MetaEuk wrapper | | |
| **recognizer** | | parameters for reCOGnizer domain annotation | yes | |
| . resources_dir | string | path to reCOGnizer database directory | yes | |
| . extra | string | extra CLI options passed to reCOGnizer wrapper | | |
| **upimapi** | | parameters for UPIMAPI functional annotation | yes | |
| . extra | string | extra CLI options passed to UPIMAPI wrapper | | |
| **threads** | | computational resources presets | yes | |
| . high | integer | | | |
| . medium | integer | | | |
| . low | integer | | | |
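The "Required" column of the table can be turned into a quick pre-flight check. The sketch below is a dependency-free approximation, not the workflow's own validation: the `REQUIRED` mapping is transcribed by hand from the table, the helper `missing_keys` is hypothetical, and the configuration is written as a Python dictionary rather than loaded from YAML. Snakemake workflows more commonly validate against the schema file itself with `snakemake.utils.validate`.

```python
# Required keys, transcribed by hand from the parameter table.
# Each top-level section maps to the child keys marked as required.
REQUIRED = {
    "sample_sheet": (),
    "prodigal": (),
    "bakta": ("db",),
    "gtdbtk": ("data_dir",),
    "metaeuk": ("db",),
    "recognizer": ("resources_dir",),
    "upimapi": (),
    "threads": (),
}


def missing_keys(config):
    """Return dotted names of required keys that are absent from config."""
    problems = []
    for section, children in REQUIRED.items():
        if section not in config:
            problems.append(section)
            continue
        value = config[section]
        # Scalar or empty sections have no children to inspect.
        section_dict = value if isinstance(value, dict) else {}
        for child in children:
            if child not in section_dict:
                problems.append(f"{section}.{child}")
    return problems


# Mirrors the example config/config.yaml from the configuration section.
config = {
    "sample_sheet": "config/samples.tsv",
    "prodigal": {"extra": "-p meta -f gff"},
    "bakta": {"db": "resources/bakta_db", "extra": ""},
    "gtdbtk": {"data_dir": "resources/gtdbtk_db", "extra": ""},
    "metaeuk": {"db": "resources/metaeuk_db/uniprot_db", "extra": "--e 0.0001"},
    "recognizer": {"resources_dir": "resources/recognizer_db", "extra": "--evalue 0.001"},
    "upimapi": {"extra": "--evalue 1e-5"},
    "threads": {"high": 16, "medium": 8, "low": 1},
}

print(missing_keys(config))  # an empty list means nothing required is missing
```

Removing `db` from the `bakta` section, for instance, makes the function report `bakta.db`.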
## Linting and formatting
(linting-rodolfobrandao8-snakemake-mag-annotation)=
:::{dropdown} Linting results
<div style="max-height: 400px; overflow-y: auto; padding: 0;">
```{code-block}
:linenos:
FileNotFoundError in file "/tmp/tmpw3tva81d/workflow/rules/common.smk", line 7:
[Errno 2] No such file or directory: '/home/argomes/data/meta/EST6/results/domain/EST6/annotation_samples_test.tsv'
File "/tmp/tmpw3tva81d/workflow/rules/common.smk", line 7, in <module>
File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 620, in _read
File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/common.py", line 873, in get_handle