rodolfobrandao8/Snakemake-Mag-Annotation


Overview

Latest release: None, Last update: 2026-06-14

Share link: https://snakemake.github.io/snakemake-workflow-catalog?wf=rodolfobrandao8/Snakemake-Mag-Annotation

Quality control: linting: failed, formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Conda package manager. It is recommended to install conda via Miniforge. Run

conda create -c conda-forge -c bioconda -c nodefaults --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via

conda activate snakemake

For other installation methods, refer to the Snakemake and Snakedeploy documentation.

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside of that directory. Then run

snakedeploy deploy-workflow https://github.com/rodolfobrandao8/Snakemake-Mag-Annotation . --tag None

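Because the catalog lists no release for this workflow (Latest release: None), the value None passed to --tag above is a placeholder rather than an existing tag. Replace it with a real tag once one exists, or use Snakedeploy's --branch option with the name of the branch you want to deploy.

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains configuration files, which you will modify in the next step to adapt the workflow to your needs.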

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs, following the instructions in the Configuration section below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow using apptainer/singularity, use

snakemake --cores all --sdm apptainer

To run the workflow using a combination of conda and apptainer/singularity for software deployment, use

snakemake --cores all --sdm conda apptainer

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
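The three invocations above differ only in the values passed to --sdm. As a rough sketch, the Python helper below assembles such a command line and optionally appends Snakemake's --dry-run flag, so the planned jobs can be inspected before anything is executed. The function name snakemake_command and the KNOWN_METHODS tuple are illustrative assumptions, not part of the workflow or of Snakemake.

```python
import shlex

# Deployment methods mentioned in this step (Snakemake may accept others).
KNOWN_METHODS = ("conda", "apptainer")


def snakemake_command(methods, cores="all", dry_run=False):
    """Build a snakemake argument list for the given deployment methods.

    Hypothetical helper for illustration only.
    """
    unknown = [m for m in methods if m not in KNOWN_METHODS]
    if unknown:
        raise ValueError(f"unexpected deployment method(s): {unknown}")

    cmd = ["snakemake", "--cores", str(cores)]
    if methods:
        # --sdm takes one or more values, e.g. "--sdm conda apptainer".
        cmd += ["--sdm", *methods]
    if dry_run:
        # Only list the jobs that would run; nothing is executed.
        cmd.append("--dry-run")
    return cmd


print(shlex.join(snakemake_command(["conda", "apptainer"], dry_run=True)))
```

The returned list can be handed to subprocess.run from inside the project working directory; dropping dry_run=True yields the same commands shown above.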

For further options such as cluster and cloud execution, see the Snakemake documentation.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

Workflow configuration

The workflow processes one or more Metagenome-Assembled Genomes (MAGs) per run. Set these fields in config/config.yaml:

  • sample_sheet: path to a TSV file containing the sample names and paths.

  • prodigal.extra: optional extra options string passed to the Prodigal wrapper (e.g., -p meta -f gff).

  • bakta.db: path to the Bakta database directory.

  • bakta.extra: optional extra options string passed to the Bakta wrapper.

  • gtdbtk.data_dir: path to the GTDB-Tk reference database directory.

  • gtdbtk.extra: optional extra options string passed to the GTDB-Tk wrapper.

  • metaeuk.db: path to the MetaEuk reference database (UniProt database).

  • metaeuk.extra: optional extra options string passed to the MetaEuk wrapper.

  • recognizer.resources_dir: path to the reCOGnizer resources database directory.

  • recognizer.extra: optional extra options string passed to the reCOGnizer wrapper.

  • upimapi.extra: optional extra options string passed to the UPIMAPI wrapper.

  • threads: dictionary containing computational resource presets (high, medium, low).

Sample sheet format (TSV)

Required columns:

  • sample: unique identifier/name for the MAG or isolate.

  • path: path to the input genome file in FASTA format (.fasta, .fna, .fa).

The workflow will dynamically process all rows defined in this sheet.
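To make the sheet format concrete, the Python sketch below parses a two-row example and checks it for the two required columns, unique sample names, and the listed FASTA extensions. The sample names, file paths, and the helper read_sample_sheet are hypothetical and are not taken from the workflow; the workflow's own loading code may behave differently.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "path")
FASTA_SUFFIXES = (".fasta", ".fna", ".fa")

# Hypothetical sheet contents; sample names and paths are placeholders.
EXAMPLE_SHEET = (
    "sample\tpath\n"
    "MAG_001\tdata/MAG_001.fasta\n"
    "MAG_002\tdata/MAG_002.fna\n"
)


def read_sample_sheet(handle):
    """Parse a TSV sample sheet into a list of {'sample', 'path'} dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")

    rows, seen = [], set()
    for row in reader:
        sample, path = row["sample"], row["path"]
        if sample in seen:
            raise ValueError(f"duplicate sample name: {sample}")
        if not path or not path.endswith(FASTA_SUFFIXES):
            raise ValueError(f"{sample}: unexpected file extension in {path!r}")
        seen.add(sample)
        rows.append({"sample": sample, "path": path})
    return rows


rows = read_sample_sheet(io.StringIO(EXAMPLE_SHEET))
print([r["sample"] for r in rows])
```

Saved as config/samples.tsv, the same tab-separated text would match the sample_sheet value used in the example configuration below.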

Example files

config/config.yaml:

sample_sheet: config/samples.tsv

prodigal:
  extra: "-p meta -f gff"

bakta:
  db: "resources/bakta_db"
  extra: ""

gtdbtk:
  data_dir: "resources/gtdbtk_db"
  extra: ""

metaeuk:
  db: "resources/metaeuk_db/uniprot_db"
  extra: "--e 0.0001"

recognizer:
  resources_dir: "resources/recognizer_db"
  extra: "--evalue 0.001"

upimapi:
  extra: "--evalue 1e-5"

threads:
  high: 16
  medium: 8
  low: 1


## Workflow parameters

_The following table is automatically parsed from the workflow's `config.schema.y(a)ml` file_.

| Parameter        | Type    | Description                                    | Required | Default            |
| ---------------- | ------- | ---------------------------------------------- | -------- | ------------------ |
| **sample_sheet** | string  | path to sample sheet, mandatory                | yes      | config/samples.tsv |
| **prodigal**     |         | parameters for Prodigal gene prediction        | yes      |                    |
|  . extra         | string  | extra CLI options passed to Prodigal wrapper   |          |                    |
| **bakta**        |         | parameters for Bakta annotation                | yes      |                    |
|  . db            | string  | path to Bakta database directory               | yes      |                    |
|  . extra         | string  | extra CLI options passed to Bakta wrapper      |          |                    |
| **gtdbtk**       |         | parameters for GTDB-Tk classification          | yes      |                    |
|  . data_dir      | string  | path to GTDB-Tk database directory             | yes      |                    |
|  . extra         | string  | extra CLI options passed to GTDB-Tk wrapper    |          |                    |
| **metaeuk**      |         | parameters for MetaEuk gene prediction         | yes      |                    |
|  . db            | string  | path to MetaEuk reference database             | yes      |                    |
|  . extra         | string  | extra CLI options passed to MetaEuk wrapper    |          |                    |
| **recognizer**   |         | parameters for reCOGnizer domain annotation    | yes      |                    |
|  . resources_dir | string  | path to reCOGnizer database directory          | yes      |                    |
|  . extra         | string  | extra CLI options passed to reCOGnizer wrapper |          |                    |
| **upimapi**      |         | parameters for UPIMAPI functional annotation   | yes      |                    |
|  . extra         | string  | extra CLI options passed to UPIMAPI wrapper    |          |                    |
| **threads**      |         | computational resources presets                | yes      |                    |
|  . high          | integer |                                                |          |                    |
|  . medium        | integer |                                                |          |                    |
|  . low           | integer |                                                |          |                    |
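
As a quick cross-check of the table, the following Python sketch reports which of the entries marked as required are absent from a configuration dictionary. It is only an illustration under stated assumptions: the helper `missing_keys` and the dotted-key notation are hypothetical, the example dictionary mirrors the example `config/config.yaml` shown earlier, and real validation would normally be left to the workflow's schema.

```python
# Entries marked "yes" in the Required column; nested keys use dotted form.
REQUIRED_KEYS = (
    "sample_sheet",
    "prodigal",
    "bakta.db",
    "gtdbtk.data_dir",
    "metaeuk.db",
    "recognizer.resources_dir",
    "upimapi",
    "threads",
)


def missing_keys(config):
    """Return the dotted required keys that are absent from a config dict."""
    missing = []
    for dotted in REQUIRED_KEYS:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Mirrors the example config/config.yaml; all paths are placeholders.
example_config = {
    "sample_sheet": "config/samples.tsv",
    "prodigal": {"extra": "-p meta -f gff"},
    "bakta": {"db": "resources/bakta_db", "extra": ""},
    "gtdbtk": {"data_dir": "resources/gtdbtk_db", "extra": ""},
    "metaeuk": {"db": "resources/metaeuk_db/uniprot_db", "extra": "--e 0.0001"},
    "recognizer": {"resources_dir": "resources/recognizer_db", "extra": "--evalue 0.001"},
    "upimapi": {"extra": "--evalue 1e-5"},
    "threads": {"high": 16, "medium": 8, "low": 1},
}

print(missing_keys(example_config))
```

An empty list means every required entry is present; removing, say, the `db` entry under `bakta` would make the function report `bakta.db`.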


## Linting and formatting

(linting-rodolfobrandao8-snakemake-mag-annotation)=
:::{dropdown} Linting results

<div style="max-height: 400px; overflow-y: auto; padding: 0;">

```{code-block}
:linenos:

FileNotFoundError in file "/tmp/tmpw3tva81d/workflow/rules/common.smk", line 7:
[Errno 2] No such file or directory: '/home/argomes/data/meta/EST6/results/domain/EST6/annotation_samples_test.tsv'
  File "/tmp/tmpw3tva81d/workflow/rules/common.smk", line 7, in <module>
  File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
  File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 620, in _read
  File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
  File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
  File "/home/runner/work/snakemake-workflow-catalog/snakemake-workflow-catalog/.pixi/envs/default/lib/python3.13/site-packages/pandas/io/common.py", line 873, in get_handle