Snakemake executor plugin: aws-batch

Author: jakevc

This is the Snakemake executor plugin for AWS Batch. It distributes Snakemake jobs to EC2 instances managed by AWS Batch.

Installation

Install this plugin directly with pip or mamba, e.g.:

pip install snakemake-executor-plugin-aws-batch

Or, if you are using pixi, add the plugin to your pixi.toml. Be careful to put it under the right dependency type based on the plugin’s availability, e.g.:

snakemake-executor-plugin-aws-batch = "*"

Usage

To use the plugin, run Snakemake (>=8.6) in the folder where your workflow code and config reside (containing either workflow/Snakefile or Snakefile) with the corresponding value for the executor flag:

snakemake --executor aws-batch --default-resources --jobs N ...

with N being the number of jobs you want to run in parallel and ... being any additional arguments you want to use (see below). The machine on which you run Snakemake must have the executor plugin installed, and, depending on the type of the executor plugin, have access to the target service of the executor plugin (e.g. an HPC middleware like slurm with the sbatch command, or internet access to submit jobs to some cloud provider, e.g. azure).

The flag --default-resources ensures that Snakemake auto-calculates the mem and disk resources for each job, based on the input file size. The values assumed there are conservative and should usually suffice. However, you can always override those defaults by specifying the resources in your Snakemake rules or via the --set-resources flag.

Depending on the executor plugin, you might either rely on a shared local filesystem or use a remote filesystem or storage. For the latter, you additionally have to use a suitable storage plugin (see section storage plugins in the sidebar of this catalog) and, where applicable, check for further recommendations in the sections below.

All arguments can also be persisted via a profile, so that they don’t have to be specified on each invocation. Here, this would mean the following entries inside the profile:

executor: aws-batch
default_resources: []

For specifying other default resources than the built-in ones, see the docs.

Settings

The executor plugin has the following settings (which can be passed via command line, the workflow or environment variables, if provided in the respective columns):

CLI argument | Description | Default
--- | --- | ---
--aws-batch-region VALUE | AWS Region | None
--aws-batch-job-queue VALUE | The AWS Batch task queue ARN used for running tasks | None
--aws-batch-job-role VALUE | The AWS job role ARN that is used for running the tasks | None
--aws-batch-tags VALUE | The tags that should be applied to all of the batch tasks, of the form KEY=VALUE | None
--aws-batch-task-timeout VALUE | Task timeout (seconds) will force AWS Batch to terminate a Batch task if it fails to finish within the timeout, minimum 60 | 300

Further details

AWS Credentials

This plugin assumes you have set up AWS CLI credentials in ~/.aws/credentials. For more information, see the AWS CLI configuration documentation.
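As a quick pre-flight check, the following is a minimal sketch (not part of the plugin) that inspects a credentials file for the two standard keys of a static profile. The function name `missing_credential_keys` is hypothetical, and the check deliberately ignores other credential sources such as environment variables or SSO, so an empty file does not necessarily mean that authentication will fail.

```python
import configparser
from pathlib import Path

# Keys that a static profile in an AWS credentials file normally contains.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path: Path, profile: str = "default") -> list:
    """Return the required keys that are absent or empty for `profile`.

    Hypothetical helper for illustration; it only looks at the INI file
    and knows nothing about other credential providers.
    """
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is skipped silently
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

Calling `missing_credential_keys(Path.home() / ".aws" / "credentials")` returns an empty list when both keys are present for the default profile.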

AWS Infrastructure Requirements

The snakemake-executor-plugin-aws-batch requires an EC2 compute environment and a job queue to be configured. The plugin repo contains Terraform code for setting up the requisite AWS Batch infrastructure.

Assuming you have terraform installed and aws cli credentials configured, you can deploy the required infrastructure as follows:

cd terraform
terraform init
terraform plan
terraform apply

Resource names can be changed by adding a terraform.tfvars file that overrides the defaults defined in vars.tf. The output variables from running terraform apply can be exported as the following environment variables for snakemake-executor-plugin-aws-batch to use:

SNAKEMAKE_AWS_BATCH_REGION
SNAKEMAKE_AWS_BATCH_JOB_QUEUE
SNAKEMAKE_AWS_BATCH_JOB_ROLE
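One way to turn Terraform outputs into exports for these three variables is sketched below. It takes the text produced by `terraform output -json` and emits shell `export` lines. The Terraform output names in `OUTPUT_TO_ENV` (`region`, `job_queue_arn`, `job_role_arn`) are assumptions made for illustration; check the outputs actually defined in the plugin repo's Terraform code and adjust the mapping accordingly.

```python
import json
import shlex

# ASSUMPTION: these Terraform output names are hypothetical placeholders.
# Only the environment variable names come from the plugin documentation.
OUTPUT_TO_ENV = {
    "region": "SNAKEMAKE_AWS_BATCH_REGION",
    "job_queue_arn": "SNAKEMAKE_AWS_BATCH_JOB_QUEUE",
    "job_role_arn": "SNAKEMAKE_AWS_BATCH_JOB_ROLE",
}


def export_lines(terraform_output_json: str) -> list:
    """Build shell `export` lines from the text of `terraform output -json`."""
    outputs = json.loads(terraform_output_json)
    lines = []
    for output_name, env_name in OUTPUT_TO_ENV.items():
        if output_name not in outputs:
            raise KeyError(f"Terraform output {output_name!r} not found")
        # `terraform output -json` wraps each output as {"value": ..., ...}.
        value = str(outputs[output_name]["value"])
        lines.append(f"export {env_name}={shlex.quote(value)}")
    return lines
```

The resulting lines can be pasted into a shell session or written to a file that is sourced before running Snakemake.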

Example

Create environment

Install snakemake and the AWS executor and storage plugins into an environment. We recommend the mamba package manager, which can be installed using miniforge, but these dependencies can also be installed using pip or other Python package managers.

mamba create -n snakemake-example \
    snakemake snakemake-storage-plugin-s3 snakemake-executor-plugin-aws-batch
mamba activate snakemake-example

Clone the snakemake tutorial repo containing the example workflow:

git clone https://github.com/snakemake/snakemake-tutorial-data.git

Set up and run the tutorial workflow on the executor

cd snakemake-tutorial-data

export SNAKEMAKE_AWS_BATCH_REGION=
export SNAKEMAKE_AWS_BATCH_JOB_QUEUE=
export SNAKEMAKE_AWS_BATCH_JOB_ROLE=

snakemake --jobs 4 \
    --executor aws-batch \
    --aws-batch-region us-west-2 \
    --default-storage-provider s3 \
    --default-storage-prefix s3://snakemake-tutorial-example \
    --verbose

Container Image Requirements

The plugin does not auto-deploy the default storage provider to workers (auto_deploy_default_storage_provider=False). Workers no longer run pip install snakemake-storage-plugin-s3 at startup, because that installs an unpinned version, and newer releases require snakemake >= 9 and break workers running snakemake 8.x. The container image used for jobs must therefore pre-install a compatible version of the storage plugin (e.g. snakemake-storage-plugin-s3) alongside snakemake itself. Image maintainers are responsible for pinning a plugin version compatible with the snakemake version in the image.

Per-Rule Job Queues

By default all jobs are submitted to the queue given by --aws-batch-job-queue. A rule can override this with the batch_queue resource, e.g. to route jobs to a queue wired to a different compute environment (ARM vs x86, GPU vs CPU):

rule align:
    resources:
        batch_queue="arn:aws:batch:us-west-2:123456789012:job-queue/arm-queue"
    ...

Platform detection and job submission both use the resolved per-rule queue.

Task Timeout

By default jobs have no timeout. Set --aws-batch-task-timeout to impose a workflow-wide limit (in seconds; minimum 60). A rule can override this with the aws_batch_task_timeout resource, e.g. to give a long-running alignment step more time while keeping a tight limit on bookkeeping rules:

rule align:
    resources:
        aws_batch_task_timeout=14400  # 4 h
    ...

The per-rule resource takes precedence over --aws-batch-task-timeout. When neither is set, AWS Batch imposes no timeout. When set, the value must be at least 60 seconds (the AWS minimum).
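The precedence rule can be summarized in a few lines of Python. This is a sketch of the documented behavior, not the plugin's actual code; the function names are hypothetical. The `attemptDurationSeconds` field is how the AWS Batch API expresses a job timeout, but where exactly the plugin sets it is an assumption here.

```python
from typing import Optional

MIN_TIMEOUT_SECONDS = 60  # AWS Batch minimum, as stated above


def resolve_task_timeout(
    rule_timeout: Optional[int], workflow_timeout: Optional[int]
) -> Optional[int]:
    """Pick the effective timeout: per-rule resource first, then the CLI value."""
    timeout = rule_timeout if rule_timeout is not None else workflow_timeout
    if timeout is None:
        return None  # neither set: no timeout is imposed
    if timeout < MIN_TIMEOUT_SECONDS:
        raise ValueError(
            f"task timeout must be at least {MIN_TIMEOUT_SECONDS} s, got {timeout}"
        )
    return timeout


def timeout_kwargs(timeout: Optional[int]) -> dict:
    """Translate the resolved value into an AWS Batch style request fragment."""
    if timeout is None:
        return {}
    return {"timeout": {"attemptDurationSeconds": timeout}}
```

With a rule-level value of 14400 and a workflow-wide value of 300, `resolve_task_timeout(14400, 300)` yields 14400, while a rule without the resource falls back to 300.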

Shared Memory (/dev/shm)

On EC2/ECS containers /dev/shm defaults to 64 MB, which is too small for tools that stage large in-memory indexes (e.g. bwa-mem2 shared-memory indexes). A rule can enlarge it via the shared_memory_size_mb resource:

rule align:
    resources:
        shared_memory_size_mb=4096
    ...

This sets linuxParameters.sharedMemorySize on the job definition. It applies only on EC2 queues: Fargate does not honor linuxParameters.sharedMemorySize, so the resource is ignored there.
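To make the mapping concrete, here is a small sketch of how the resource could translate into a container-properties fragment. The helper name `shared_memory_fragment` and the `platform` argument are hypothetical; only the `linuxParameters.sharedMemorySize` field (in MiB) comes from the AWS Batch API, and the plugin's internal structure may differ.

```python
from typing import Optional


def shared_memory_fragment(
    shared_memory_size_mb: Optional[int], platform: str
) -> dict:
    """Return the container-properties fragment for /dev/shm sizing.

    Illustrative only: yields an empty dict when the resource is unset
    or when the queue is not EC2-backed (e.g. Fargate), mirroring the
    documented "ignored there" behavior.
    """
    if shared_memory_size_mb is None or platform.upper() != "EC2":
        return {}
    return {"linuxParameters": {"sharedMemorySize": int(shared_memory_size_mb)}}


# Example: merge the fragment into a hypothetical container definition.
container_properties = {"image": "example/image:tag", "command": ["true"]}
container_properties.update(shared_memory_fragment(4096, "EC2"))
```

On a Fargate queue the same call returns an empty dict, so the container definition is left unchanged.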

Job Tags

Tags from --aws-batch-tags are applied to every job definition and job submitted by the plugin. In addition, dynamic tags can be supplied via the SNAKEMAKE_AWS_BATCH_JOB_TAGS environment variable as comma-separated KEY=VALUE pairs:

export SNAKEMAKE_AWS_BATCH_JOB_TAGS="run_id=2024-06-01,team=genomics"

Environment variable tags are merged with --aws-batch-tags and take precedence on key conflicts. This enables per-run cost tracking: a coordinator job can set the variable so that all child jobs it submits inherit the run-specific tags. AWS Batch allows at most 50 tags per job; malformed pairs (missing = or an empty key) raise an error at submission time.
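The merge semantics can be sketched as follows. This is an illustration of the rules described above, not the plugin's implementation: the function names are hypothetical, and treating more than 50 merged tags as an error (rather than leaving the rejection to AWS) is an assumption of this sketch.

```python
from typing import Dict, Optional

MAX_TAGS = 50  # AWS Batch limit per job, as stated above


def parse_tag_pairs(raw: str) -> Dict[str, str]:
    """Parse 'K1=V1,K2=V2' into a dict, rejecting malformed pairs."""
    tags: Dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ASSUMPTION: stray commas are tolerated
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed tag {pair!r}: expected KEY=VALUE")
        tags[key.strip()] = value.strip()
    return tags


def merge_tags(cli_tags: Dict[str, str], env_value: Optional[str]) -> Dict[str, str]:
    """Merge CLI tags with environment tags; environment wins on conflicts."""
    merged = dict(cli_tags)
    if env_value:
        merged.update(parse_tag_pairs(env_value))
    if len(merged) > MAX_TAGS:
        # ASSUMPTION: fail early instead of letting the API call fail.
        raise ValueError(f"{len(merged)} tags exceed the limit of {MAX_TAGS}")
    return merged
```

For example, merging CLI tags `team=core` with the environment value `run_id=2024-06-01,team=genomics` keeps `run_id` and resolves `team` to `genomics`.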