Snakemake executor plugin: aws-batch

Author: jakevc

This Snakemake executor plugin distributes Snakemake jobs to AWS Batch EC2 instances.

Installation

Install this plugin with pip or mamba, e.g.:

pip install snakemake-executor-plugin-aws-batch

Usage

To use the plugin, run Snakemake (>=8.0) with the executor flag set to aws-batch:

snakemake --executor aws-batch ...

with ... being any additional arguments you want to use.

The executor plugin has the following settings:

Settings

| CLI argument | Description | Default | Choices | Required | Type |
|---|---|---|---|---|---|
| --aws-batch-region VALUE | AWS Region | None | | | |
| --aws-batch-job-queue VALUE | The AWS Batch job queue ARN used for running tasks | None | | | |
| --aws-batch-job-role VALUE | The AWS job role ARN used for running tasks | None | | | |
| --aws-batch-tags VALUE | Tags applied to all Batch tasks, of the form KEY=VALUE | None | | | |
| --aws-batch-task-timeout VALUE | Task timeout in seconds. AWS Batch terminates a task that does not finish within the timeout. Minimum: 60 | 300 | | | |
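As an illustration of how these settings combine on the command line, here is a minimal sketch. The flag names come from the settings above; the region, account ID, ARNs, tag, and timeout value are made-up placeholders, not values from this project. The function only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration): replace with your own.
REGION="us-west-2"
JOB_QUEUE_ARN="arn:aws:batch:us-west-2:123456789012:job-queue/example-queue"
JOB_ROLE_ARN="arn:aws:iam::123456789012:role/example-job-role"

# Print a snakemake invocation that sets every plugin option explicitly.
batch_command() {
    echo snakemake \
        --executor aws-batch \
        --aws-batch-region "$REGION" \
        --aws-batch-job-queue "$JOB_QUEUE_ARN" \
        --aws-batch-job-role "$JOB_ROLE_ARN" \
        --aws-batch-tags "project=example" \
        --aws-batch-task-timeout 600
}

batch_command
```

Once the printed command looks right, it can be copied into a terminal or a wrapper script.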

Further details

AWS Credentials

This plugin assumes you have set up AWS CLI credentials in ~/.aws/credentials. For more information, see the AWS CLI configuration documentation.
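A quick pre-flight check can confirm that a credentials file is in place before submitting jobs. The sketch below is not part of the plugin; the helper names are hypothetical, and it assumes the "default" profile is the one you intend to use. It honors the AWS_SHARED_CREDENTIALS_FILE environment variable, which the AWS CLI uses to override the default path.

```shell
#!/bin/sh
# Resolve the credentials file path (AWS CLI default unless overridden).
credentials_file() {
    echo "${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
}

# Succeed when the file exists and contains a "[profile]" section header line.
# Hypothetical helper; the profile name defaults to "default".
has_profile() {
    file="$1"
    profile="${2:-default}"
    [ -f "$file" ] && grep -qxF "[$profile]" "$file"
}

if has_profile "$(credentials_file)"; then
    echo "credentials found"
else
    echo "no default profile in $(credentials_file)" >&2
fi
```

This only checks that the file and profile exist; it does not verify that the keys are valid.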

AWS Infrastructure Requirements

The snakemake-executor-plugin-aws-batch requires an EC2 compute environment and a job queue to be configured. The plugin repo contains Terraform configuration for setting up the requisite AWS Batch infrastructure.

Assuming you have Terraform installed and AWS CLI credentials configured, you can deploy the required infrastructure as follows:

cd terraform
terraform init
terraform plan
terraform apply

Resource names can be changed by adding a terraform.tfvars file that overrides the default variable values defined in vars.tf. The output variables from running terraform apply can be exported as the following environment variables for snakemake-executor-plugin-aws-batch to use:

SNAKEMAKE_AWS_BATCH_REGION
SNAKEMAKE_AWS_BATCH_JOB_QUEUE
SNAKEMAKE_AWS_BATCH_JOB_ROLE
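One way to wire the Terraform outputs into these environment variables is sketched below. It relies on `terraform output -raw NAME`, which prints a single output value. The output names used here (`region`, `job_queue_arn`, `job_role_arn`) are assumptions for illustration; check the outputs defined in the terraform directory for the real names.

```shell
#!/bin/sh
# Export Terraform outputs as the environment variables the plugin reads.
# NOTE: the output names below are hypothetical placeholders.
export_batch_env() {
    region="$(terraform output -raw region)" || return 1
    job_queue="$(terraform output -raw job_queue_arn)" || return 1
    job_role="$(terraform output -raw job_role_arn)" || return 1
    export SNAKEMAKE_AWS_BATCH_REGION="$region"
    export SNAKEMAKE_AWS_BATCH_JOB_QUEUE="$job_queue"
    export SNAKEMAKE_AWS_BATCH_JOB_ROLE="$job_role"
}

# Usage (from the terraform directory, after `terraform apply`):
#   export_batch_env
```

The function has to be defined in the shell that later runs snakemake (for example by sourcing the file), since exported variables do not propagate back from a child process.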

Example

Create environment

Install Snakemake and the AWS executor and storage plugins into an environment. We recommend the mamba package manager, which can be installed using Miniforge, but these dependencies can also be installed using pip or other Python package managers.

mamba create -n snakemake-example \
    snakemake snakemake-storage-plugin-s3 snakemake-executor-plugin-aws-batch
mamba activate snakemake-example

Clone the Snakemake tutorial repo containing the example workflow:

git clone https://github.com/snakemake/snakemake-tutorial-data.git

Set up and run the tutorial workflow on the executor

cd snakemake-tutorial-data

export SNAKEMAKE_AWS_BATCH_REGION=
export SNAKEMAKE_AWS_BATCH_JOB_QUEUE=
export SNAKEMAKE_AWS_BATCH_JOB_ROLE=

snakemake --jobs 4 \
    --executor aws-batch \
    --aws-batch-region us-west-2 \
    --default-storage-provider s3 \
    --default-storage-prefix s3://snakemake-tutorial-example \
    --verbose
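The three exports above are left empty for you to fill in with your own values. A small guard, sketched below, can fail fast when any of them is still empty instead of submitting jobs with missing settings. The helper name `require_env` is hypothetical and not part of Snakemake or the plugin.

```shell
#!/bin/sh
# Return non-zero if any named environment variable is unset or empty.
require_env() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            return 1
        fi
    done
}

# Usage: only start the workflow when all three variables have values.
#   require_env SNAKEMAKE_AWS_BATCH_REGION \
#               SNAKEMAKE_AWS_BATCH_JOB_QUEUE \
#               SNAKEMAKE_AWS_BATCH_JOB_ROLE && snakemake --executor aws-batch ...
```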