Snakemake executor plugin: slurm-gustave-roussy

Author: tdayris

Warning

This plugin is neither maintained nor reviewed by the official Snakemake organization.

Snakemake executor plugin designed to match the specifics of the Gustave Roussy computing cluster: automatic partition selection and default resource values.

Installation

Install this plugin with pip or mamba, e.g.:

pip install snakemake-executor-plugin-slurm-gustave-roussy

Usage

To use the plugin, run Snakemake (>=8.0) in the folder where your workflow code and config reside (containing either workflow/Snakefile or Snakefile) with the corresponding value for the executor flag:

snakemake --executor slurm-gustave-roussy --default-resources --jobs N ...

with N being the number of jobs you want to run in parallel and ... being any additional arguments you want to use (see below). The machine on which you run Snakemake must have the executor plugin installed, and, depending on the type of the executor plugin, have access to the target service of the executor plugin (e.g. an HPC middleware like slurm with the sbatch command, or internet access to submit jobs to some cloud provider, e.g. azure).

The flag --default-resources ensures that Snakemake auto-calculates the mem and disk resources for each job, based on the input file size. The values assumed there are conservative and should usually suffice. However, you can always override those defaults by specifying the resources in your Snakemake rules or via the --set-resources flag.
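As a minimal sketch of overriding the defaults from the command line, the following Python helper assembles a Snakemake invocation that uses --set-resources. The helper name build_command, the rule name align, and the value mem_mb=8000 are hypothetical and chosen only for illustration; they do not come from the plugin.

```python
import shlex


def build_command(jobs, overrides=None):
    """Assemble a snakemake command line for this executor.

    `overrides` maps a rule name to a dict of resource values; each entry
    is rendered as a RULE:RESOURCE=VALUE item for --set-resources.
    """
    cmd = [
        "snakemake",
        "--executor", "slurm-gustave-roussy",
        "--default-resources",
        "--jobs", str(jobs),
    ]
    items = [
        f"{rule}:{name}={value}"
        for rule, resources in (overrides or {}).items()
        for name, value in resources.items()
    ]
    if items:
        cmd += ["--set-resources", *items]
    return cmd


# Hypothetical rule name and memory value, for illustration only.
command = build_command(10, {"align": {"mem_mb": 8000}})
print(shlex.join(command))
```

Building the command as a list and joining it with shlex.join keeps each argument correctly quoted if the command is later pasted into a shell.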

Depending on the executor plugin, you might either rely on a shared local filesystem or use a remote filesystem or storage. For the latter, you additionally have to use a suitable storage plugin (see section storage plugins in the sidebar of this catalog) and possibly check for further recommendations in the sections below.

All arguments can also be persisted via a profile, such that they don’t have to be specified on each invocation. Here, this would mean the following entries inside the profile:

executor: slurm-gustave-roussy
default_resources: []

For specifying other default resources than the built-in ones, see the docs.
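The profile entries above can be written to disk with a few lines of Python. This is only a sketch: the helper write_profile and the directory name gustave-roussy are hypothetical, and it assumes the usual Snakemake convention of a config.yaml file inside the profile directory.

```python
import tempfile
from pathlib import Path

# The two profile entries shown above, as plain YAML text.
PROFILE_TEXT = "executor: slurm-gustave-roussy\ndefault_resources: []\n"


def write_profile(profile_dir):
    """Create `profile_dir` if needed and write config.yaml into it."""
    profile_dir = Path(profile_dir)
    profile_dir.mkdir(parents=True, exist_ok=True)
    config = profile_dir / "config.yaml"
    config.write_text(PROFILE_TEXT, encoding="utf-8")
    return config


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    path = write_profile(Path(tmp) / "gustave-roussy")
    print(path.read_text(encoding="utf-8"))
```

The resulting directory can then be passed to Snakemake with the --profile flag.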

Settings

The executor plugin has the following settings (which can be passed via command line, the workflow or environment variables, if provided in the respective columns):

| CLI argument | Description | Default | Choices | Required | Type |
|---|---|---|---|---|---|
| --slurm-gustave-roussy-init-seconds-before-status-checks VALUE | Defines the time in seconds before the first status check is performed after job submission. | 40 | | | |
| --slurm-gustave-roussy-requeue VALUE | Allow requeuing of preempted or failed jobs, if the cluster has no default. Results in sbatch … --requeue … This flag has no effect if not set. | False | | | |

Further details

Automatic partition selection

This executor automatically selects the best queue on the Flamingo computing cluster at Gustave Roussy.

In order not to break pipelines running on Colibri and other (older) clusters, this executor selects the best queue if and only if the host name starts with “flamingo”.

The GPU queue is automatically selected when job.resources.gres is not null. Examples can be found in the official Snakemake documentation, especially in the section about GPU resources.
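The selection rules described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's actual code: the function select_partition, the partition names gpu_placeholder and cpu_placeholder, and the example host names are hypothetical. The criteria used to choose among non-GPU queues are not described here, so the sketch collapses them into a single placeholder.

```python
def select_partition(hostname, gres=None):
    """Return a partition name, or None to leave the choice to Slurm.

    The partition names are hypothetical placeholders, not the real
    queue names on Flamingo.
    """
    # Only act on Flamingo, so that other clusters keep their behavior.
    if not hostname.startswith("flamingo"):
        return None
    # A gres request that is not null routes the job to the GPU queue.
    if gres is not None:
        return "gpu_placeholder"
    # Stand-in for the "best queue" choice among non-GPU partitions.
    return "cpu_placeholder"


# Hypothetical host names, for illustration only.
print(select_partition("flamingo-login", gres="gpu:1"))
print(select_partition("flamingo-login"))
print(select_partition("colibri-login", gres="gpu:1"))
```

Returning None on other hosts mirrors the stated goal of leaving pipelines on Colibri and older clusters untouched.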

Default values

By default, following Flamingo's default behavior, --mem is set to 1024 bytes and --time to 6 hours.

As described in the official Snakemake documentation, these values can be changed through job.resources.mem_mb and job.threads, respectively.

Additional arguments

Additional Slurm arguments can be provided on the Snakemake command line, using --slurm_gustave_roussy_args "..."