Snakemake executor plugin: cannon


Author: Gregg Thomas

Warning

This plugin is not maintained or reviewed by the official Snakemake organization.

Automatic partition selection for the Cannon compute cluster at Harvard when running SLURM jobs through Snakemake.

What follows is information specific to the Cannon plugin. For full documentation relating to the SLURM plugin, which is still applicable here, see the official SLURM plugin docs.

Installation

Install this plugin with pip or mamba directly, e.g.:

pip install snakemake-executor-plugin-cannon

Or, if you are using pixi, add the plugin to your pixi.toml. Be careful to put it under the right dependency type based on the plugin’s availability, e.g.:

snakemake-executor-plugin-cannon = "*"

Usage

In order to use the plugin, run Snakemake (>=8.0) in the folder where your workflow code and config resides (containing either workflow/Snakefile or Snakefile) with the corresponding value for the executor flag:

snakemake --executor cannon --default-resources --jobs N ...

with N being the number of jobs you want to run in parallel and ... being any additional arguments you want to use (see below). The machine on which you run Snakemake must have the executor plugin installed, and, depending on the type of the executor plugin, have access to the target service of the executor plugin (e.g. an HPC middleware like slurm with the sbatch command, or internet access to submit jobs to some cloud provider, e.g. azure).

The flag --default-resources ensures that Snakemake auto-calculates the mem and disk resources for each job, based on the input file size. The values assumed there are conservative and should usually suffice. However, you can always override those defaults by specifying the resources in your Snakemake rules or via the --set-resources flag.
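As a rough illustration of what this auto-calculation amounts to, the sketch below mirrors the default formulas given in the Snakemake command-line help for version 8 (memory and disk set to twice the input size, with a floor of 1000 MB, and memory capped at 8000 MB). The function name is hypothetical, and the exact formulas are an assumption that may differ between Snakemake versions.

```python
def default_resources(input_size_mb: float) -> dict:
    """Approximate Snakemake's built-in defaults (assumed formulas, see text above)."""
    return {
        # Twice the input size, at least 1000 MB, at most 8000 MB.
        "mem_mb": min(max(2 * input_size_mb, 1000), 8000),
        # Twice the input size, at least 1000 MB, with no upper cap.
        "disk_mb": max(2 * input_size_mb, 1000),
    }


if __name__ == "__main__":
    for size in (100, 2500, 6000):
        print(size, default_resources(size))
```

Because the memory default is capped, rules with large inputs are the ones most likely to need an explicit value in the rule or via --set-resources.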

Depending on the executor plugin, you might either rely on a shared local filesystem or use a remote filesystem or storage. For the latter, you additionally have to use a suitable storage plugin (see section storage plugins in the sidebar of this catalog) and possibly check for further recommendations in the sections below.

All arguments can also be persisted via a profile, so that they don’t have to be specified on each invocation. Here, this would mean the following entries inside the profile:

executor: cannon
default_resources: []

For specifying other default resources than the built-in ones, see the docs.

Settings

The executor plugin has the following settings (which can be passed via command line, the workflow or environment variables, if provided in the respective columns):

Settings
CLI argument	Description	Default	Required
`--cannon-logdir VALUE`	By default, the SLURM log directory is relative to the working directory. This flag sets an alternative directory.	`None`	✗
`--cannon-keep-successful-logs VALUE`	By default, SLURM log files are deleted upon successful completion of a job. Whenever a SLURM job fails, its log file is preserved. This flag keeps all SLURM log files, even those of successful jobs.	`False`	✗
`--cannon-delete-logfiles-older-than VALUE`	By default, SLURM log files in the SLURM log directory of a workflow are deleted after 10 days. For this to work, it is best to leave the default log directory unaltered. Setting this flag changes the number of days. If set to <=0, no old files will be deleted.	`10`	✗
`--cannon-init-seconds-before-status-checks VALUE`	Defines the time in seconds before the first status check is performed after job submission.	`40`	✗
`--cannon-status-attempts VALUE`	Defines the number of attempts to query the status of all active jobs. If the status query fails, the next attempt will be performed after the next status check interval. The default is 5 status attempts before giving up. The maximum time between status checks is 180 seconds.	`5`	✗
`--cannon-requeue VALUE`	Allow requeuing preempted or failed jobs, if no cluster default. Results in sbatch … --requeue … This flag has no effect, if not set.	`False`	✗
`--cannon-no-account VALUE`	Do not use any account for submission. This flag has no effect, if not set.	`False`	✗
`--cannon-efficiency-report VALUE`	Generate an efficiency report at the end of the workflow. This flag has no effect, if not set.	`False`	✗
`--cannon-efficiency-report-path VALUE`	Path to the efficiency report file. If not set, the report will be written to the current working directory with the name ‘efficiency_report_<run_uuid>.csv’. This flag has no effect, if not set.	`None`	✗
`--cannon-efficiency-threshold VALUE`	The efficiency threshold for the efficiency report. Jobs with an efficiency below this threshold will be reported. This flag has no effect, if not set.	`0.8`	✗
`--cannon-qos VALUE`	If set, the given QoS will be used for job submission.	`None`	✗
`--cannon-reservation VALUE`	If set, the given reservation will be used for job submission.	`None`	✗
`--cannon-resources VALUE`	Print information about the Cannon cluster and exit.	`False`	✗
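To make the log retention rule from the table concrete, here is a minimal sketch of age-based cleanup. It is not the plugin's implementation: the function name, the assumption that log files end in .log, and the use of file modification time as the age are all hypothetical choices for illustration. Only the 10-day default and the "<=0 disables deletion" behaviour come from the table above.

```python
import time
from pathlib import Path


def delete_old_logs(logdir: Path, older_than_days: int = 10) -> list:
    """Delete *.log files older than the cutoff and return the deleted paths."""
    if older_than_days <= 0:
        # Mirrors the documented behaviour: <=0 means nothing is deleted.
        return []
    cutoff = time.time() - older_than_days * 86400  # seconds per day
    deleted = []
    # Materialize the listing first so deletion does not disturb iteration.
    for path in list(logdir.rglob("*.log")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

A cleanup like this only sees files inside the directory it is given, which is one reason the table recommends leaving the default log directory unaltered.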

Further details

How this Plugin works

This plugin is based on the general SLURM plugin, but with added logic for automatic partition selection specifically on the Cannon cluster at Harvard University. With this plugin, Snakemake submits itself as a job script when operating on the Cannon cluster. Consequently, the SLURM log file will duplicate the output of the corresponding rule. To avoid redundancy, the plugin deletes the SLURM log file for successful jobs, relying instead on the rule-specific logs.

Remote executors submit Snakemake jobs to ensure unique functionalities — such as piped group jobs and rule wrappers — are available on cluster nodes. The memory footprint varies based on these functionalities; for instance, rules with a run directive that import modules and read data may require more memory.

The information provided below is specific to the Cannon plugin. For full documentation of the general SLURM plugin see the official documentation for that plugin.

Installing this plugin into your Snakemake base environment using conda will also install the ‘jobstep’ plugin, which is used on cluster nodes. Additionally, we recommend installing the snakemake-storage-plugin-fs, which will automate transferring data from the main file system to SLURM execution nodes and back (stage-in and stage-out).

Contributions

We welcome bug reports and feature requests! Please report issues specific to this plugin in the plugin’s GitHub repository. For other concerns, refer to the Snakemake main repository or the relevant Snakemake plugin repository. Cluster-related issues should be directed to FAS Research Computing or FAS Informatics.

Partition selection

On a computing cluster, a partition designates a subset of compute nodes grouped for specific purposes, such as high-memory or GPU tasks.

The Cannon plugin uses the provided resources (see below) to best place a job on a partition on the cluster. Briefly, the plugin first checks if any GPUs are required and, if so, assigns the job to the gpu partition. Next, if the job requires a lot of memory, it will be assigned to one of the bigmem partitions. If the job requires many CPUs, it will be assigned to intermediate or sapphire depending on memory and runtime requirements. If the job doesn’t exceed either the memory or CPU threshold, it will be put on the shared partition.

If a partition for a particular rule is provided in the rule, the command line, or in the profile, that partition will be used regardless.

After partition selection, the plugin does some checks to ensure the selected partition has the resources requested and will inform the user if not.
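The decision order described above can be sketched as a small function. This is an illustration only, not the plugin's code: the function name, the default argument values, and all three thresholds are hypothetical placeholders, the choice between intermediate and sapphire is simplified to runtime alone, and the choice among the several bigmem partitions is not modelled.

```python
from typing import Optional

# Hypothetical thresholds for illustration; NOT the plugin's real limits.
HIGH_MEM_MB = 184_000
MANY_CPUS = 48
LONG_RUNTIME_MIN = 3 * 24 * 60  # three days, in minutes


def choose_partition(
    mem_mb: int = 4000,
    cpus: int = 1,
    runtime_min: int = 30,
    gpus: int = 0,
    explicit: Optional[str] = None,
) -> str:
    """Return a partition name following the order described in the text."""
    if explicit:
        # A partition given in the rule, CLI, or profile always wins.
        return explicit
    if gpus > 0:
        return "gpu"
    if mem_mb > HIGH_MEM_MB:
        return "bigmem"
    if cpus > MANY_CPUS:
        # Simplification: only runtime separates the two partitions here.
        return "intermediate" if runtime_min > LONG_RUNTIME_MIN else "sapphire"
    return "shared"
```

The ordering matters: a job that requests both GPUs and a large amount of memory lands on the gpu partition in this sketch, because the GPU check comes first.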

Specifying Account

In SLURM, an account is used for resource accounting and allocation.

This resource is typically omitted from Snakemake workflows to maintain platform independence, allowing the same workflow to run on different systems without modification.

To specify it at the command line, define it as default resources:

$ snakemake --executor cannon --default-resources slurm_account=<your SLURM account>

A fuller invocation, with the remaining settings read from a workflow profile, might look like this:

$ snakemake --executor cannon \
> -j unlimited \
> --workflow-profile <profile directory with a `config.yaml`> \
> --configfile config/config.yaml \
> --directory <path>

The plugin does its best to guess your account, but this is not always possible. In particular, users with several SLURM accounts should set the account per workflow. Some clusters, however, have a pre-defined default per user and do not allow users to set their account or partition. The plugin will always attempt to set an account. To override this behavior, use the --cannon-no-account flag.

If individual rules require e.g. a different partition, you can override the default per rule:

$ snakemake --executor cannon --default-resources slurm_account=<your SLURM account> slurm_partition=<your SLURM partition> --set-resources <somerule>:slurm_partition=<some other partition>

To ensure consistency and ease of management, it’s advisable to persist such settings via a configuration profile, which can be provided system-wide, per user, or per workflow.

By default, the executor waits 40 seconds before performing the first job status check. This interval can be adjusted using the --cannon-init-seconds-before-status-checks=<time in seconds> option, which may be useful when developing workflows on an HPC cluster to minimize turn-around times.

Configuring SMP Jobs in Snakemake with the Cannon Executor Plugin

In Snakemake workflows, many jobs are executed by programs that are either single-core scripts or multithreaded applications, which are categorized as SMP (Shared Memory Processing) jobs. To allocate resources for such jobs using this executor plugin, you can specify the required number of CPU cores with threads and the memory within the resources section of a rule. Here’s how you can define a rule that requests 8 CPU cores and 14 GB of memory:

rule a:
    input: ...
    output: ...
    threads: 8
    resources:
        mem_mb=14000

Snakemake also accepts the cpus_per_task resource, named after the corresponding SLURM option, as an alternative to threads. Parameters in the resources section will take precedence.

Default resource values for the Cannon plugin

The following resource flags (and default values) can be set in rules and affect partition selection. Memory can be specified in several ways.

Note that only one of mem, mem_gb, and mem_mb should be set. If multiple are set, only one will be used with the order of precedence being mem > mem_gb > mem_mb.
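The precedence rule can be expressed as a short lookup. This is a sketch under stated assumptions, not the plugin's parser: the function names are hypothetical, only the suffixes M, G, and T are handled, and 1 GB is taken as 1000 MB, in line with the 14 GB = mem_mb=14000 example earlier in this document.

```python
import re
from typing import Optional

# Assumed conversion factors (decimal units), see text above.
_MB_PER_UNIT = {"M": 1, "G": 1000, "T": 1_000_000}


def parse_mem(value: str) -> int:
    """Convert a string such as '5G' or '500MB' to megabytes."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([MGT])B?\s*", value, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _MB_PER_UNIT[unit.upper()])


def resolve_mem_mb(resources: dict) -> Optional[int]:
    """Apply the documented precedence: mem > mem_gb > mem_mb."""
    if "mem" in resources:
        return parse_mem(str(resources["mem"]))
    if "mem_gb" in resources:
        return int(float(resources["mem_gb"]) * 1000)
    if "mem_mb" in resources:
        return int(resources["mem_mb"])
    return None  # nothing set; the caller falls back to its default
```

For example, a rule that sets both mem: 5G and mem_mb: 100 resolves to the mem value in this sketch, and the mem_mb entry is silently ignored, which is why setting only one of the three is the safer habit.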

If you want to specify usage of GPUs in resources, you will have to use the slurm_extra tag; see the Setting GPUs section below for examples.

See the official SLURM plugin docs for information about other resource specifications available.

Workflow profiles

To avoid hard-coding resource parameters into your Snakefiles, it is advisable to create a cluster-specific workflow profile. This profile should be named config.yaml and placed in a directory named profiles relative to your workflow directory. You can then indicate this profile to Snakemake using the --workflow-profile profiles option. Here’s an example of how the config.yaml file might look:

default-resources: # Set these if you wish to override the defaults set in the Cannon plugin
    slurm_account: "<account>"
    slurm_partition: "<default partition>"
    mem_mb_per_cpu: 1800 # take a sensible default for your cluster
    runtime: "30m"

# here only rules, which require different (more) resources:
set-resources:
    rule_a:
        runtime: "2h"

    rule_b:
        mem_mb_per_cpu: 3600
        runtime: "5h"

# parallelization with threads needs to be defined separately:
set-threads:
    rule_b: 64

In this configuration:

default-resources sets the default SLURM account, partition, memory per CPU, and runtime for all jobs. These only need to be set if you want to change the ones set in the Cannon plugin (see above).
set-resources allows you to override these defaults for specific rules, such as rule_a and rule_b.
set-threads specifies the number of threads for particular rules, enabling fine-grained control over parallelization.

One may also set a specific partition for a specific rule by using the slurm_partition: parameter under a rule. This will override the Cannon plugin’s automatic partition selection.

By utilizing a configuration profile, you can maintain a clean and platform-independent workflow definition while tailoring resource specifications to the requirements of your SLURM cluster environment.
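To see how the three profile sections combine for a given rule, the override semantics can be modelled with plain dictionaries. This is a conceptual sketch, not how Snakemake resolves profiles internally: the function name is hypothetical, the profile is written inline rather than loaded from config.yaml, and a rule without a set-threads entry is assumed to run with 1 thread.

```python
# Inline stand-in for the example config.yaml (account and partition omitted).
profile = {
    "default-resources": {"mem_mb_per_cpu": 1800, "runtime": "30m"},
    "set-resources": {
        "rule_a": {"runtime": "2h"},
        "rule_b": {"mem_mb_per_cpu": 3600, "runtime": "5h"},
    },
    "set-threads": {"rule_b": 64},
}


def effective_settings(profile: dict, rule: str) -> dict:
    """Start from the defaults, then apply the rule-specific overrides."""
    merged = dict(profile.get("default-resources", {}))
    merged.update(profile.get("set-resources", {}).get(rule, {}))
    # Assumption: rules without a set-threads entry use a single thread.
    merged["threads"] = profile.get("set-threads", {}).get(rule, 1)
    return merged


if __name__ == "__main__":
    for name in ("rule_a", "rule_b", "rule_c"):
        print(name, effective_settings(profile, name))
```

In this model rule_a keeps the default memory per CPU but gets a longer runtime, while a rule that is not mentioned at all simply inherits every default.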

Cannon plugin profile example

Because profiles may contain multiple files, the profile argument is passed a directory path. However, for resource specification, the file you need to create is config.yaml, in which you can specify the resources for the rules of your pipeline, e.g. for a workflow with rules named a and b:

executor: cannon

set-resources:
  a:
    slurm_partition: sapphire
    mem: 5G
    cpus_per_task: 1
    runtime: 30m

  b:
    mem: 10G
    cpus_per_task: 4
    runtime: 2h
    gres: "'gpu:2'"

Note that the slurm_partition: specification can be blank or omitted, as in rule b, since this plugin will select the partition for you based on the other resources provided. However, if slurm_partition: is provided with a value, as in rule a, that partition will be used.

Any resource fields implemented in Snakemake are available to be used in the profile and with this plugin, but only memory (mem:, mem_mb:, or mem_gb:), cpus_per_task:, runtime:, and GPUs via slurm_extra: will affect partition selection. If fields are left blank, the plugin has default values to fall back on.

The resource flags and default values are used in profiles as described above.

Setting GPUs

Use the gres: field to supply the number of GPUs. See b above for an example requesting 2 GPUs.
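For readers unfamiliar with the value format, the sketch below shows how a GPU count can be read out of a SLURM-style gres string such as gpu:2 or gpu:a100:4. It is purely illustrative and is not taken from the plugin: the function name is hypothetical, and stripping the extra inner quotes seen in the profile example above is an assumption about how such a value would need to be cleaned.

```python
def gpus_from_gres(gres: str) -> int:
    """Count GPUs in a gres string of the form name[:type][:count]."""
    # Drop surrounding whitespace and any stray quote characters.
    cleaned = gres.strip().strip("'\"")
    total = 0
    for entry in cleaned.split(","):
        parts = entry.strip().split(":")
        if parts[0] != "gpu":
            continue  # ignore other generic resources
        # A trailing number is the count; without one, a single GPU is implied.
        if len(parts) > 1 and parts[-1].isdigit():
            total += int(parts[-1])
        else:
            total += 1
    return total
```

With the value from rule b in the profile example, this sketch yields a count of 2, matching the two GPUs that rule requests.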

Knowing which rules are in a workflow

If you’re working with a workflow developed by someone else, you will need to get a general sense of which rules exist to specify resources for them in your profile.

The absolute quickest way to see the names of the rules in a workflow is to use the --list option:

snakemake -s <path/to/snakefile.smk> --list

This will simply print out the names of the rules in the workflow, which are hopefully descriptive enough to give you a sense for what resources they will need.

For a little more information, you can use --dryrun:

snakemake -s <path/to/snakefile.smk> --dryrun

This will run through the workflow and report exactly what jobs will be submitted without actually submitting them.

Once you know the rules in your workflow, you can set up their resources in your profile.

Example profile

As a template, you can use the tests/cannon-test-profile/config.yaml, which will need to be modified with the necessary changes for the workflow that you want to run.

Specifying the executor in the profile

Note the first line of the profile:

executor: cannon

This tells Snakemake which plugin to use to execute job submission. Alternatively, if this line is excluded from the profile, one could specify the plugin directly from the command line:

snakemake -e cannon ...

Either method is acceptable.

End

Recall that this information is specific to the Cannon plugin for the Cannon cluster at Harvard University. For full documentation of the general SLURM plugin see the official documentation.