Snakemake executor plugin: deeporigin

Repository: https://github.com/deeporiginbio/snakemake-executor-plugin-deeporigin
Author: Bilal Shaikh

Warning

This plugin is neither maintained nor reviewed by the official Snakemake organization.

Warning

No documentation found in repository https://github.com/deeporiginbio/snakemake-executor-plugin-deeporigin. The plugin should provide a docs/intro.md with some introductory sentences and optionally a docs/further.md file with details beyond the auto-generated usage instructions presented in this catalog.

Installation

Install this plugin with pip or mamba, e.g.:

pip install snakemake-executor-plugin-deeporigin

Usage

To use the plugin, run Snakemake (>=8.0) in the folder where your workflow code and config reside (the folder containing either workflow/Snakefile or Snakefile), passing the corresponding value to the executor flag:

snakemake --executor deeporigin --default-resources --jobs N ...

Here, N is the number of jobs you want to run in parallel and ... stands for any additional arguments you want to use (see below). The machine on which you run Snakemake must have the executor plugin installed and, depending on the type of executor plugin, must have access to the plugin's target service (e.g. an HPC middleware such as Slurm with its sbatch command, or internet access for submitting jobs to a cloud provider such as Azure).

The flag --default-resources ensures that Snakemake auto-calculates the mem and disk resources for each job, based on the input file size. The values assumed there are conservative and should usually suffice. However, you can always override those defaults by specifying the resources in your Snakemake rules or via the --set-resources flag.
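As a rough illustration of how these flags combine, the following Python sketch assembles the command line described above and optionally adds per-rule overrides through --set-resources. The helper name build_snakemake_command and the rule name align are hypothetical, and the RULE:RESOURCE=VALUE form with the mem_mb resource name is assumed from Snakemake's usual conventions, so check it against your Snakemake version.

```python
import shlex


def build_snakemake_command(jobs, set_resources=None):
    """Assemble a snakemake invocation for the deeporigin executor.

    `set_resources` maps a rule name to a dict of resource overrides,
    e.g. {"align": {"mem_mb": 8000}} (hypothetical rule name).
    """
    if jobs < 1:
        raise ValueError("jobs must be a positive integer")

    cmd = ["snakemake", "--executor", "deeporigin", "--default-resources"]

    # Assumed override syntax: RULE:RESOURCE=VALUE, one entry per override.
    overrides = [
        f"{rule}:{name}={value}"
        for rule, resources in (set_resources or {}).items()
        for name, value in resources.items()
    ]
    if overrides:
        cmd += ["--set-resources", *overrides]

    # The --jobs flag follows the multi-value options and ends their value lists.
    cmd += ["--jobs", str(jobs)]
    return cmd


command = build_snakemake_command(4, {"align": {"mem_mb": 8000}})
print(shlex.join(command))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting issues; shlex.join is used only to display the command.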

Depending on the executor plugin, you might either rely on a shared local filesystem or use a remote filesystem or storage. For the latter, you additionally have to use a suitable storage plugin (see the section on storage plugins in the sidebar of this catalog) and, where applicable, check the sections below for further recommendations.

All arguments can also be persisted via a profile, so that they don't have to be specified on each invocation. Here, this would mean the following entries in the profile:

executor: deeporigin
default_resources: []

To specify default resources other than the built-in ones, see the Snakemake documentation.

Settings

The executor plugin has the following settings (which can be passed via command line, the workflow or environment variables, if provided in the respective columns):

| CLI argument | Description | Default | Choices | Required | Type |
|---|---|---|---|---|---|
| --deeporigin-namespace VALUE | The namespace to use for submitted jobs. | 'default' | | | |
| --deeporigin-cpu-scalar VALUE | K8s reserves some proportion of the available CPUs for its own use. So, where an underlying node may have 8 CPUs, only e.g. 7600 milliCPUs are allocatable to k8s pods (i.e. Snakemake jobs). As 8 > 7.6, k8s can't find a node with enough CPU resources to run a job that requests 8 CPUs. This argument acts as a global scalar on each job's CPU request, so that e.g. a job whose rule definition asks for 8 CPUs will request 7600m CPUs from k8s, allowing it to utilise one entire node. Note: the job itself still sees the original value, i.e. the value substituted for {threads}. | 0.95 | | | |
| --deeporigin-service-account-name VALUE | This argument allows the use of custom service accounts for Kubernetes pods. If specified, serviceAccountName will be added to the pod specs. This is needed, for example, when using workload identity, which is enforced on Google Cloud GKE Autopilot. | None | | | |
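The arithmetic behind --deeporigin-cpu-scalar can be sketched as follows. This is a minimal illustration of the scaling described above, not the plugin's actual code: the function name scaled_cpu_request and the rounding to whole milliCPUs are assumptions.

```python
def scaled_cpu_request(threads, cpu_scalar=0.95):
    """Return a Kubernetes-style CPU request string for a job.

    Assumption: the request is threads * cpu_scalar, expressed in
    whole milliCPUs; the plugin's exact rounding may differ.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if cpu_scalar <= 0:
        raise ValueError("cpu_scalar must be positive")
    millicpus = round(threads * 1000 * cpu_scalar)
    return f"{millicpus}m"


# A rule asking for 8 threads requests 7600m with the default scalar,
# which fits on a node that has 7600 allocatable milliCPUs.
print(scaled_cpu_request(8))
```

Only the request sent to Kubernetes is scaled here; as the setting's description states, the job itself still sees the unscaled value in {threads}.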