Snakemake executor plugin: htcondor


Warning

This plugin is not maintained or reviewed by the official Snakemake organization.

The HTCondor Software Suite (HTCSS) is a software system that creates a High-Throughput Computing (HTC) environment. This environment might be a single cluster, a set of related clusters on a campus, cloud resources, or national or international federations of computers.

Installation

Install this plugin with pip or mamba, e.g.:

pip install snakemake-executor-plugin-htcondor
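If you use mamba instead, the plugin should be installable from the bioconda channel (an assumption based on how Snakemake executor plugins are typically packaged; verify the channel before relying on it):

mamba install -c conda-forge -c bioconda snakemake-executor-plugin-htcondor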

Usage

To use the plugin, run Snakemake (>=8.0) in the folder where your workflow code and config reside (containing either workflow/Snakefile or Snakefile), with the corresponding value for the executor flag:

snakemake --executor htcondor --default-resources --jobs N ...

with N being the number of jobs you want to run in parallel and ... being any additional arguments you want to use (see below). The machine on which you run Snakemake must have the executor plugin installed and, depending on the plugin, access to its target service (e.g. an HPC middleware like slurm with the sbatch command, or internet access to submit jobs to some cloud provider, e.g. azure).

The flag --default-resources ensures that Snakemake auto-calculates the mem and disk resources for each job, based on the input file size. The values assumed there are conservative and should usually suffice. However, you can always override those defaults by specifying the resources in your Snakemake rules or via the --set-resources flag.
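For example, a sketch of both override styles (the rule name myrule, the file names, and all values are placeholders):

rule myrule:
    input: "data.txt"
    output: "result.txt"
    resources:
        mem_mb=8000,
        disk_mb=16000
    shell: "process {input} > {output}"

or, equivalently, on the command line:

snakemake --executor htcondor --default-resources --jobs 100 --set-resources myrule:mem_mb=8000 myrule:disk_mb=16000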

Depending on the executor plugin, you might either rely on a shared local filesystem or use a remote filesystem or storage. For the latter, you have to additionally use a suitable storage plugin (see the section on storage plugins in the sidebar of this catalog) and possibly check for further recommendations in the sections below.

All arguments can also be persisted via a profile, so that they don’t have to be specified on each invocation. For this plugin, that would mean the following entries inside the profile:

executor: htcondor
default_resources: []

To specify default resources other than the built-in ones, see the docs.
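For illustration, a fuller profile might look like this (a sketch: the jobs count and jobdir path are placeholders, and the htcondor_jobdir key is assumed to mirror the --htcondor-jobdir CLI flag):

executor: htcondor
default_resources: []
jobs: 100
htcondor_jobdir: /scratch/user/htcondor-logs

You can then invoke the workflow with snakemake --profile followed by the profile name.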

Settings

The executor plugin has the following settings (which can be passed via the command line, the workflow, or environment variables, where indicated in the respective columns):

| CLI argument | Description | Default | Choices | Required | Type |
| ------------------------- | -------------------------------------------------------------------------------------- | ---------------------- | ------- | -------- | ---- |
| --htcondor-jobdir VALUE | Directory where the job will create a directory to store log, output and error files. | '.snakemake/htcondor' | | | |
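For example, to collect the job logs on a scratch filesystem instead of the default location (the path here is a placeholder):

snakemake --executor htcondor --default-resources --jobs 50 --htcondor-jobdir /scratch/$USER/htcondor-logs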

Further details

  • It is recommended to use the dedicated snakemake profile for HTCondor, which you can find here.

  • This plugin currently only supports job submission with a shared file system.

  • As the executable, the jobs use the Python binary of the environment that the user had when starting Snakemake.

  • Error messages, stdout output, and log files are written to the htcondor-jobdir (see the settings above).

  • The job directive threads is used to set the request_cpus command for HTCondor (see the sketch after this list).

  • By default, the jobs are executed with the same set of environment variables that the user had at submit time. If you don’t want this behavior, set getenv: False in the rule’s resources (see the sketch after this list).

  • For the job status, this plugin reports the values of the job ClassAd Attribute JobStatus.

  • To determine whether a job was successful, this plugin relies on htcondor.Schedd.history (see API reference) and checks the values of the job ClassAd Attribute ExitCode.
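To make the two resource-related points above concrete, here is a minimal sketch (rule name, file names, and the shell command are hypothetical): threads becomes request_cpus in the submit description, and getenv=False turns off forwarding of the submit-time environment:

rule crunch:
    input: "in.csv"
    output: "out.csv"
    threads: 4
    resources:
        getenv=False
    shell: "crunch --cpus {threads} {input} > {output}"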

The following submit description file commands are supported (add them as user-defined resources):

| Basic | Matchmaking | Matchmaking (GPU) | Policy |
| ----------------- | ---------------- | ------------------------- | -------------------------- |
| getenv | rank | request_gpus | max_retries |
| environment | request_disk | require_gpus | allowed_execute_duration |
| input | request_memory | gpus_minimum_capability | allowed_job_duration |
| max_materialize | requirements | gpus_minimum_memory | retry_until |
| max_idle | | gpus_minimum_runtime | |
| | | cuda_version | |
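As an illustration, a sketch of a GPU rule passing several of these commands through as user-defined resources (all file names, values, and the shell command are hypothetical placeholders):

rule train:
    input: "model.cfg"
    output: "weights.bin"
    threads: 8
    resources:
        request_memory="16GB",
        request_disk="50GB",
        request_gpus=1,
        gpus_minimum_memory=8000,
        max_retries=3
    shell: "train --config {input} --out {output}"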