snakemake-workflows/cellranger-count

A Snakemake workflow for counting single-cell RNA-seq (scRNA-seq) data with Cell Ranger (Cell Ranger licensing requires a manual download of the software).

Overview

Latest release: v1.0.0, Last update: 2025-07-11

Linting: passed, Formatting: passed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details can be found in the Mamba documentation.

When using Mamba, run

mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via

conda activate snakemake

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside of that directory. Then run

snakedeploy deploy-workflow https://github.com/snakemake-workflows/cellranger-count . --tag v1.0.0

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs following the instructions below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow using apptainer/singularity, use

snakemake --cores all --sdm apptainer

To run the workflow using a combination of conda and apptainer/singularity for software deployment, use

snakemake --cores all --sdm conda apptainer

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the docs.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

Workflow overview

This workflow is a best-practice workflow for systematically running cellranger count on one or more samples. The workflow is built using snakemake and consists of the following steps:

Link the input files to new file names that follow Cell Ranger's naming requirements.
Create a per-sample cellranger library CSV sheet.
Run cellranger count, parallelizing over samples.
Create a snakemake report with the Web Summaries.
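
The first two steps can be sketched in Python as follows. This is a minimal illustration and not the workflow's actual code: the function names are hypothetical, and fixing the sample index to S1 is an assumption. The sketch relies on the bcl2fastq-style file naming pattern that Cell Ranger expects for FASTQ files, and on the fastqs,sample,library_type header of the CSV that cellranger count accepts via --libraries.

```python
import csv
import io


def cellranger_fastq_name(sample: str, lane: int, read: int) -> str:
    """Build a bcl2fastq-style FASTQ file name for one sample, lane and read.

    Assumption: the sample index is always S1 (hypothetical simplification).
    """
    return f"{sample}_S1_L{lane:03d}_R{read}_001.fastq.gz"


def library_csv(fastq_dir: str, sample: str, library_type: str) -> str:
    """Render a one-row library CSV for a single sample.

    fastq_dir should be the full path of the directory holding the linked FASTQ files.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["fastqs", "sample", "library_type"])
    writer.writerow([fastq_dir, sample, library_type])
    return buffer.getvalue()


if __name__ == "__main__":
    print(cellranger_fastq_name("sample1", lane=2, read=1))
    print(library_csv("/path/to/linked/fastqs", "sample1", "Gene Expression"))
```

Linking instead of copying keeps the original input files untouched while still presenting Cell Ranger with the file names it requires.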

Running the workflow

cellranger download

As a prerequisite for running the workflow, you need to download the *.tar.gz file with the Cell Ranger executable from the Cell Ranger Download Center: https://www.10xgenomics.com/support/software/cell-ranger/downloads

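Afterwards, set the environment variable CELLRANGER_TARBALL to the full path of this tarball, for example (with the placeholder path adapted to your download):

export CELLRANGER_TARBALL="/absolute/path/to/cellranger-x.y.z.tar.gz"
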
To make this a permanently set environment variable for your user on the respective system, add the (adapted) line from above to your ~/.bashrc file and make sure this file is always loaded.

With this environment variable set, the workflow will automatically install cellranger into a conda environment that is then used for all cellranger steps.
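
Before starting a long run, it can help to confirm that the variable is set correctly. The following is a minimal pre-flight sketch; the helper name check_cellranger_tarball is hypothetical and not part of the workflow, and it only checks that the variable holds an absolute path to an existing *.tar.gz file.

```python
import os
from pathlib import Path


def check_cellranger_tarball(env=os.environ) -> Path:
    """Return the tarball path from CELLRANGER_TARBALL or raise RuntimeError."""
    value = env.get("CELLRANGER_TARBALL")
    if not value:
        raise RuntimeError("CELLRANGER_TARBALL is not set")
    path = Path(value)
    if not path.is_absolute():
        raise RuntimeError(f"CELLRANGER_TARBALL must be a full path, got: {value}")
    if not path.name.endswith(".tar.gz"):
        raise RuntimeError(f"expected a *.tar.gz file, got: {path.name}")
    if not path.is_file():
        raise RuntimeError(f"file does not exist: {path}")
    return path


if __name__ == "__main__":
    try:
        print(f"Using Cell Ranger tarball: {check_cellranger_tarball()}")
    except RuntimeError as error:
        print(f"Problem with CELLRANGER_TARBALL: {error}")
```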

Input data

The sample sheet has the following layout:

sample	lane_number	library_type	read1	read2
sample1	1	Gene Expression	sample1.bwa.L001.read1.fastq.gz	sample1.bwa.L001.read2.fastq.gz
sample1	2	Gene Expression	sample1.bwa.L002.read1.fastq.gz	sample1.bwa.L002.read2.fastq.gz
sample2	1	Gene Expression	sample2.bwa.read1.fastq.gz	sample2.bwa.read2.fastq.gz

The lane_number column is optional and only necessary if any sample is sequenced across multiple lanes. All other columns are required. read1 and read2 require paths relative to the main workflow directory (where you run the snakemake command).
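
The column rules above can be checked with a short script before running the workflow. This is a minimal sketch and not the workflow's own validation: the function name read_sample_sheet is hypothetical, and filling a missing lane_number with lane 1 is an assumption made for illustration.

```python
import csv
import io

# Columns that must be present; lane_number is optional.
REQUIRED_COLUMNS = {"sample", "library_type", "read1", "read2"}


def read_sample_sheet(text: str) -> list[dict[str, str]]:
    """Parse a tab-separated sample sheet and check its required columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"sample sheet lacks required columns: {sorted(missing)}")
    rows = []
    for row in reader:
        # Assumption: a sample without a lane number was sequenced on lane 1.
        row["lane_number"] = row.get("lane_number") or "1"
        rows.append(row)
    return rows


if __name__ == "__main__":
    sheet = (
        "sample\tlibrary_type\tread1\tread2\n"
        "sample2\tGene Expression\tsample2.bwa.read1.fastq.gz\tsample2.bwa.read2.fastq.gz\n"
    )
    for entry in read_sample_sheet(sheet):
        print(entry["sample"], entry["lane_number"], entry["library_type"])
```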

Parameters

This table lists the most important configuration parameters that can be set in the config/config.yaml file.

The ref_data needs to be downloaded manually from the Cell Ranger Download Center (https://www.10xgenomics.com/support/software/cell-ranger/downloads). After download, extract the tar file into a directory and provide the directory’s path under ref_data:.

You can also check whether your local compute cluster already provides the reference data. In that case, you can just point the ref_data: configuration variable to the respective path.

parameter	type	details	default
sample_sheet
path	str	path to sample sheet, mandatory	“config/samples.tsv”
ref_data
path	str	path to downloaded reference data, mandatory

Linting and formatting

Linting results

All tests passed!

Formatting results

All tests passed!