snakemake-workflows/cellranger-count
A Snakemake workflow for counting single cell RNAseq (scRNA-seq) data with Cell Ranger (Cell Ranger licensing requires a manual download of the software).
Overview
Latest release: v1.0.0, Last update: 2025-07-11
Linting: passed, Formatting: passed
Deployment
Step 1: Install Snakemake and Snakedeploy
Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details regarding Mamba can be found in the Mamba documentation.
When using Mamba, run
mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy
to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via
conda activate snakemake
Step 2: Deploy workflow
With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:
mkdir -p path/to/project-workdir
cd path/to/project-workdir
In all following steps, we will assume that you are inside of that directory. Then run
snakedeploy deploy-workflow https://github.com/snakemake-workflows/cellranger-count . --tag v1.0.0
Snakedeploy will create two folders, `workflow` and `config`. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains configuration files, which will be modified in the next step to configure the workflow to your needs.
Step 3: Configure workflow
To configure the workflow, adapt `config/config.yaml` to your needs following the instructions below.
Step 4: Run workflow
The deployment method is controlled using the `--software-deployment-method` (short `--sdm`) argument.
To run the workflow using `apptainer`/`singularity`, use
snakemake --cores all --sdm apptainer
To run the workflow using a combination of `conda` and `apptainer`/`singularity` for software deployment, use
snakemake --cores all --sdm conda apptainer
To run the workflow with automatic deployment of all required software via `conda`/`mamba`, use
snakemake --cores all --sdm conda
Snakemake will automatically detect the main `Snakefile` in the `workflow` subfolder and execute the workflow module that has been defined by the deployment in step 2.
For further options such as cluster and cloud execution, see the Snakemake documentation.
Step 5: Generate report
After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using
snakemake --report report.zip
Configuration
The following section is imported from the workflow's `config/README.md`.
Workflow overview
This workflow is a best-practice workflow for systematically running `cellranger count` on one or more samples.
The workflow is built using Snakemake and consists of the following steps:
1. Link the input files to new file names that follow the cellranger naming requirements.
2. Create a per-sample cellranger library CSV sheet.
3. Run `cellranger count`, parallelizing over samples.
4. Create a Snakemake report with the Web Summaries.
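The first two steps can be illustrated with a minimal shell sketch. It is not the workflow's actual implementation: the helper `cellranger_fastq_name`, the scratch directory, and the fixed `S1` sample index are assumptions made for illustration. The target name follows the bcl2fastq-style pattern `<sample>_S1_L<lane>_R<read>_001.fastq.gz` that Cell Ranger expects, and the CSV uses the `fastqs,sample,library_type` header of a Cell Ranger libraries file.

```shell
# Hypothetical helper: build a Cell Ranger style FASTQ file name.
# Pattern: <sample>_S1_L<lane, zero-padded to 3 digits>_R<read>_001.fastq.gz
cellranger_fastq_name() {
  sample_name="$1"
  lane_number="$2"
  read_number="$3"
  printf '%s_S1_L%03d_R%s_001.fastq.gz\n' "$sample_name" "$lane_number" "$read_number"
}

# Scratch directory with an empty stand-in for one input FASTQ file
workdir="$(mktemp -d)"
touch "$workdir/sample1.bwa.L001.read1.fastq.gz"
mkdir -p "$workdir/fastqs"

# Step 1: link the input file under a name that Cell Ranger accepts
target="$(cellranger_fastq_name sample1 1 1)"
ln -s "$workdir/sample1.bwa.L001.read1.fastq.gz" "$workdir/fastqs/$target"

# Step 2: write a per-sample libraries CSV that points at the linked files
printf 'fastqs,sample,library_type\n%s,%s,%s\n' \
  "$workdir/fastqs" "sample1" "Gene Expression" > "$workdir/sample1.libraries.csv"

cat "$workdir/sample1.libraries.csv"
```

Symbolic links keep the original files untouched, so the renaming costs no extra disk space.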
Running the workflow
cellranger download
As a prerequisite for running the workflow, you need to download the `*.tar.gz` file with the Cell Ranger executable from the Cell Ranger Download Center:
https://www.10xgenomics.com/support/software/cell-ranger/downloads
Afterwards, set the environment variable `CELLRANGER_TARBALL` to the full path of this downloaded file, for example:
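A minimal sketch of such a line, assuming a hypothetical download location and a placeholder version in the file name (adapt both to your actual download):

```shell
# Hypothetical path and placeholder version; replace with the location of your tarball
export CELLRANGER_TARBALL="$HOME/downloads/cellranger-x.y.z.tar.gz"
```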
To set this environment variable permanently for your user on the respective system, add the (adapted) line from above to your `~/.bashrc` file and make sure this file is always loaded.
With this environment variable set, the workflow will automatically install `cellranger` into a conda environment that is then used for all cellranger steps.
Input data
The sample sheet has the following layout:
sample | lane_number | library_type | read1 | read2 |
---|---|---|---|---|
sample1 | 1 | Gene Expression | sample1.bwa.L001.read1.fastq.gz | sample1.bwa.L001.read2.fastq.gz |
sample1 | 2 | Gene Expression | sample1.bwa.L002.read1.fastq.gz | sample1.bwa.L002.read2.fastq.gz |
sample2 | 1 | Gene Expression | sample2.bwa.read1.fastq.gz | sample2.bwa.read2.fastq.gz |
The `lane_number` column is optional and only necessary if any sample is sequenced across multiple lanes.
All other columns are required.
`read1` and `read2` require paths relative to the main workflow directory (where you run the `snakemake` command).
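The required columns can be checked before starting a run. The following is a minimal sketch, assuming the sample sheet is tab-separated (suggested by the default file name `config/samples.tsv`, but not stated explicitly). It writes part of the example sheet to a scratch file, so the variable `sheet` is a stand-in for your own sample sheet path.

```shell
# Scratch copy of the example sheet; the tab-separated layout is an assumption
sheet="$(mktemp)"
{
  printf 'sample\tlane_number\tlibrary_type\tread1\tread2\n'
  printf 'sample1\t1\tGene Expression\tsample1.bwa.L001.read1.fastq.gz\tsample1.bwa.L001.read2.fastq.gz\n'
  printf 'sample1\t2\tGene Expression\tsample1.bwa.L002.read1.fastq.gz\tsample1.bwa.L002.read2.fastq.gz\n'
} > "$sheet"

# Collect required column names that are absent from the header line
missing=""
for col in sample library_type read1 read2; do
  head -n 1 "$sheet" | tr '\t' '\n' | grep -qx "$col" || missing="$missing $col"
done

if [ -z "$missing" ]; then
  echo "all required columns present"
else
  echo "missing columns:$missing"
fi
```

The optional `lane_number` column is deliberately left out of the check.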
Parameters
The table below lists the most important configuration parameters that can be set in the `config/config.yaml` file.
The reference data (`ref_data`) needs to be downloaded manually from the Cell Ranger Download Center:
https://www.10xgenomics.com/support/software/cell-ranger/downloads
After download, extract the tar file into a directory and provide the directory's path under `ref_data:`.
You can also check whether your local compute cluster already provides the reference data.
In that case, simply point the `ref_data:` configuration variable to the respective path.
parameter | type | details | default |
---|---|---|---|
sample_sheet ||||
path | str | path to sample sheet, mandatory | "config/samples.tsv" |
ref_data ||||
path | str | path to downloaded reference data, mandatory | |
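Based on the nesting shown in the table, a configuration might look like the following sketch. The key layout is inferred from the table rather than confirmed against the workflow's schema, and the reference data path is a hypothetical placeholder.

```yaml
# Sketch of config/config.yaml; layout inferred from the parameter table
sample_sheet:
  path: "config/samples.tsv"  # default from the table
ref_data:
  path: "path/to/extracted-reference-data"  # hypothetical placeholder, no default exists
```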
Linting and formatting
Linting results
All tests passed!
Formatting results
All tests passed!