milesroberts-123/slim-sweep-cnn

A workflow to simulate selective sweeps in SLiM and turn the results into images

Overview

Topics: evolutionary-biology machine-learning population-genetics

Latest release: None, Last update: 2025-06-16

Linting: failed, Formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details regarding Mamba can be found in the Mamba documentation.

When using Mamba, run

mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands, ensure that this environment is activated via

conda activate snakemake

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside that directory. Then run

snakedeploy deploy-workflow https://github.com/milesroberts-123/slim-sweep-cnn . --branch main

(This workflow has no tagged release, so a branch, assumed here to be main, is given instead of a tag.)

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs following the instructions below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow using a combination of conda and apptainer/singularity for software deployment, use

snakemake --cores all --sdm conda apptainer

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the Snakemake documentation.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive, visual HTML report for inspection of results, together with parameters and code, in the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

1. Configure workflow with config/config.yaml

The parameters that are held constant across all simulations in a workflow are in config/config.yaml. These are:

| Parameter | Description | Default |
| --- | --- | --- |
| K | Number of simulations to run | 5000 |
| nidv | Number of individual genomes to sample from each simulation | 128 |
| nloc | Number of loci to sample from each simulation | 128 |
| distMethod | Method for measuring genetic distance between loci | "manhattan" |
| clustMethod | Method used to cluster genomes based on genetic distance | "complete" |
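Putting the defaults together, a config/config.yaml for this workflow might look like the following (a sketch assembled from the table above; check the deployed config/config.yaml for the authoritative key names):

```yaml
K: 5000                  # number of simulations to run
nidv: 128                # individual genomes sampled per simulation
nloc: 128                # loci sampled per simulation
distMethod: "manhattan"  # genetic distance measure between loci
clustMethod: "complete"  # clustering method for genomes
```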

2. Generate a table of simulation parameters in config/parameters.tsv

The parameters that vary across simulations are within config/parameters.tsv. Each row of this file represents a different simulation and each simulation gets a unique number as an ID.

There's a simple example R script, resources/s00_make_param_table.R, that generates a parameters table, but you don't need to use it. However you choose to generate the table, it needs to have the following columns:

| Parameter | Description |
| --- | --- |
| ID | Number from 1:K, used as a unique ID for each simulation |
| Q | Scaling factor |
| N | Ancestral population size, used for burn-in |
| sweepS | Selection coefficient for the sweep mutation |
| h | Dominance coefficient of the sweep mutation |
| sigma | Selfing rate |
| mu | Mutation rate |
| R | Recombination rate |
| tau | Time when the population is sampled (cycles post-burn-in when the simulation ends) |
| kappa | Time when the sweep is introduced (the simulation restarts here if the sweep fails) |
| f0 | Threshold frequency to convert the sweep from neutral -> beneficial (for soft sweeps) |
| f1 | Threshold frequency to convert the sweep from beneficial -> neutral (for partial sweeps) |
| n | Number of sweep mutations to introduce (recurrent mutation) |
| lambda | Average waiting time between sweep mutations (Poisson distribution) |
| ncf | Proportion of crossover events that are gene conversions |
| cl | Length of gene conversion crossover events |
| fsimple | Fraction of crossover events that are simple |
| B | Proportion of non-sweep mutations that are beneficial |
| U | Proportion of non-sweep mutations that are deleterious |
| M | Proportion of non-sweep mutations that are neutral |
| hU | Dominance coefficient for deleterious non-sweep mutations |
| hB | Dominance coefficient for beneficial non-sweep mutations |
| bBar | Average selection coefficient for beneficial non-sweep mutations |
| uBar | Average selection coefficient for deleterious non-sweep mutations |
| alpha | Shape parameter for the distribution of fitness effects of deleterious non-sweep mutations |
| r | Logistic growth rate |
| K | Logistic carrying capacity |
| custom_demography | Whether to use the custom demography in config/demography.csv or a logistic model |
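As a sketch of what a parameters-table generator can look like, here is a hypothetical Python stand-in for resources/s00_make_param_table.R. It covers only a subset of the columns above, and all draw ranges and constants are illustrative assumptions, not values taken from the workflow:

```python
import csv
import random

def make_param_table(path, K=10, seed=1):
    """Write a tab-separated parameters table with one row per simulation.

    Hypothetical Python stand-in for resources/s00_make_param_table.R;
    column names follow the table above, but the draw ranges and fixed
    values are illustrative only.
    """
    rng = random.Random(seed)
    cols = ["ID", "N", "sweepS", "h", "sigma", "mu", "R",
            "tau", "kappa", "f0", "f1", "n", "custom_demography"]
    with open(path, "w", newline="") as fh:
        w = csv.DictWriter(fh, fieldnames=cols, delimiter="\t")
        w.writeheader()
        for i in range(1, K + 1):
            w.writerow({
                "ID": i,                                       # unique simulation ID, 1:K
                "N": 1000,                                     # ancestral population size
                "sweepS": round(rng.uniform(0.001, 0.1), 4),   # sweep selection coefficient
                "h": 0.5,                                      # dominance of sweep mutation
                "sigma": 0.0,                                  # selfing rate
                "mu": 1e-7,                                    # mutation rate
                "R": 1e-7,                                     # recombination rate
                "tau": 100,                                    # sampling time post-burn-in
                "kappa": 0,                                    # sweep introduction time
                "f0": 0.0, "f1": 1.0, "n": 1,                  # hard-sweep settings
                "custom_demography": 0,                        # use the logistic model
            })
```

Any script that emits the full set of columns in TSV form, one row per simulation, will work the same way.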

Depending on your parameter choices, you can simulate many different sweep types. Here is a table summarizing which parameter values produce which sweep types:

| Sweep type | f0 | f1 | n |
| --- | --- | --- | --- |
| hard | 0 | 1 | 1 |
| soft | > 0 | 1 | 1 |
| partial | 0 | < 1 | 1 |
| recurrent | 0 | 1 | > 1 |
| soft + partial | > 0 | < 1 | 1 |
| soft + recurrent | > 0 | 1 | > 1 |
| partial + recurrent | 0 | < 1 | > 1 |
| soft + partial + recurrent | > 0 | < 1 | > 1 |
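The table can be encoded compactly: a hypothetical helper might map a sweep-type name to example (f0, f1, n) values. The defaults 0.05, 0.9, and 5 used to satisfy "> 0", "< 1", and "> 1" are arbitrary illustrative choices, not values from the workflow:

```python
def sweep_params(sweep_type, f0_soft=0.05, f1_partial=0.9, n_recurrent=5):
    """Return example (f0, f1, n) values for a named sweep type.

    Encodes the table above; the non-boundary values are arbitrary
    examples satisfying f0 > 0, f1 < 1, or n > 1 where required.
    """
    soft = "soft" in sweep_type
    partial = "partial" in sweep_type
    recurrent = "recurrent" in sweep_type
    f0 = f0_soft if soft else 0.0        # f0 > 0 makes the sweep soft
    f1 = f1_partial if partial else 1.0  # f1 < 1 makes the sweep partial
    n = n_recurrent if recurrent else 1  # n > 1 makes the sweep recurrent
    return f0, f1, n
```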

If your demography follows a logistic growth rate model, then you can simulate a wide range of demographies:

| Demography | Description | r | K |
| --- | --- | --- | --- |
| constant | Population size does not change | 0 | N |
| growth | Population size increases until K | 0 < r < 2 | N < K |
| decay | Population size decreases until K | 0 < r < 2 | N > K |
| cycle | Population size cycles between two values | 2 < r < sqrt(6) | anything |
| chaotic | Population size changes chaotically* | sqrt(6) < r < 3 | anything |

*Note that for the chaotic demography, because population sizes are discrete, a population that returns to a size it had at a previous simulation tick will simply cycle from that point onward. So in many cases a chaotic demography will just be an arbitrarily long cycle.
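These regimes are the classic behavior of a discrete logistic map. Assuming an update of the form N(t+1) = N(t) + r·N(t)·(1 − N(t)/K), rounded to whole individuals (an illustrative sketch; the workflow's SLiM script defines the exact update rule), you can explore the regimes directly:

```python
def logistic_trajectory(n0, r, K, steps):
    """Iterate a discrete logistic model, rounding to whole individuals.

    Illustrative sketch of the r/K regimes in the table above; the
    workflow's actual update rule may differ in detail.
    """
    sizes = [n0]
    n = n0
    for _ in range(steps):
        n = max(0, round(n + r * n * (1 - n / K)))
        sizes.append(n)
    return sizes

# growth: 0 < r < 2 with N < K approaches the carrying capacity
growth = logistic_trajectory(100, r=0.5, K=1000, steps=200)

# cycle: 2 < r < sqrt(6) settles into oscillation between two sizes
cycle = logistic_trajectory(100, r=2.3, K=1000, steps=200)
```

For r = 0.5 the trajectory climbs to roughly K and stays there; for r = 2.3 it overshoots and settles into a two-point oscillation around K.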

3. (Optional) Specify a custom demographic pattern with config/demography.csv

For each simulation in config/parameters.tsv you need to define a switch called custom_demography. If custom_demography != 1, then SLiM will look for r and K values in config/parameters.tsv and use a logistic growth/death model for the population. If custom_demography == 1, then SLiM will look for config/demography.csv, which specifies a custom demographic pattern. It is a headerless CSV file with two columns:

| Column | Description |
| --- | --- |
| Population size | Column of population sizes |
| Time point | Column of time points, starting with 1 as the generation after burn-in, at which the population size changes |

For example, a file like the following:

1000,10
2000,15
3000,20

means that 10 generations after burn-in the population size will change to 1000 (the burn-in population size is defined by N), at 15 generations post-burn-in it will change to 2000, and at 20 generations post-burn-in it will change to 3000.
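Reading this file back is straightforward. A minimal sketch (the workflow's SLiM script does its own parsing; this just illustrates the format):

```python
import csv

def read_demography(path):
    """Parse the headerless demography CSV into (size, time) pairs.

    Column 1 is the new population size, column 2 the post-burn-in
    generation at which that size takes effect.
    """
    events = []
    with open(path) as fh:
        for size, time in csv.reader(fh):
            events.append((int(size), int(time)))
    return events
```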

Linting and formatting

Linting results

Using workflow specific profile workflow/profiles/default for setting default command line arguments.
usage: snakemake [-h] [--dry-run] [--profile PROFILE]
                 [--workflow-profile WORKFLOW_PROFILE] [--cache [RULE ...]]
                 [--snakefile FILE] [--cores N] [--jobs N] [--local-cores N]
                 [--resources NAME=INT [NAME=INT ...]]
                 [--set-threads RULE=THREADS [RULE=THREADS ...]]
                 [--max-threads MAX_THREADS]
                 [--set-resources RULE:RESOURCE=VALUE [RULE:RESOURCE=VALUE ...]]
                 [--set-scatter NAME=SCATTERITEMS [NAME=SCATTERITEMS ...]]
                 [--set-resource-scopes RESOURCE=[global|local]
                 [RESOURCE=[global|local] ...]]
                 [--default-resources [NAME=INT ...]]
                 [--preemptible-rules [PREEMPTIBLE_RULES ...]]
                 [--preemptible-retries PREEMPTIBLE_RETRIES]
                 [--configfile FILE [FILE ...]] [--config [KEY=VALUE ...]]
                 [--replace-workflow-config] [--envvars VARNAME [VARNAME ...]]
                 [--directory DIR] [--touch] [--keep-going]
                 [--rerun-triggers {code,input,mtime,params,software-env} [{code,input,mtime,params,software-env} ...]]
                 [--force] [--executor {local,dryrun,touch}] [--forceall]
                 [--forcerun [TARGET ...]]
                 [--consider-ancient RULE=INPUTITEMS [RULE=INPUTITEMS ...]]
                 [--prioritize TARGET [TARGET ...]]
                 [--batch RULE=BATCH/BATCHES] [--until TARGET [TARGET ...]]
                 [--omit-from TARGET [TARGET ...]] [--rerun-incomplete]
                 [--shadow-prefix DIR]
                 [--strict-dag-evaluation {cyclic-graph,functions,periodic-wildcards} [{cyclic-graph,functions,periodic-wildcards} ...]]
                 [--scheduler [{ilp,greedy}]]
                 [--scheduler-ilp-solver {COIN_CMD}]
                 [--conda-base-path CONDA_BASE_PATH] [--no-subworkflows]
                 [--precommand PRECOMMAND] [--groups GROUPS [GROUPS ...]]
                 [--group-components GROUP_COMPONENTS [GROUP_COMPONENTS ...]]
                 [--report [FILE]] [--report-after-run]
                 [--report-stylesheet CSSFILE] [--reporter PLUGIN]
                 [--draft-notebook TARGET] [--edit-notebook TARGET]
                 [--notebook-listen IP:PORT] [--lint [{text,json}]]
                 [--generate-unit-tests [TESTPATH]] [--containerize]
                 [--export-cwl FILE] [--list-rules] [--list-target-rules]
                 [--dag [{dot,mermaid-js}]] [--rulegraph [{dot,mermaid-js}]]
                 [--filegraph] [--d3dag] [--summary] [--detailed-summary]
                 [--archive FILE] [--cleanup-metadata FILE [FILE ...]]
                 [--cleanup-shadow] [--skip-script-cleanup] [--unlock]
                 [--list-changes {input,code,params}] [--list-input-changes]
                 [--list-params-changes] [--list-untracked]
                 [--delete-all-output | --delete-temp-output]
                 [--keep-incomplete] [--drop-metadata] [--version]
                 [--printshellcmds] [--debug-dag] [--nocolor]
                 [--quiet [{all,host,progress,reason,rules} ...]]
                 [--print-compilation] [--verbose] [--force-use-threads]
                 [--allow-ambiguity] [--nolock] [--ignore-incomplete]
                 [--max-inventory-time SECONDS] [--trust-io-cache]
                 [--max-checksum-file-size SIZE] [--latency-wait SECONDS]
                 [--wait-for-free-local-storage WAIT_FOR_FREE_LOCAL_STORAGE]
                 [--wait-for-files [FILE ...]] [--wait-for-files-file FILE]
                 [--queue-input-wait-time SECONDS] [--notemp] [--all-temp]
                 [--unneeded-temp-files FILE [FILE ...]]
                 [--keep-storage-local-copies] [--not-retrieve-storage]
                 [--target-files-omit-workdir-adjustment]
                 [--allowed-rules ALLOWED_RULES [ALLOWED_RULES ...]]
                 [--max-jobs-per-timespan MAX_JOBS_PER_TIMESPAN]
                 [--max-jobs-per-second MAX_JOBS_PER_SECOND]
                 [--max-status-checks-per-second MAX_STATUS_CHECKS_PER_SECOND]
                 [--seconds-between-status-checks SECONDS_BETWEEN_STATUS_CHECKS]
                 [--retries RETRIES] [--wrapper-prefix WRAPPER_PREFIX]
                 [--default-storage-provider DEFAULT_STORAGE_PROVIDER]
                 [--default-storage-prefix DEFAULT_STORAGE_PREFIX]
                 [--local-storage-prefix LOCAL_STORAGE_PREFIX]
                 [--remote-job-local-storage-prefix REMOTE_JOB_LOCAL_STORAGE_PREFIX]
                 [--shared-fs-usage {input-output,persistence,software-deployment,source-cache,sources,storage-local-copies,none} [{input-output,persistence,software-deployment,source-cache,sources,storage-local-copies,none} ...]]
                 [--scheduler-greediness SCHEDULER_GREEDINESS]
                 [--scheduler-subsample SCHEDULER_SUBSAMPLE] [--no-hooks]
                 [--debug] [--runtime-profile FILE]
                 [--local-groupid LOCAL_GROUPID] [--attempt ATTEMPT]
                 [--show-failed-logs] [--logger {} [{} ...]]
                 [--job-deploy-sources] [--benchmark-extended]
                 [--container-image IMAGE] [--immediate-submit]
                 [--jobscript SCRIPT] [--jobname NAME] [--flux]
                 [--software-deployment-method {apptainer,conda,env-modules} [{apptainer,conda,env-modules} ...]]
                 [--container-cleanup-images] [--use-conda]
                 [--conda-not-block-search-path-envvars] [--list-conda-envs]
                 [--conda-prefix DIR] [--conda-cleanup-envs]
                 [--conda-cleanup-pkgs [{tarballs,cache}]]
                 [--conda-create-envs-only] [--conda-frontend {conda,mamba}]
                 [--use-apptainer] [--apptainer-prefix DIR]
                 [--apptainer-args ARGS] [--use-envmodules]
                 [--scheduler-solver-path SCHEDULER_SOLVER_PATH]
                 [--deploy-sources QUERY CHECKSUM]
                 [--target-jobs TARGET_JOBS [TARGET_JOBS ...]]
                 [--mode {default,remote,subprocess}]
                 [--report-html-path VALUE]
                 [--report-html-stylesheet-path VALUE]
                 [targets ...]
snakemake: error: argument --executor/-e: invalid choice: 'slurm' (choose from local, dryrun, touch)

Formatting results

[DEBUG] In file "/tmp/tmpa13kciu_/workflow/Snakefile":  Formatted content is different from original
[INFO] 1 file(s) would be changed 😬
[INFO] 4 file(s) would be left unchanged 🎉

snakefmt version: 0.11.0