seankmartin/atn-sub-lfp-workflow
Working with snakemake for analysis of SUB LFP
Overview
Latest release: 23.02.01, Last update: 2023-03-10
Linting: passed, Formatting: failed
Deployment
Step 1: Install Snakemake and Snakedeploy
Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details regarding Mamba can be found in the Mamba documentation.
When using Mamba, run
mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy
to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via
conda activate snakemake
Step 2: Deploy workflow
With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:
mkdir -p path/to/project-workdir
cd path/to/project-workdir
In all following steps, we will assume that you are inside of that directory. Then run
snakedeploy deploy-workflow https://github.com/seankmartin/atn-sub-lfp-workflow . --tag 23.02.01
Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module; the latter contains configuration files, which will be modified in the next step to configure the workflow to your needs.
Step 3: Configure workflow
To configure the workflow, adapt config/config.yml to your needs following the instructions below.
Step 4: Run workflow
The deployment method is controlled using the --software-deployment-method (short --sdm) argument.
To run the workflow with automatic deployment of all required software via conda/mamba, use
snakemake --cores all --sdm conda
Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
For further options such as cluster and cloud execution, see the Snakemake documentation.
Step 5: Generate report
After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using
snakemake --report report.zip
Configuration
The following section is imported from the workflow’s config/README.md.
Configuration
The main config file for path setup is config.yaml; analysis parameters are set in simuran_params.yml. If you have the raw Axona data, change the data_directory and ca1_directory parameters in config.yaml to the paths containing the downloaded SUB and CA1 data. Otherwise, create a folder called results in the parent directory of this file and place the data downloaded from our open data publication there. The other config files are unlikely to require modification.
Possible Error
If you get an error, try updating workflow/Snakefile to have path=workflow/Snakefile instead of path=workflow\Snakefile.
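The backslash in path=workflow\Snakefile is the Windows path separator, which is presumably why that form can fail on other systems. As a small illustration (not part of the workflow itself), Python's standard pathlib module can convert such a path to the forward-slash form:

```python
from pathlib import PureWindowsPath

# A Windows-style relative path, written as a raw string so the
# backslash is kept literally.
windows_style = r"workflow\Snakefile"

# PureWindowsPath treats backslashes as separators on any operating system;
# as_posix() renders the same path with forward slashes.
portable = PureWindowsPath(windows_style).as_posix()

print(portable)  # workflow/Snakefile
```

Forward slashes are accepted as path separators on Windows as well, so the forward-slash form is the safer choice across platforms.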
Main config files
config.yaml
This file contains the following variables. The first two (data_directory and ca1_directory) will likely need to be modified:
- data_directory: The directory where the SUB data is stored.
- ca1_directory: The directory where the CA1 data is stored.
- simuran_config: The path to the simuran config file (simuran_params.yml).
- openfield_filter: The filter to use for openfield recordings (openfield_recordings.yml).
- tmaze_filter: The filter to use for tmaze recordings (tmaze_recordings.yml).
- overwrite_nwb: Whether to overwrite the NWB files if they already exist (False).
- sleep_only: Whether to only process sleep recordings (False).
- overwrite_sleep: Whether to overwrite the sleep analysis files if they already exist (False).
- except_nwb_errors: Whether to ignore NWB errors (True).
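Since data_directory and ca1_directory are the entries most likely to need editing, it can help to check them before starting a long run. The following is a minimal sketch, assuming the values from config.yaml have already been loaded into a plain dictionary; the helper name missing_directories and the example paths are hypothetical and not part of the workflow.

```python
from pathlib import Path


def missing_directories(config, keys=("data_directory", "ca1_directory")):
    """Return the keys whose configured directory is unset or does not exist."""
    missing = []
    for key in keys:
        value = config.get(key)
        if value is None or not Path(value).is_dir():
            missing.append(key)
    return missing


# Hypothetical values; replace them with the paths to your downloaded data.
example_config = {
    "data_directory": "path/to/sub-data",
    "ca1_directory": "path/to/ca1-data",
    "overwrite_nwb": False,
}

# Prints the list of keys that still need a valid directory.
print(missing_directories(example_config))
```

An empty list means both directories were found; any key that is returned points at a path that still needs to be corrected in config.yaml.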
simuran_params.yml
This file contains the individual parameters for each analysis, such as the frequency band that is considered to be theta (e.g. 6-12 Hz):
- cfg_base_dir: The base directory to use for data referred to by relative paths in the config files.
- do_spectrogram_plot: Whether to plot the spectrogram (False).
- plot_psd: Whether to plot the power spectrum (True).
- image_format: The format to use for images (png).
- loader: The name of the loader to use (neurochat).
- loader_kwargs: The keyword arguments to pass to the loader.
- clean_method: The method to use to clean the LFP signals. By default, it z-score normalises the signals and then, for non-cannulated rats, picks the bipolar electrode signals from these if they do not exceed a standard deviation from the average. For cannulated rats, it proceeds similarly but uses all clean signals, not just those on the bipolar electrodes.
- clean_kwargs: The keyword arguments to pass to the clean method for non-cannulated rats.
- can_clean_kwargs: The keyword arguments to pass to the clean method for cannulated rats.
- z_score_threshold: The z-score threshold to use for the LFP cleaning.
- fmin: The minimum frequency to consider for filtering.
- fmax: The maximum frequency to consider for filtering.
- filter_kwargs: The keyword arguments to pass to the filter method.
- theta_min, theta_max: The minimum and maximum frequencies to consider for theta.
- delta_min, delta_max: The minimum and maximum frequencies to consider for delta.
- low_gamma_min, low_gamma_max: The minimum and maximum frequencies to consider for low_gamma.
- high_gamma_min, high_gamma_max: The minimum and maximum frequencies to consider for high_gamma.
- beta_min, beta_max: The minimum and maximum frequencies to consider for beta.
- psd_scale: The scale to use for the power spectrum (decibels or volts).
- number_of_shuffles_sta: The number of shuffles of time to use for the STA analysis.
- num_spike_shuffles: The number of shuffles of spikes to use for the STA analysis.
- max_psd_freq, max_fooof_freq: The maximum frequency to consider for the power spectrum and fooof analysis.
- speed_theta_samples_per_second: How many speed theta samples per second to use, as this data is binned.
- max_speed: The maximum speed to consider for the speed theta analysis, in cm/s.
- tmaze_minf, tmaze_maxf: The minimum and maximum frequencies to consider for the tmaze analysis.
- tmaze_winsec: The window size, in seconds, to use for the LFP analysis in the tmaze analysis.
- max_lfp_lengths: How to split up the LFP signal during tmaze analysis. Defaults give 1 second windows.
- tmaze_egf: Whether to use eeg or egf (higher rate signal) in tmaze analysis.
- spindles_use_avg: Whether to run spindle analysis on the average signal or on all signals.
- use_first_two_for_ripples: Whether to use the first two signals for ripple analysis or all signals.
- lfp_ripple_rate: The rate of the high frequency LFP signal to use for ripple analysis. This can be a downsample of the full egf rate.
- min_sleep_length: The minimum length of sleep to consider for sleep analysis, in seconds.
- only_kay_detect: Whether to only use Kay's algorithm for sleep detection.
- except_nwb_errors: Whether to ignore NWB errors (True).
- sleep_join_tol: The allowed time of movement between sleep epochs to join them, in seconds (0.0).
- sleep_max_interval_size: The maximum allowed time for a sleep epoch to be, before splitting for efficiency, in seconds (300).
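To make the roles of sleep_join_tol and sleep_max_interval_size more concrete, here is a minimal sketch of one way such parameters could act on a list of sleep epochs: epochs separated by a gap no longer than the tolerance are joined, and any epoch longer than the maximum size is then split. The function name join_and_split_epochs and the (start, end) tuple representation are assumptions made for illustration; the workflow's own implementation may differ.

```python
def join_and_split_epochs(epochs, join_tol=0.0, max_size=300.0):
    """Join epochs separated by at most join_tol, then split long ones.

    epochs is an iterable of (start, end) tuples in seconds.
    The defaults mirror the documented values of sleep_join_tol (0.0)
    and sleep_max_interval_size (300).
    """
    if max_size <= 0:
        raise ValueError("max_size must be positive")

    merged = []
    for start, end in sorted(epochs):
        if merged and start - merged[-1][1] <= join_tol:
            # The gap is within tolerance: extend the previous epoch.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    pieces = []
    for start, end in merged:
        # Cut each joined epoch into chunks of at most max_size seconds.
        while end - start > max_size:
            pieces.append((start, start + max_size))
            start += max_size
        pieces.append((start, end))
    return pieces


epochs = [(0.0, 100.0), (100.0, 450.0), (500.0, 560.0)]
print(join_and_split_epochs(epochs))
# [(0.0, 300.0), (300.0, 450.0), (500.0, 560.0)]
```

With the default tolerance of 0.0, only epochs that touch are joined; raising join_tol to 60 in this example would also absorb the 50 second gap before the last epoch.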
Additional config files
tmaze_recordings.yml
This lists how to obtain the tmaze recordings, with 8 control rats and 6 lesion rats considered.
openfield_recordings.yml
This lists the names of the rats whose openfield recordings are used, with 6 control rats and 5 lesion rats considered.
Linting and formatting
Linting results
None
Formatting results
[DEBUG] In file "/tmp/tmpdyibimgy/seankmartin-atn-sub-lfp-workflow-d177bfe/workflow/rules/plot_data.smk": Formatted content is different from original
[DEBUG] In file "/tmp/tmpdyibimgy/seankmartin-atn-sub-lfp-workflow-d177bfe/workflow/Snakefile": Formatted content is different from original
[DEBUG] In file "/tmp/tmpdyibimgy/seankmartin-atn-sub-lfp-workflow-d177bfe/workflow/rules/analyse_data.smk": Formatted content is different from original
[DEBUG] In file "/tmp/tmpdyibimgy/seankmartin-atn-sub-lfp-workflow-d177bfe/workflow/rules/process_data.smk": Formatted content is different from original
[INFO] 4 file(s) would be changed 😬
snakefmt version: 0.8.2