young5454/ABComp
ABComp: Assembly Polishing and Bacterial Whole-genome Comparison Pipeline for Multi-group Clinical Isolates
Overview
Latest release: None, Last update: 2024-12-07
Linting: failed, Formatting: failed
Deployment
Step 1: Install Snakemake and Snakedeploy
Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details can be found in the Mamba documentation.
When using Mamba, run
mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy
to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via
conda activate snakemake
Step 2: Deploy workflow
With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:
mkdir -p path/to/project-workdir
cd path/to/project-workdir
In all following steps, we will assume that you are inside of that directory. Then run
snakedeploy deploy-workflow https://github.com/young5454/ABComp . --tag None
Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.
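Before moving on, it can help to confirm that the deployment produced the expected layout. The following is a minimal sketch, assuming the main Snakefile lives at workflow/Snakefile and the configuration at config/config.yml as described in this guide; the helper name missing_deployment_files is hypothetical and not part of Snakedeploy.

```python
from pathlib import Path


def missing_deployment_files(workdir="."):
    """Return the expected deployment files that are absent from workdir.

    The two paths are assumptions taken from this guide, not a list
    reported by Snakedeploy itself.
    """
    expected = ["workflow/Snakefile", "config/config.yml"]
    root = Path(workdir)
    return [rel for rel in expected if not (root / rel).is_file()]


# Check the current directory, which should be the project working directory.
missing = missing_deployment_files()
if missing:
    print("Missing after deployment:", ", ".join(missing))
else:
    print("Deployment layout looks complete.")
```

Run it from inside the project working directory; an empty result means both files are in place.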
Step 3: Configure workflow
To configure the workflow, adapt config/config.yml to your needs following the instructions below.
Step 4: Run workflow
The deployment method is controlled using the --software-deployment-method (short --sdm) argument.
To run the workflow with automatic deployment of all required software via conda/mamba, use
snakemake --cores all --sdm conda
Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.
For further options such as cluster and cloud execution, see the Snakemake documentation.
Step 5: Generate report
After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using
snakemake --report report.zip
Configuration
The following section is imported from the workflow’s config/README.md.
ABComp requires two configuration files to run the pipeline. These YAML files can be found in the config/ directory.
config.yml holds the default configuration settings for the overall Snakemake run. Make sure to set the parameters and directory names to match your setup.
groups_original.yml is a configuration file that lists the complete group-strain information of your clinical isolates. Below is an example YAML file for a 2-group, 5-strain setting:
NONMDR:
- B0112
- C0234
- C3455
MDR:
- B0232
- D0991
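To make the structure of this file concrete, the sketch below inverts the group-to-strains mapping into a strain-to-group lookup and rejects empty groups or repeated strains. It is an illustration only: the function strain_to_group is hypothetical, the uniqueness check is an assumption rather than a documented ABComp requirement, and the dictionary is written out by hand to mirror what a YAML parser such as PyYAML's yaml.safe_load would be expected to return for the example above.

```python
def strain_to_group(groups):
    """Invert a {group: [strains]} mapping into {strain: group}.

    Raises ValueError if a group is empty or a strain is listed twice.
    These checks are assumptions for illustration, not ABComp behavior.
    """
    lookup = {}
    for group, strains in groups.items():
        if not strains:
            raise ValueError(f"group {group!r} has no strains")
        for strain in strains:
            if strain in lookup:
                raise ValueError(
                    f"strain {strain!r} listed in both "
                    f"{lookup[strain]!r} and {group!r}"
                )
            lookup[strain] = group
    return lookup


# Hand-written equivalent of the example groups_original.yml above.
example = {
    "NONMDR": ["B0112", "C0234", "C3455"],
    "MDR": ["B0232", "D0991"],
}

lookup = strain_to_group(example)
print(len(example), "groups,", len(lookup), "strains")  # 2 groups, 5 strains
```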
Linting and formatting
Linting results
Lints for snakefile /tmp/tmpf06e0zdf/workflow/Snakefile:
    * Path composition with '+' in line 82:
      This becomes quickly unreadable. Usually, it is better to endure some
      redundancy against having a more readable workflow. Hence, just repeat
      common prefixes. If path composition is unavoidable, use pathlib or
      (python >= 3.6) string formatting with f"...".

Lints for rule polypolish (line 366, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * Param path is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule busco (line 412, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param out_path is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule quast (line 448, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule prokka_ref (line 470, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param out_dir is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule prokka_strain (line 513, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param out_dir is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule roary_strain_ref_pairwise (line 559, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param out_dir is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule move_gff_files (line 625, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param workspace is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions
    * Param tmp_dir is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule roary_within_group (line 657, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule gene_list_maker (line 687, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule move_faa_files (line 709, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files
    * Param workspace is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions
    * Param group_dir is a prefix of input or output file but hardcoded:
      If this is meant to represent a file path prefix, it will fail when
      running workflow in environments without a shared filesystem. Instead,
      provide a function that infers the appropriate prefix from the input or
      output file, e.g.: lambda w, input: os.path.splitext(input[0])[0]
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules
      https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html#tutorial-input-functions

Lints for rule fasta_curation (line 741, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule cog_analysis_core (line 767, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule cog_analysis_shells (line 803, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule cog_visualization (line 839, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:
      Without a log directive, all output will be printed to the terminal. In
      distributed environments, this means that errors are harder to discover.
      In local environments, output of concurrent jobs will be mixed and become
      unreadable.
      Also see:
      https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#log-files

Lints for rule roary_visualization (line 884, /tmp/tmpf06e0zdf/workflow/Snakefile):
    * No log directive defined:

... (truncated)
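Two of the lint messages above suggest concrete Python idioms. The sketch below shows both in isolation: building a path with an f-string or pathlib instead of '+', and the linter's own suggested function for inferring a prefix from the first input file. The directory name, strain name, and file name are hypothetical placeholders and are not taken from the ABComp Snakefile.

```python
import os
from pathlib import Path

# Hypothetical values, used only to illustrate the idioms.
results_dir = "results"
strain = "B0112"

# Instead of: results_dir + "/" + strain + "/assembly.fasta"
assembly_fstring = f"{results_dir}/{strain}/assembly.fasta"
assembly_pathlib = Path(results_dir) / strain / "assembly.fasta"


def prefix_from_first_input(wildcards, input):
    """Named version of the linter's suggestion:
    lambda w, input: os.path.splitext(input[0])[0]
    """
    return os.path.splitext(input[0])[0]


print(assembly_fstring)            # results/B0112/assembly.fasta
print(assembly_pathlib.as_posix()) # results/B0112/assembly.fasta
print(prefix_from_first_input(None, [assembly_fstring]))  # results/B0112/assembly
```

In a Snakefile, a function like this would be passed to a rule's params so the prefix is derived from the rule's files rather than hardcoded.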
Formatting results
[DEBUG]
[DEBUG] In file "/tmp/tmpf06e0zdf/workflow/Snakefile": Formatted content is different from original
[INFO] 1 file(s) would be changed 😬

snakefmt version: 0.10.2