nodrogluap/tamor

Illumina Dragen cancer genome and transcriptome analysis automation using Snakemake

Overview

Topics: cancer-genomics dragen mutation-analysis pcgr ruo sciworkflows bioinformatics-pipeline

Latest release: None, Last update: 2025-05-05

Linting: failed, Formatting: failed

Deployment

Step 1: Install Snakemake and Snakedeploy

Snakemake and Snakedeploy are best installed via the Mamba package manager (a drop-in replacement for Conda). If you have neither Conda nor Mamba, it is recommended to install Miniforge. More details can be found in the Mamba documentation.

When using Mamba, run

mamba create -c conda-forge -c bioconda --name snakemake snakemake snakedeploy

to install both Snakemake and Snakedeploy in an isolated environment. For all following commands ensure that this environment is activated via

conda activate snakemake

Step 2: Deploy workflow

With Snakemake and Snakedeploy installed, the workflow can be deployed as follows. First, create an appropriate project working directory on your system and enter it:

mkdir -p path/to/project-workdir
cd path/to/project-workdir

In all following steps, we will assume that you are inside of that directory. Then run

snakedeploy deploy-workflow https://github.com/nodrogluap/tamor . --tag None

Snakedeploy will create two folders, workflow and config. The former contains the deployment of the chosen workflow as a Snakemake module, the latter contains configuration files which will be modified in the next step in order to configure the workflow to your needs.

Step 3: Configure workflow

To configure the workflow, adapt config/config.yaml to your needs following the instructions below.

Step 4: Run workflow

The deployment method is controlled using the --software-deployment-method (short --sdm) argument.

To run the workflow with automatic deployment of all required software via conda/mamba, use

snakemake --cores all --sdm conda

Snakemake will automatically detect the main Snakefile in the workflow subfolder and execute the workflow module that has been defined by the deployment in step 2.

For further options such as cluster and cloud execution, see the Snakemake documentation.

Step 5: Generate report

After finalizing your data analysis, you can automatically generate an interactive visual HTML report for inspection of results together with parameters and code inside of the browser using

snakemake --report report.zip

Configuration

The following section is imported from the workflow’s config/README.md.

NOTA BENE: When running snakemake for the first time with this repository, it may take many hours, as it will download both the full software environment needed to run PCGR mutation impact reports and all the large public resource files those reports require (by automatically running workflow/scripts/download_resources.py). If you interrupt the downloading and unpacking of these files, you will need to rerun the download script manually.

Configuring Tamor to Analyze Your Cancer Cases

Cases are organized into logical units: Projects (a.k.a. cohorts), that have Subjects (a.k.a. patients) that have Samples (e.g. biopsy or blood).

Cases from the same cohort are written to the same output folder, for organizational purposes.

A Subject must have at least one normal/germline sample. We may expand to support tumor-only analysis in the future.

A Subject can have one or more tumor samples (e.g. primary and refractory). Each tumor must have a DNA sample, and optionally an RNA sample.

Test case defaults

The default config files are preconfigured for didactic purposes with a public leukaemia genome+transcriptome case from the NCBI Sequence Read Archive. This case is part of the cohort PR-TEST-CLL, with the patient labelled as PR-TEST-CLL-SAMN08512283, and there are three sets of input FASTQ files downloaded/generated by running workflow/scripts/download_testdata.py. The three samples are PR-TEST-CLL-SAMN08512283-SRR6702602-T (tumor DNA), PR-TEST-CLL-SAMN08512283-SRR6702602-N (pseudonormal DNA generated by the script, since no actual normal is available), and PR-TEST-CLL-SAMN08512283-SRR6702601-T (tumor RNA). Such long systematic names are not necessary, but in practice we have found them very useful once you start accumulating larger cohorts.

Sequencing instrument run IDs and sample IDs are typically rather opaque and automatically assigned by the sequencing lab. These are not part of the config files, nor reported out by Tamor, but rather linked to designated Subjects and Samples via the Illumina Samplesheets.

Site-specific customizations

Site-specific file paths

config/config.yaml is the file that you can customize for your site-specific settings. By default the config is set up to read input files from the resources folder, and write result files under the results folder. By default the genome index and annotation files, as well as the PCGR data bundle, are expected in resources. This is where workflow/scripts/download_resources.py puts those files.

Tamor’s default config has the input lists of paired tumor-normal samples (with minimal metadata, described below) in files called config/dna_samples.tsv and config/rna_samples.tsv. These TSVs are the main config files that you will need to edit to run your own samples through the workflow. Config files are internally validated for completeness based on workflow/schemas/dna_sample_config.schema.yaml and workflow/schemas/rna_sample_config.schema.yaml.

Site-specific file permissions and group ownership

On systems where multiple users will generate or use the Tamor analysis, it can be useful to have the workflow automatically set shared permissions for the output files. Uncommenting the set_output_group and set_output_umask lines in config/config.yaml will make Tamor try to honour those wishes. As per UNIX convention, a umask of 007 will allow read/write by the owner and designated group, but give no permissions to others.
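To make the umask arithmetic concrete, the sketch below computes the permission bits that remain after a umask is applied. The helper name effective_mode is purely illustrative and not part of Tamor. The sketch assumes the conventional creation defaults of 666 for files and 777 for directories.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Return the permission bits left after the umask clears its bits."""
    return requested & ~umask


# Conventional creation defaults: 0o666 for files, 0o777 for directories.
file_mode = effective_mode(0o666, 0o007)
dir_mode = effective_mode(0o777, 0o007)

print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))  # 0o660 -rw-rw----
print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))    # 0o770 drwxrwx---
```

With a umask of 007, the owner and group keep their access while every bit for other users is cleared.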

DNA Sample Metadata

The config/dna_samples.tsv file has 8 columns to be specified with column names:

subjectID<tab>
tumorSampleID<tab>
trueOrFalseTumorHasPCRDuplicates<tab>
germlineSampleID<tab>
trueOrFalseGermlineHasPCRDuplicates<tab>
trueOrFalseGermlineContainsSomeTumor<tab>
oncoTreeCode<tab>
projectID

Details on how to set each column are below.
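As an illustration of the layout, the sketch below renders a one-row file with the eight columns in the order listed above. It assumes the file starts with a header row of the column names. The IDs are taken from the bundled test case, while the True/False flags and the TISSUE code are placeholder values chosen for this example only, not the values shipped with Tamor.

```python
import csv
import io

DNA_COLUMNS = [
    "subjectID",
    "tumorSampleID",
    "trueOrFalseTumorHasPCRDuplicates",
    "germlineSampleID",
    "trueOrFalseGermlineHasPCRDuplicates",
    "trueOrFalseGermlineContainsSomeTumor",
    "oncoTreeCode",
    "projectID",
]

# IDs come from the test case described earlier; the flags and the
# oncoTreeCode are illustrative placeholders (assumptions).
example_row = {
    "subjectID": "PR-TEST-CLL-SAMN08512283",
    "tumorSampleID": "PR-TEST-CLL-SAMN08512283-SRR6702602-T",
    "trueOrFalseTumorHasPCRDuplicates": "True",
    "germlineSampleID": "PR-TEST-CLL-SAMN08512283-SRR6702602-N",
    "trueOrFalseGermlineHasPCRDuplicates": "True",
    "trueOrFalseGermlineContainsSomeTumor": "False",
    "oncoTreeCode": "TISSUE",
    "projectID": "PR-TEST-CLL",
}


def render_dna_samples(rows):
    """Render rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=DNA_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = render_dna_samples([example_row])
print(tsv_text, end="")
```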

Sample IDs

The subjectID, tumorSampleID and germlineSampleID must meet the following requirements:

  • None of the three IDs may contain underscores.

  • The subjectID must be between 6 and 35 characters long (due to a PCGR naming limitation).

  • tumorSampleID and germlineSampleID must be the exact Sample_Name values you used in your Illumina sequencing sample spreadsheets (see samplesheet section below for details).
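The first two rules are easy to check before launching a run. The following sketch is a hypothetical helper, not part of Tamor, and it assumes the 6 to 35 character range is inclusive. It does not check the Sample_Name match, which requires the samplesheets.

```python
def check_ids(subject_id, tumor_sample_id, germline_sample_id):
    """Return a list of problems found; an empty list means the IDs pass."""
    problems = []
    ids = {
        "subjectID": subject_id,
        "tumorSampleID": tumor_sample_id,
        "germlineSampleID": germline_sample_id,
    }
    # Rule 1: no underscores in any of the three IDs.
    for name, value in ids.items():
        if "_" in value:
            problems.append(f"{name} contains an underscore: {value}")
    # Rule 2: subjectID length (inclusive bounds are an assumption).
    if not 6 <= len(subject_id) <= 35:
        problems.append(f"subjectID must be 6 to 35 characters, got {len(subject_id)}")
    return problems


ok = check_ids(
    "PR-TEST-CLL-SAMN08512283",
    "PR-TEST-CLL-SAMN08512283-SRR6702602-T",
    "PR-TEST-CLL-SAMN08512283-SRR6702602-N",
)
bad = check_ids("AB_1", "T1", "N1")
print(ok)
print(bad)
```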

Handling PCR duplicates

The third and fifth columns tell Dragen whether to treat read pairs that map to the same start and end in the reference genome as PCR duplicates (in the tumor and germline samples respectively). If you used a PCR-free library prep, set this to False; otherwise set it to True.

Project designation

The eighth column is a unique project ID to which the subject belongs. For example, if you have two cohorts of lung and breast cancer, assigning individuals to two projects would be logical. Each project's output files go into their own output folder, even if the samples were sequenced together on the same Illumina sequencing runs.

Tumor-in-normal handling

The sixth column of the paired input sample TSV file is usually False, unless your germline sample is from a leukaemia or perhaps a poor quality histology section from a tumor, in which case use True. This instructs Dragen to still report variants as somatic in the tumor analysis output even when they appear at low frequency in the germline sample (see the default of 0.05 under tumor_in_normal_tolerance_proportion in config.yaml).

Cancer type designation

For the seventh column, the type of cancer the tumor represents must be coded. This is preferably an OncoTree code. Those codes can be found here: https://oncotree.mskcc.org/. This information will be used to customize some parts of the variant, gene expression, and immune profiling reports. If no cancer type information is available at all, you can use the top-level code in OncoTree: “TISSUE”. While OncoTree codes are preferred, Tamor will also attempt to uniquely map codes from the ICD-O, NCIt, UMLS and HemeOnc systems.

RNA Sample Metadata

The config/rna_samples.tsv file has 5 columns to be specified with column names:

subjectID<tab>
tumorRNASampleID<tab>
matchedTumorDNASampleID<tab>
projectID<tab>
cohortNameForExpressionAnalysis

If you have both normal and tumor RNA samples available, it is critical to list the tumor RNA sample first.
The first RNA sample listed in the file is the one that will be included on the PCGR report for matchedTumorDNASampleID, and typically you want to report out regarding the tumor RNA.

The last column, cohortNameForExpressionAnalysis, is used for Djerba cohort reporting, e.g. to identify Z-score and percentile rank outlier genes in this sample compared to others being processed at the same time and nominally of the same cancer/tissue type as defined by the user.
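The effect of row order can be sketched as follows. This mirrors the rule described above (first listed RNA sample wins), not Tamor's actual code. The second row's sample ID and the cohort name CLL-EXAMPLE are hypothetical values added for illustration.

```python
import csv
import io

# Column names are from the text above; the second row's RNA sample ID and
# the cohort name "CLL-EXAMPLE" are hypothetical.
RNA_TSV = (
    "subjectID\ttumorRNASampleID\tmatchedTumorDNASampleID\tprojectID"
    "\tcohortNameForExpressionAnalysis\n"
    "PR-TEST-CLL-SAMN08512283\tPR-TEST-CLL-SAMN08512283-SRR6702601-T"
    "\tPR-TEST-CLL-SAMN08512283-SRR6702602-T\tPR-TEST-CLL\tCLL-EXAMPLE\n"
    "PR-TEST-CLL-SAMN08512283\tPR-TEST-CLL-SAMN08512283-HYPOTHETICAL-N"
    "\tPR-TEST-CLL-SAMN08512283-SRR6702602-T\tPR-TEST-CLL\tCLL-EXAMPLE\n"
)


def first_rna_per_dna(tsv_text):
    """Map each matchedTumorDNASampleID to the first RNA sample listed for it."""
    reported = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # setdefault keeps the earliest entry and ignores later ones.
        reported.setdefault(row["matchedTumorDNASampleID"], row["tumorRNASampleID"])
    return reported


reported = first_rna_per_dna(RNA_TSV)
print(reported)
```

Because the tumor RNA row comes first, it is the one associated with the tumor DNA sample. Swapping the two rows would select the hypothetical normal RNA sample instead.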

Illumina Samplesheets

These sample sheets are the only other metadata to which Tamor has access. Place all the Illumina experiment sample sheets for your project into resources/spreadsheets by default (see the samplesheets_dir setting in config/config.yaml). They must be called runID.csv, where runID is typically the Illumina folder name in the format YYMMDD_machineID_SideFlowCellID.

FASTQ File Location

If you are providing the FASTQs directly as input to Tamor, they must also be in the resources/analysis/primary/sequencerName/runID directory, with a corresponding Illumina Experiment Manager samplesheet resources/spreadsheets/runID.csv. The samplesheet is required because Tamor reads it to find the correspondence between Sample_Name and Sample ID for each sequencing library. In addition, because analysis for DNA samples differs from that for RNA samples, the sample sheet must contain a Sample_Project column. Sample projects with names that contain “RNA” will be processed as RNA; all others are assumed to be DNA. The Sample_Project is not used for any purpose other than distinguishing RNA from DNA, and does not need to match the projectIDs listed in the config folder files.
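The RNA/DNA rule can be sketched in a few lines. The samplesheet content below is a made-up minimal example (the Sample_ID values and project names are hypothetical), and the sketch assumes a case-sensitive match on "RNA" and a column literally named Sample_ID. Tamor's own parsing may differ.

```python
import csv

# Minimal hypothetical samplesheet; real sheets carry more sections and columns.
SAMPLESHEET = """[Header]
IEMFileVersion,4
[Data]
Sample_ID,Sample_Name,Sample_Project
S1,PR-TEST-CLL-SAMN08512283-SRR6702602-T,EXAMPLE-DNA
S2,PR-TEST-CLL-SAMN08512283-SRR6702601-T,EXAMPLE-RNA
"""


def read_data_section(text):
    """Return the rows of the [Data] section as dictionaries."""
    lines = text.splitlines()
    start = next(i for i, line in enumerate(lines) if line.startswith("[Data]")) + 1
    data_lines = []
    for line in lines[start:]:
        if line.startswith("["):  # next section begins
            break
        if line.strip():
            data_lines.append(line)
    return list(csv.DictReader(data_lines))


def library_type(sample_project):
    """Classify a library from its Sample_Project name (case-sensitive assumption)."""
    return "RNA" if "RNA" in sample_project else "DNA"


types = {
    row["Sample_Name"]: library_type(row["Sample_Project"])
    for row in read_data_section(SAMPLESHEET)
}
print(types)
```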

If you provide FASTQ files directly, they must be timestamped later than the corresponding Illumina Experiment Manager spreadsheet. Otherwise Snakemake will assume that you have made a consequential change to the spreadsheet and will try to automatically regenerate all FASTQs for that run, potentially from non-existent BCLs.
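A quick pre-flight check for this timestamp condition might look like the sketch below. The function name stale_fastqs and the *.fastq.gz naming pattern are assumptions for illustration. The demonstration uses a temporary directory with artificial modification times rather than real run data.

```python
import os
import tempfile
from pathlib import Path


def stale_fastqs(samplesheet: Path, fastq_dir: Path) -> list[Path]:
    """List FASTQ files that are not timestamped later than the samplesheet."""
    sheet_mtime = samplesheet.stat().st_mtime
    return sorted(
        p for p in fastq_dir.rglob("*.fastq.gz") if p.stat().st_mtime <= sheet_mtime
    )


# Demonstration with throwaway files and hand-set modification times.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    sheet = run_dir / "runID.csv"
    old = run_dir / "old_R1.fastq.gz"
    new = run_dir / "new_R1.fastq.gz"
    for path in (sheet, old, new):
        path.touch()
    os.utime(sheet, (1000, 1000))  # samplesheet modified at t=1000
    os.utime(old, (900, 900))      # older than the samplesheet
    os.utime(new, (1100, 1100))    # newer than the samplesheet
    stale = [p.name for p in stale_fastqs(sheet, run_dir)]

print(stale)  # ['old_R1.fastq.gz']
```

Any file the check reports would need its modification time refreshed (for example with touch) before running the workflow, provided the FASTQ contents really do correspond to the current samplesheet.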

Samples Split Across Multiple Runs

Note that a sample can be sequenced across multiple runs (e.g. a primary run and some top-up sequencing due to an unexpectedly low read count on the first run). Tamor will aggregate the sequence data across the runs to generate a single report. The same sample name can have the same sample ID or different sample IDs across runs; they will be aggregated regardless. This allows, for example, a single tumor sample to be prepared using two different sequencing library preps.

BCL File Location (optional)

If you are starting with BCLs, the full Illumina experiment output folders (which contain the requisite Data/Intensities/Basecalls subfolder) are expected in resources/bcls/runID (see the bcl_dir setting in config.yaml). Tamor will perform BCL to FASTQ conversion, with the FASTQ output written to results/analysis/primary/sequencer/runID (see the analysis_dir setting in config.yaml; the default sequencer is HiSeq, per the test data mentioned earlier).

Unique Molecular Indices

The samplesheet is also used to determine if Unique Molecular Indices (UMIs) were used to generate the sequencing libraries, which requires different handling in Dragen during genotyping downstream. Use of UMIs is determined by the presence of an OverrideCycles setting in the samplesheet that includes a “U” value. By default, random UMIs are assumed. To specify a non-random UMI scheme, uncomment the umi_whitelist, umi_correction_table, and umi_slippage_support_informative_fraction settings in config/config.yaml as appropriate.
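As a rough illustration of the detection rule, the sketch below inspects an OverrideCycles value for UMI segments. The helper names and the two example values are hypothetical. The text above only states that a “U” value signals UMI use, so Tamor's actual check may be simpler or different.

```python
import re


def umi_lengths(override_cycles):
    """Return, per read, the cycle counts of any UMI ("U") segments."""
    return [
        [int(count) for kind, count in re.findall(r"([YIUN])(\d+)", read) if kind == "U"]
        for read in override_cycles.split(";")
    ]


def uses_umis(override_cycles):
    """True if at least one read contains a UMI segment."""
    return any(umi_lengths(override_cycles))


with_umi = "U7N1Y143;I8;I8;U7N1Y143"  # hypothetical UMI library
without_umi = "Y151;I8;I8;Y151"       # hypothetical non-UMI library

print(uses_umis(with_umi), umi_lengths(with_umi))
print(uses_umis(without_umi))
```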

Variant Blacklists

Some false positive variants may be recurrent across analyses. To reduce the reporting of likely false-positive results, default lists of variants to change from PASS to filtered in output VCFs have been included in Tamor, based on commonalities found in hundreds of tumor-normal cases analyzed at the University of Calgary’s CSM Centre for Health Genomics and Informatics. These can be customized by editing the dragen_cnv_blacklist.bed and dragen_snv_blacklist.txt files for copy number and small nucleotide variants respectively.

Linting and formatting

Linting results

Lints for snakefile /tmp/tmpj7myomrt/workflow/Snakefile:
    * Absolute path "/{project}/{subject}/rna/{subject}_{rna}.rna.quant.genes.hugo.tpm.txt" in line 57:
      Do not define absolute paths inside of the workflow, since this renders
      your workflow irreproducible on other machines. Use path relative to the
      working directory instead, or make the path configurable via a config
      file.
      Also see:
      https://snakemake.readthedocs.io/en/latest/snakefiles/configuration.html#configuration
      (The linter repeats this advice verbatim for every absolute path listed below.)
    * Absolute path "/{project}/{subject}/rna/{subject}_{rna}.rna.quant.genes.fpkm.txt" in line 58
    * Absolute path "/{project}/{subject}/rna/{subject}_{rna}.rna.fusion_candidates.features.csv" in line 59
    * Absolute path "/{project}/{subject}/{subject}_{tumor}_{normal}.dna.somatic.hard-filtered.vcf.gz" in line 60
    * Absolute path "/{project}/{subject}/{subject}_{tumor}_{normal}.dna.somatic.cnv.vcf.gz" in line 61
    * Absolute path "/{project}/{subject}/{subject}_{tumor}_{normal}.dna.somatic.sv.vcf.gz" in line 62
    * Absolute path "/{project_name}/{subject}/{subject}_{tumor}_{normal}.dna.somatic.qiagen-ipa.submitted.txt" in line 63
    * Absolute path "/{project}/{subject}/{subject}_{tumor}_{normal}.cnv_plot.jpeg" in line 64
    * Absolute path "/{project}/{subject}/{subject}_{tumor}_{normal}.dna.somatic.sv.fusion_candidates.features.csv" in line 65
    * Absolute path "/pcgr/{project}/{subject}_{tumor}_{normal}/{subject}.pcgr.grch38.html" in line 66
    * Absolute path "/djerba/{project}/{subject}_{tumor}_{normal}/{subject}-v1_report.research.html" in line 67

Lints for snakefile /tmp/tmpj7myomrt/workflow/rules/resources.smk:
    * Absolute path "/anchored_rna" in line 5

Lints for snakefile /tmp/tmpj7myomrt/workflow/rules/metadata.smk:
    * Absolute path "/*.csv" in line 53
    * Path composition with '+' in line 16:
      This becomes quickly unreadable. Usually, it is better to endure some
      redundancy against having a more readable workflow. Hence, just repeat
      common prefixes. If path composition is unavoidable, use pathlib or
      (python >= 3.6) string formatting with f"...".

Lints for snakefile /tmp/tmpj7myomrt/workflow/rules/fastq_list.smk:
    * Absolute path "/'+wildcards.project+" in line 20
    * Absolute path "/rna/'+wildcards.sample+" in line 20
    * Absolute path "/'+wildcards.project+" in line 23
    * Absolute path "/'+wildcards.tumor+" in line 23
    * Absolute path "/'+wildcards.project+" in line 26
    * Absolute path "/'+wildcards.normal+" in line 26
    * Absolute path "/'+wildcards.project+" in line 29
    * Absolute path "/'+wildcards.sample+" in line 29
    * Absolute path "/primary/'+config["sequencer"]+" in line 36
    * Absolute path "/tiered/chgi_data/analysis" in line 53
    * Absolute path "/bulk/chgi_analysis" in line 53
    * Absolute path "/export/chgi_data/analysis" in line 54
    * Absolute path "/bulk/chgi_analysis" in line 54
    * Absolute path "/tiered/chgi_data/analysis" in line 55

... (truncated)
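Many of the flagged strings begin with a slash or contain '+ fragments, which suggests they are appended to a configured base directory with string concatenation. The Snakefile itself is not shown here, so this is an inference. The sketch below shows the two alternatives the linter message recommends, f-strings and pathlib, producing the same pattern. The config key output_dir and its value are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical config; the key name "output_dir" is an assumption for this sketch.
config = {"output_dir": "results"}

# Style the linter objects to: '+' concatenation with a leading-slash fragment.
concatenated = (
    config["output_dir"]
    + "/{project}/{subject}/{subject}_{tumor}_{normal}.cnv_plot.jpeg"
)

# f-string equivalent; wildcard braces are doubled so they survive formatting.
formatted = (
    f"{config['output_dir']}/{{project}}/{{subject}}/"
    f"{{subject}}_{{tumor}}_{{normal}}.cnv_plot.jpeg"
)

# pathlib equivalent; each path component is joined explicitly.
joined = str(
    PurePosixPath(config["output_dir"])
    / "{project}"
    / "{subject}"
    / "{subject}_{tumor}_{normal}.cnv_plot.jpeg"
)

print(formatted)
print(concatenated == formatted == joined)
```

All three forms yield the same wildcard pattern, so switching style should not change which files a rule matches. Whether a given form silences the lint depends on how the linter inspects the source.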

Formatting results

[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/resources.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/fastq_list.smk":  Formatted content is different from original
[DEBUG] 
[ERROR] In file "/tmp/tmpj7myomrt/workflow/rules/djerba.smk":  InvalidPython: Black error:

Cannot parse for target version Python 3.12: 1:88: “workflow/scripts/generate_djerba.py {input.somatic_snv_vcf} {input.somatic_cnv_vcf} “ +

(Note reported line number may be incorrect, as snakefmt could not determine the true line number)


[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/djerba.smk":  
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/samplesheet.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/metadata.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/msi.smk":  Formatted content is different from original
[DEBUG] 
[ERROR] In file "/tmp/tmpj7myomrt/workflow/rules/rna.smk":  InvalidPython: Black error:

Cannot parse for target version Python 3.12: 1:215: “perl -F\t -ane ‘BEGIN{{print “Hugo_Symbol\t{wildcards.tumor}\n”; for(split /\n/s, gzip -cd {input.annotations}| grep gene_name){{$gene2hugo{{$1}} = $2 if /gene_id “(\S+)”.*gene_name “(\S+)”/}}}}” +

(Note reported line number may be incorrect, as snakefmt could not determine the true line number)


[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/rna.smk":  
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/bcl_conversion.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/metrics.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/karyoploter.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/datavzrd.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/qiagen_ipa.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/dna.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/filter_variants.smk":  Formatted content is different from original
[DEBUG] 
[ERROR] In file "/tmp/tmpj7myomrt/workflow/rules/pcgr.smk":  InvalidPython: Black error:

Cannot parse for target version Python 3.12: 1:146: “workflow/scripts/generate_pcgr.py {input.tumor_site_code_file} {input.cpsr} {input.cpsr_yaml} {input.somatic_snv_vcf} {input.somatic_cnv_vcf} “ +

(Note reported line number may be incorrect, as snakefmt could not determine the true line number)


[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/pcgr.smk":  
[ERROR] In file "/tmp/tmpj7myomrt/workflow/rules/data_export.smk":  InvalidPython: Black error:

Cannot parse for target version Python 3.12: 14:0: shell(“gzip -cd {input.somatic_snv_vcf} | perl -F\t -ane ‘$F[0] =~ s/^chr//; print join(“\t”, $F[0], $F[1], $F[1]+length($F[3])-1, $F[3], $F[4], $2, $1, “{wildcards.tumor}”), “\n” if not /^#/ and $F[$#F] =~ /^[^:]+:[^:]+:(\d+),(\d+)/’ > {output.maf}”)

(Note reported line number may be incorrect, as snakefmt could not determine the true line number)


[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/data_export.smk":  
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/rules/bams.smk":  Formatted content is different from original
[DEBUG] 
[DEBUG] In file "/tmp/tmpj7myomrt/workflow/Snakefile":  Formatted content is different from original
[INFO] 4 file(s) raised parsing errors 🤕
[INFO] 14 file(s) would be changed 😬

snakefmt version: 0.10.2