WORKFLOW classify_multi
| File Path | pipes/WDL/workflows/classify_multi.wdl |
|---|---|
| WDL Version | 1.0 |
| Type | workflow |
Imports
| Namespace | Path |
|---|---|
| metagenomics | ../tasks/tasks_metagenomics.wdl |
| read_utils | ../tasks/tasks_read_utils.wdl |
| taxon_filter | ../tasks/tasks_taxon_filter.wdl |
| assembly | ../tasks/tasks_assembly.wdl |
| reports | ../tasks/tasks_reports.wdl |
Workflow: classify_multi
Runs raw reads through taxonomic classification (Kraken2), human read depletion (based on the Kraken2 classifications), de novo assembly (SPAdes), and FastQC/MultiQC reporting on the reads.
Author: Broad Viral Genomics
Inputs
| Name | Type | Description | Default |
|---|---|---|---|
| reads_bams | Array[File]+ | Reads to classify. May be unmapped or mapped or both, paired-end or single-end. | - |
| ncbi_taxdump_tgz | File | An NCBI taxdump.tar.gz file that contains, at the minimum, a nodes.dmp and names.dmp file. | - |
| spikein_db | File | ERCC spike-in sequences | - |
| trim_clip_db | File | Adapter sequences to remove via trimmomatic prior to SPAdes assembly | - |
| kraken2_db_tgz | File | Pre-built Kraken database tarball containing three files: hash.k2d, opts.k2d, and taxo.k2d. | - |
| krona_taxonomy_db_kraken2_tgz | File | Krona taxonomy database containing a single file: taxonomy.tab, or possibly just a compressed taxonomy.tab | - |
| machine_mem_gb | Int? | - | - |
| min_base_qual | Int? | - | - |
| taxonomic_ids | Array[Int]? | - | - |
| minimum_hit_groups | Int? | - | - |
| taxonomic_ids | Array[Int]? | - | - |
| minimum_hit_groups | Int? | - | - |
| spades_min_contig_len | Int? | - | - |
| spades_options | String? | - | - |
| machine_mem_gb | Int? | - | - |
| title | String? | - | - |
| comment | String? | - | - |
| template | String? | - | - |
| tag | String? | - | - |
| ignore_analysis_files | String? | - | - |
| ignore_sample_names | String? | - | - |
| sample_names | File? | - | - |
| exclude_modules | Array[String]? | - | - |
| module_to_use | Array[String]? | - | - |
| output_data_format | String? | - | - |
| config | File? | - | - |
| config_yaml | String? | - | - |
| title | String? | - | - |
| comment | String? | - | - |
| template | String? | - | - |
| tag | String? | - | - |
| ignore_analysis_files | String? | - | - |
| ignore_sample_names | String? | - | - |
| sample_names | File? | - | - |
| exclude_modules | Array[String]? | - | - |
| module_to_use | Array[String]? | - | - |
| output_data_format | String? | - | - |
| config | File? | - | - |
| config_yaml | String? | - | - |
| title | String? | - | - |
| comment | String? | - | - |
| template | String? | - | - |
| tag | String? | - | - |
| ignore_analysis_files | String? | - | - |
| ignore_sample_names | String? | - | - |
| sample_names | File? | - | - |
| exclude_modules | Array[String]? | - | - |
| module_to_use | Array[String]? | - | - |
| output_data_format | String? | - | - |
| config | File? | - | - |
| config_yaml | String? | - | - |
| query_column | Int? | - | - |
| taxid_column | Int? | - | - |
| score_column | Int? | - | - |
| magnitude_column | Int? | - | - |

72 optional inputs with default values
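As a rough illustration of how the six required inputs could be supplied, the sketch below builds an inputs JSON document in Python. It assumes the `<workflow>.<input>` key convention that WDL runners such as Cromwell and miniwdl accept; every file name is a hypothetical placeholder, and `build_inputs` is a helper invented for this example rather than part of the pipeline.

```python
import json

WORKFLOW = "classify_multi"

# Required inputs from the table above.
# All paths are hypothetical placeholders, not files shipped with the workflow.
required = {
    "reads_bams": ["sample1.bam", "sample2.bam"],
    "ncbi_taxdump_tgz": "taxdump.tar.gz",
    "spikein_db": "ercc_spikeins.fasta",
    "trim_clip_db": "adapters.fasta",
    "kraken2_db_tgz": "kraken2_db.tar.gz",
    "krona_taxonomy_db_kraken2_tgz": "krona_taxonomy.tab.gz",
}


def build_inputs(workflow, values):
    """Prefix each input name with the workflow name (hypothetical helper)."""
    # reads_bams is typed Array[File]+, so an empty list is not valid.
    if not values.get("reads_bams"):
        raise ValueError("reads_bams must contain at least one file")
    return {f"{workflow}.{name}": value for name, value in values.items()}


inputs_json = json.dumps(build_inputs(WORKFLOW, required), indent=2, sort_keys=True)
print(inputs_json)
```

The optional inputs are left out here so that the workflow's own defaults apply.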
Outputs
| Name | Type | Expression |
|---|---|---|
| cleaned_reads_unaligned_bams | Array[File] | deplete.bam_filtered_to_taxa |
| deduplicated_reads_unaligned | Array[File] | rmdup_ubam.dedup_bam |
| contigs_fastas | Array[File] | spades.contigs_fasta |
| read_counts_raw | Array[Int] | deplete.classified_taxonomic_filter_read_count_pre |
| read_counts_depleted | Array[Int] | deplete.classified_taxonomic_filter_read_count_post |
| read_counts_dedup | Array[Int] | rmdup_ubam.dedup_read_count_post |
| read_counts_prespades_subsample | Array[Int] | spades.subsample_read_count |
| multiqc_report_raw | File | multiqc_raw.multiqc_report |
| multiqc_report_cleaned | File | multiqc_cleaned.multiqc_report |
| multiqc_report_dedup | File | multiqc_dedup.multiqc_report |
| spikein_counts | File | spike_summary.count_summary |
| kraken2_merged_krona | File | krona_merge_kraken2.krona_report_html |
| kraken2_summary | File | metag_summary_report.krakenuniq_aggregate_taxlevel_summary |
| kraken2_summary_reports | Array[File] | kraken2.kraken2_summary_report |
| kraken2_krona_by_sample | Array[File] | kraken2.krona_report_html |
| kraken2_viral_classify_version | String | kraken2.viralngs_version[0] |
| deplete_viral_classify_version | String | deplete.viralngs_version[0] |
| spades_viral_assemble_version | String | spades.viralngs_version[0] |
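The three per-sample read-count arrays can be lined up to see how many reads survive each stage. The Python sketch below is only an illustration: it assumes the outputs arrive as a JSON-like mapping keyed `classify_multi.<output>`, that the arrays are index-aligned with `reads_bams`, and it uses made-up counts. `summarize_read_counts` is a hypothetical helper, not part of the workflow.

```python
def summarize_read_counts(outputs, workflow="classify_multi"):
    """Pair up per-sample read counts from the workflow outputs (hypothetical helper)."""
    raw = outputs[f"{workflow}.read_counts_raw"]
    depleted = outputs[f"{workflow}.read_counts_depleted"]
    dedup = outputs[f"{workflow}.read_counts_dedup"]
    if not (len(raw) == len(depleted) == len(dedup)):
        raise ValueError("per-sample output arrays differ in length")

    rows = []
    for index, (n_raw, n_depleted, n_dedup) in enumerate(zip(raw, depleted, dedup)):
        rows.append({
            "sample_index": index,
            "raw": n_raw,
            "depleted": n_depleted,
            "dedup": n_dedup,
            # Guard against samples with no reads at all.
            "fraction_kept_after_depletion": n_depleted / n_raw if n_raw else None,
        })
    return rows


# Made-up example values for two samples.
example_outputs = {
    "classify_multi.read_counts_raw": [1000, 0],
    "classify_multi.read_counts_depleted": [800, 0],
    "classify_multi.read_counts_dedup": [600, 0],
}
summary = summarize_read_counts(example_outputs)
for row in summary:
    print(row)
```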
Calls
This workflow calls the following tasks or subworkflows:
CALL TASKS: fastqc_raw → fastqc
Input Mappings (1)
| Input | Value |
|---|---|
| reads_bam | raw_reads |
CALL TASKS: spikein → align_and_count
Input Mappings (2)
| Input | Value |
|---|---|
| reads_bam | raw_reads |
| ref_db | spikein_db |
CALL TASKS: kraken2
Input Mappings (3)
| Input | Value |
|---|---|
| reads_bam | raw_reads |
| kraken2_db_tgz | kraken2_db_tgz |
| krona_taxonomy_db_tgz | krona_taxonomy_db_kraken2_tgz |
CALL TASKS: deplete → filter_bam_to_taxa
Input Mappings (6)
| Input | Value |
|---|---|
| classified_bam | raw_reads |
| classified_reads_txt_gz | kraken2.kraken2_reads_report |
| ncbi_taxonomy_db_tgz | ncbi_taxdump_tgz |
| exclude_taxa | true |
| taxonomic_names | ["Vertebrata"] |
| out_filename_suffix | "hs_depleted" |
CALL TASKS: fastqc_cleaned → fastqc
Input Mappings (1)
| Input | Value |
|---|---|
| reads_bam | deplete.bam_filtered_to_taxa |
CALL TASKS: filter_acellular → filter_bam_to_taxa
Input Mappings (6)
| Input | Value |
|---|---|
| classified_bam | raw_reads |
| classified_reads_txt_gz | kraken2.kraken2_reads_report |
| ncbi_taxonomy_db_tgz | ncbi_taxdump_tgz |
| exclude_taxa | true |
| taxonomic_names | ["Vertebrata", "other sequences", "Bacteria"] |
| out_filename_suffix | "acellular" |
CALL TASKS: rmdup_ubam
Input Mappings (1)
| Input | Value |
|---|---|
| reads_unmapped_bam | clean_reads |
CALL TASKS: spades → assemble
Input Mappings (3)
| Input | Value |
|---|---|
| reads_unmapped_bam | rmdup_ubam.dedup_bam |
| trim_clip_db | trim_clip_db |
| always_succeed | true |
CALL TASKS: multiqc_raw → MultiQC
Input Mappings (2)
| Input | Value |
|---|---|
| input_files | fastqc_raw.fastqc_zip |
| file_name | "multiqc-raw.html" |
CALL TASKS: multiqc_cleaned → MultiQC
Input Mappings (2)
| Input | Value |
|---|---|
| input_files | fastqc_cleaned.fastqc_zip |
| file_name | "multiqc-cleaned.html" |
CALL TASKS: multiqc_dedup → MultiQC
Input Mappings (2)
| Input | Value |
|---|---|
| input_files | rmdup_ubam.dedup_fastqc_zip |
| file_name | "multiqc-dedup.html" |
CALL TASKS: spike_summary → align_and_count_summary
Input Mappings (1)
| Input | Value |
|---|---|
| counts_txt | spikein.report |
CALL TASKS: metag_summary_report → aggregate_metagenomics_reports
Input Mappings (1)
| Input | Value |
|---|---|
| kraken_summary_reports | kraken2.kraken2_summary_report |
CALL TASKS: krona_merge_kraken2 → krona
Input Mappings (4)
| Input | Value |
|---|---|
| reports_txt_gz | kraken2.kraken2_summary_report |
| krona_taxonomy_db_tgz | krona_taxonomy_db_kraken2_tgz |
| input_type | "kraken2" |
| out_basename | "merged-kraken2.krona" |
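The input mappings above imply a dependency order among the 14 calls. The Python sketch below records, for each call, the other calls whose outputs it consumes, and derives one valid execution order with the standard-library `graphlib` module (Python 3.9+). It is a reading of this page, not the engine's scheduler: `rmdup_ubam` takes `clean_reads`, whose origin is not shown here, so the sketch assumes no upstream call for it.

```python
from graphlib import TopologicalSorter

# call -> set of calls whose outputs appear in its input mappings.
# Calls fed only by workflow inputs or raw_reads have no predecessors.
CALL_DEPENDENCIES = {
    "fastqc_raw": set(),
    "spikein": set(),
    "kraken2": set(),
    "deplete": {"kraken2"},
    "fastqc_cleaned": {"deplete"},
    "filter_acellular": {"kraken2"},
    # Assumption: the source of `clean_reads` is not documented on this page.
    "rmdup_ubam": set(),
    "spades": {"rmdup_ubam"},
    "multiqc_raw": {"fastqc_raw"},
    "multiqc_cleaned": {"fastqc_cleaned"},
    "multiqc_dedup": {"rmdup_ubam"},
    "spike_summary": {"spikein"},
    "metag_summary_report": {"kraken2"},
    "krona_merge_kraken2": {"kraken2"},
}


def execution_order(dependencies):
    """Return one valid order in which every call follows its predecessors."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(CALL_DEPENDENCIES)
print(" -> ".join(order))
```

Several orders are valid, since independent calls such as `fastqc_raw`, `spikein`, and `kraken2` can run in parallel.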
Images
Container images used by tasks in this workflow:
~{docker}
Used by 8 tasks:
- multiqc_raw
- multiqc_cleaned
- multiqc_dedup
- spike_summary
- metag_summary_report
- fastqc_raw
- spikein
- fastqc_cleaned
Parameterized Image
Configured via input: docker
Used by 4 tasks:
- krona_merge_kraken2
- kraken2
- deplete
- filter_acellular
Parameterized Image
Configured via input: docker
Used by 1 task:
- rmdup_ubam
Parameterized Image
Configured via input: docker
Used by 1 task:
- spades