WORKFLOW
augur_from_mltree
File Path
pipes/WDL/workflows/augur_from_mltree.wdl
WDL Version
1.0
Type
workflow
Imports
| Namespace | Path |
|---|---|
| nextstrain | ../tasks/tasks_nextstrain.wdl |
Workflow: augur_from_mltree
Take a premade maximum likelihood tree (Newick format), run the remainder of the augur pipeline (timetree modifications, ancestral inference, etc.), and convert the result to a JSON representation suitable for Nextstrain visualization. See https://nextstrain.org/docs/getting-started/ and https://nextstrain-augur.readthedocs.io/en/stable/
Author: Broad Viral Genomics
| Name | Type | Description | Default |
|---|---|---|---|
| raw_tree | File | Maximum likelihood tree (newick format). | - |
| msa_or_vcf | File | Multiple sequence alignment (aligned fasta) or variants (vcf format). | - |
| sample_metadata | File | Metadata in tab-separated text format. See https://nextstrain-augur.readthedocs.io/en/stable/faq/metadata.html for details. | - |
| ref_fasta | File | A reference assembly (not included in assembly_fastas) to align assembly_fastas against. Typically from NCBI RefSeq or similar. | - |
| genbank_gb | File | A 'genbank' formatted gene annotation file that is used to calculate coding consequences of observed mutations. Must correspond to the same coordinate space as ref_fasta. Typically downloaded from the same NCBI accession number as ref_fasta. | - |
| auspice_config | File | A file specifying options to customize the auspice export; see: https://nextstrain.github.io/auspice/customise-client/introduction | - |
| clades_tsv | File? | A TSV file containing clade mutation positions in four columns: [clade gene site alt]; see: https://nextstrain.org/docs/tutorials/defining-clades | - |
| ancestral_traits_to_infer | Array[String]? | A list of metadata traits to use for ancestral node inference (see https://nextstrain-augur.readthedocs.io/en/stable/usage/cli/traits.html). Multiple traits may be specified; must correspond exactly to column headers in metadata file. Omitting these values will skip ancestral trait inference, and ancestral nodes will not have estimated values for metadata. | - |
| gen_per_year | Int? | - | - |
| clock_rate | Float? | - | - |
| clock_std_dev | Float? | - | - |
| root | String? | - | - |
| covariance | Boolean? | - | - |
| precision | Int? | - | - |
| branch_length_inference | String? | - | - |
| coalescent | String? | - | - |
| vcf_reference | File? | - | - |
| weights | File? | - | - |
| sampling_bias_correction | Float? | - | - |
| vcf_reference | File? | - | - |
| root_sequence | File? | - | - |
| output_vcf | File? | - | - |
| genes | File? | - | - |
| vcf_reference_output | File? | - | - |
| vcf_reference | File? | - | - |
| lat_longs_tsv | File? | - | - |
| colors_tsv | File? | - | - |
| geo_resolutions | Array[String]? | - | - |
| color_by_metadata | Array[String]? | - | - |
| description_md | File? | - | - |
| maintainers | Array[String]? | - | - |
| title | String? | - | - |
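As a rough illustration of how the workflow-level inputs listed above might be supplied, the sketch below assembles an inputs JSON using the `<workflow>.<input>` key style that Cromwell-like runners accept. The helper `build_inputs`, all file paths, and the `country` trait name are hypothetical and are not part of the workflow itself.

```python
import json

WORKFLOW = "augur_from_mltree"

# Required workflow-level inputs, as listed in the inputs table.
REQUIRED = ["raw_tree", "msa_or_vcf", "sample_metadata",
            "ref_fasta", "genbank_gb", "auspice_config"]


def build_inputs(files, clades_tsv=None, traits=None):
    """Return an inputs dict keyed as '<workflow>.<input>'.

    `files` maps each required input name to a path. Optional inputs are
    written only when provided, so the workflow sees them as undefined.
    """
    missing = [name for name in REQUIRED if name not in files]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    inputs = {f"{WORKFLOW}.{name}": files[name] for name in REQUIRED}
    if clades_tsv is not None:
        inputs[f"{WORKFLOW}.clades_tsv"] = clades_tsv
    if traits:
        inputs[f"{WORKFLOW}.ancestral_traits_to_infer"] = list(traits)
    return inputs


if __name__ == "__main__":
    example = build_inputs(
        {  # hypothetical paths, for illustration only
            "raw_tree": "data/tree.nwk",
            "msa_or_vcf": "data/aligned.fasta",
            "sample_metadata": "data/metadata.tsv",
            "ref_fasta": "data/reference.fasta",
            "genbank_gb": "data/reference.gb",
            "auspice_config": "data/auspice_config.json",
        },
        traits=["country"],  # assumed metadata column name
    )
    print(json.dumps(example, indent=2))
```

Leaving `clades_tsv` and `ancestral_traits_to_infer` out of the JSON keeps them undefined, which is what the workflow's conditional blocks check for.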
29 optional inputs with default values
| Name | Type | Description | Default |
|---|---|---|---|
| generate_timetree | Boolean | - | true |
| keep_root | Boolean | - | true |
| keep_polytomies | Boolean | - | false |
| date_confidence | Boolean | - | true |
| date_inference | String? | - | "marginal" |
| clock_filter_iqd | Int? | - | 4 |
| divergence_units | String? | - | "mutations" |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 750 |
| machine_mem_gb | Int | - | 75 |
| confidence | Boolean | - | true |
| machine_mem_gb | Int | - | 32 |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 750 |
| inference | String | - | "joint" |
| keep_ambiguous | Boolean | - | false |
| infer_ambiguous | Boolean | - | false |
| keep_overhangs | Boolean | - | false |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 300 |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 300 |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 300 |
| include_root_sequence | Boolean | - | true |
| out_basename | String | - | basename(basename(tree,".nwk"),"_timetree") |
| machine_mem_gb | Int | - | 64 |
| docker | String | - | "docker.io/nextstrain/base:build-20240318T173028Z" |
| disk_size | Int | - | 300 |
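The `out_basename` default above nests two `basename` calls. The sketch below approximates that expression in Python to show what it evaluates to for a given tree path. The helper names and example paths are hypothetical, and `wdl_basename` only approximates the WDL 1.0 function (strip the directory, then strip the suffix if the name ends with it).

```python
import os


def wdl_basename(path, suffix=None):
    """Approximate WDL 1.0 basename(): drop the directory part, then
    drop `suffix` if the file name ends with it."""
    name = os.path.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


def default_out_basename(tree):
    # Mirrors basename(basename(tree, ".nwk"), "_timetree")
    return wdl_basename(wdl_basename(tree, ".nwk"), "_timetree")


if __name__ == "__main__":
    # Hypothetical tree paths, for illustration only.
    for tree in ["/data/ebola_timetree.nwk", "run1.nwk", "tree.newick"]:
        print(tree, "->", default_out_basename(tree))
```

Under this approximation, a tree whose name does not end in `.nwk` (for example `tree.newick`) keeps its extension in the derived base name.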
Outputs
| Name | Type | Expression |
|---|---|---|
| time_tree | File | refine_augur_tree.tree_refined |
| auspice_input_json | File | export_auspice_json.virus_json |
Calls
This workflow calls the following tasks or subworkflows:
refine_augur_tree
Input Mappings (3)
| Input | Value |
|---|---|
| raw_tree | raw_tree |
| msa_or_vcf | msa_or_vcf |
| metadata | sample_metadata |
ancestral_traits
Input Mappings (3)
| Input | Value |
|---|---|
| tree | refine_augur_tree.tree_refined |
| metadata | sample_metadata |
| columns | select_first([ancestral_traits_to_infer, []]) |
ancestral_tree
Input Mappings (2)
| Input | Value |
|---|---|
| tree | refine_augur_tree.tree_refined |
| msa_or_vcf | msa_or_vcf |
translate_augur_tree
Input Mappings (3)
| Input | Value |
|---|---|
| tree | refine_augur_tree.tree_refined |
| nt_muts | ancestral_tree.nt_muts_json |
| genbank_gb | genbank_gb |
assign_clades_to_nodes
Input Mappings (5)
| Input | Value |
|---|---|
| tree_nwk | refine_augur_tree.tree_refined |
| nt_muts_json | ancestral_tree.nt_muts_json |
| aa_muts_json | translate_augur_tree.aa_muts_json |
| ref_fasta | ref_fasta |
| clades_tsv | select_first([clades_tsv]) |
export_auspice_json
Input Mappings (4)
| Input | Value |
|---|---|
| tree | refine_augur_tree.tree_refined |
| sample_metadata | sample_metadata |
| node_data_jsons | select_all([refine_augur_tree.branch_lengths, ancestral_traits.node_data_json, ancestral_tree.nt_muts_json, translate_augur_tree.aa_muts_json, assign_clades_to_nodes.node_clade_data_json]) |
| auspice_config | auspice_config |
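Two of the calls are conditional in the workflow source: `ancestral_traits` runs only when `ancestral_traits_to_infer` is defined and non-empty, and `assign_clades_to_nodes` runs only when `clades_tsv` is defined. `select_all` then drops the outputs of any skipped call before they reach `export_auspice_json`. The following Python sketch models that behaviour. `plan_run` is a hypothetical helper, and `select_all` here only approximates the WDL function by treating `None` as "undefined".

```python
def select_all(values):
    """Approximate WDL select_all(): keep only defined (non-None) items."""
    return [v for v in values if v is not None]


def plan_run(ancestral_traits_to_infer=None, clades_tsv=None):
    """Return (tasks that run, node data JSONs passed to the export step)."""
    run_traits = bool(ancestral_traits_to_infer)  # defined and non-empty
    run_clades = clades_tsv is not None           # defined

    tasks = ["refine_augur_tree"]
    if run_traits:
        tasks.append("ancestral_traits")
    tasks += ["ancestral_tree", "translate_augur_tree"]
    if run_clades:
        tasks.append("assign_clades_to_nodes")
    tasks.append("export_auspice_json")

    # Outputs of skipped calls are undefined and get filtered out.
    node_data = select_all([
        "refine_augur_tree.branch_lengths",
        "ancestral_traits.node_data_json" if run_traits else None,
        "ancestral_tree.nt_muts_json",
        "translate_augur_tree.aa_muts_json",
        "assign_clades_to_nodes.node_clade_data_json" if run_clades else None,
    ])
    return tasks, node_data


if __name__ == "__main__":
    # Hypothetical inputs: one trait column, no clades file.
    tasks, node_data = plan_run(ancestral_traits_to_infer=["country"])
    print(tasks)
    print(node_data)
```

With neither optional input supplied, the export step in this model receives three node data files; supplying both raises that to five.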
Images
Container images used by tasks in this workflow:
⚙️ Parameterized
Configured via input:
docker
Used by 6 tasks:
refine_augur_tree
ancestral_tree
translate_augur_tree
export_auspice_json
ancestral_traits
assign_clades_to_nodes
flowchart TD
Start([augur_from_mltree])
N1["refine_augur_tree"]
subgraph C1 ["↔️ if defined(ancestral_traits_to_infer) && length(select_first([ancestral_traits_to_infer, []])) > 0"]
direction TB
N2["ancestral_traits"]
end
N3["ancestral_tree"]
N4["translate_augur_tree"]
subgraph C2 ["↔️ if defined(clades_tsv)"]
direction TB
N5["assign_clades_to_nodes"]
end
N6["export_auspice_json"]
N1 --> N2
N1 --> N3
N3 --> N4
N1 --> N4
N3 --> N5
N1 --> N5
N4 --> N5
N5 --> N6
N1 --> N6
N4 --> N6
N3 --> N6
N2 --> N6
Start --> N1
N6 --> End([End])
classDef taskNode fill:#a371f7,stroke:#8b5cf6,stroke-width:2px,color:#fff
classDef workflowNode fill:#58a6ff,stroke:#1f6feb,stroke-width:2px,color:#fff
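The edges in the flowchart above define a dependency graph between the six calls. As a small sketch, the graph can be transcribed into Python and ordered with the standard-library `graphlib` module (Python 3.9+) to confirm one valid execution order. The dictionary below is a manual transcription of the N1 to N6 edges, and `execution_order` is a hypothetical helper, not something the workflow engine exposes.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (flowchart edges reversed).
DEPENDS_ON = {
    "refine_augur_tree": set(),
    "ancestral_traits": {"refine_augur_tree"},
    "ancestral_tree": {"refine_augur_tree"},
    "translate_augur_tree": {"refine_augur_tree", "ancestral_tree"},
    "assign_clades_to_nodes": {
        "refine_augur_tree", "ancestral_tree", "translate_augur_tree",
    },
    "export_auspice_json": {
        "refine_augur_tree", "ancestral_traits", "ancestral_tree",
        "translate_augur_tree", "assign_clades_to_nodes",
    },
}


def execution_order(depends_on=DEPENDS_ON):
    """Return one valid topological order of the tasks."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(execution_order())
```

Any valid order starts with `refine_augur_tree` and ends with `export_auspice_json`; `ancestral_traits` has no ordering constraint relative to the mutation-inference steps, so an engine is free to run it in parallel with them.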
version 1.0

import "../tasks/tasks_nextstrain.wdl" as nextstrain

workflow augur_from_mltree {
    meta {
        description: "Take a premade maximum likelihood tree (Newick format) and run the remainder of the augur pipeline (timetree modifications, ancestral inference, etc) and convert to json representation suitable for Nextstrain visualization. See https://nextstrain.org/docs/getting-started/ and https://nextstrain-augur.readthedocs.io/en/stable/"
        author: "Broad Viral Genomics"
        email: "viral-ngs@broadinstitute.org"
    }

    input {
        File           raw_tree
        File           msa_or_vcf
        File           sample_metadata
        File           ref_fasta
        File           genbank_gb
        File           auspice_config
        File?          clades_tsv
        Array[String]? ancestral_traits_to_infer
    }

    parameter_meta {
        raw_tree: {
            description: "Maximum likelihood tree (newick format).",
            patterns: ["*.nwk", "*.newick"]
        }
        msa_or_vcf: {
            description: "Multiple sequence alignment (aligned fasta) or variants (vcf format).",
            patterns: ["*.fasta", "*.fa", "*.vcf", "*.vcf.gz"]
        }
        sample_metadata: {
            description: "Metadata in tab-separated text format. See https://nextstrain-augur.readthedocs.io/en/stable/faq/metadata.html for details.",
            patterns: ["*.txt", "*.tsv"]
        }
        ref_fasta: {
            description: "A reference assembly (not included in assembly_fastas) to align assembly_fastas against. Typically from NCBI RefSeq or similar.",
            patterns: ["*.fasta", "*.fa"]
        }
        genbank_gb: {
            description: "A 'genbank' formatted gene annotation file that is used to calculate coding consequences of observed mutations. Must correspond to the same coordinate space as ref_fasta. Typically downloaded from the same NCBI accession number as ref_fasta.",
            patterns: ["*.gb", "*.gbf"]
        }
        ancestral_traits_to_infer: {
            description: "A list of metadata traits to use for ancestral node inference (see https://nextstrain-augur.readthedocs.io/en/stable/usage/cli/traits.html). Multiple traits may be specified; must correspond exactly to column headers in metadata file. Omitting these values will skip ancestral trait inference, and ancestral nodes will not have estimated values for metadata."
        }
        auspice_config: {
            description: "A file specifying options to customize the auspice export; see: https://nextstrain.github.io/auspice/customise-client/introduction",
            patterns: ["*.json", "*.txt"]
        }
        clades_tsv: {
            description: "A TSV file containing clade mutation positions in four columns: [clade gene site alt]; see: https://nextstrain.org/docs/tutorials/defining-clades",
            patterns: ["*.tsv", "*.txt"]
        }
    }

    call nextstrain.refine_augur_tree {
        input:
            raw_tree   = raw_tree,
            msa_or_vcf = msa_or_vcf,
            metadata   = sample_metadata
    }

    if(defined(ancestral_traits_to_infer) && length(select_first([ancestral_traits_to_infer,[]]))>0) {
        call nextstrain.ancestral_traits {
            input:
                tree     = refine_augur_tree.tree_refined,
                metadata = sample_metadata,
                columns  = select_first([ancestral_traits_to_infer,[]])
        }
    }

    call nextstrain.ancestral_tree {
        input:
            tree       = refine_augur_tree.tree_refined,
            msa_or_vcf = msa_or_vcf
    }

    call nextstrain.translate_augur_tree {
        input:
            tree       = refine_augur_tree.tree_refined,
            nt_muts    = ancestral_tree.nt_muts_json,
            genbank_gb = genbank_gb
    }

    if(defined(clades_tsv)) {
        call nextstrain.assign_clades_to_nodes {
            input:
                tree_nwk     = refine_augur_tree.tree_refined,
                nt_muts_json = ancestral_tree.nt_muts_json,
                aa_muts_json = translate_augur_tree.aa_muts_json,
                ref_fasta    = ref_fasta,
                clades_tsv   = select_first([clades_tsv])
        }
    }

    call nextstrain.export_auspice_json {
        input:
            tree            = refine_augur_tree.tree_refined,
            sample_metadata = sample_metadata,
            node_data_jsons = select_all([
                                  refine_augur_tree.branch_lengths,
                                  ancestral_traits.node_data_json,
                                  ancestral_tree.nt_muts_json,
                                  translate_augur_tree.aa_muts_json,
                                  assign_clades_to_nodes.node_clade_data_json]),
            auspice_config  = auspice_config
    }

    output {
        File time_tree          = refine_augur_tree.tree_refined
        File auspice_input_json = export_auspice_json.virus_json
    }
}