WORKFLOW sarscov2_data_release
| File Path | pipes/WDL/workflows/sarscov2_data_release.wdl |
|---|---|
| WDL Version | 1.0 |
| Type | workflow |
Imports
| Namespace | Path |
|---|---|
| ncbi_tools | ../tasks/tasks_ncbi_tools.wdl |
| sarscov2 | ../tasks/tasks_sarscov2.wdl |
| terra | ../tasks/tasks_terra.wdl |
| utils | ../tasks/tasks_utils.wdl |
Workflow: sarscov2_data_release
Submit data bundles to databases and repositories
Author: Broad Viral Genomics
Inputs
| Name | Type | Description | Default |
|---|---|---|---|
| flowcell_id | String | - | - |
| ncbi_ftp_config_js | File? | - | - |
| genbank_xml | File | - | - |
| genbank_zip | File | - | - |
| sra_meta_tsv | File | - | - |
| sra_bioproject | String | - | - |
| sra_data_bucket_uri | String | - | - |
| gisaid_auth_token | File? | - | - |
| gisaid_csv | File? | - | - |
| gisaid_fasta | File? | - | - |
| gcs_out_reporting | String? | - | - |
| cdc_s3_credentials | File? | - | - |
| cdc_passing_fasta | File? | - | - |
| cdc_final_metadata | File? | - | - |
| cdc_cumulative_metadata | File? | - | - |
| cdc_aligned_trimmed_bams | Array[File] | - | - |
| cdc_s3_uri | String? | - | - |
| dashboard_bucket_uri | String? | - | - |
| nop_block | String? | - | - |
| nop_block | String? | - | - |
| nop_block | String? | - | - |

15 optional inputs with default values
Outputs
| Name | Type | Expression |
|---|---|---|
| genbank_response | Array[File] | select_first([genbank_upload.reports_xmls, []]) |
| sra_xml | File? | sra_tsv_to_xml.submission_xml |
| sra_response | Array[File] | select_first([sra_upload.reports_xmls, []]) |
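
The `select_first([<call>.reports_xmls, []])` expressions in the Outputs table are a common WDL idiom for reading the output of a call that may be skipped. The sketch below illustrates that idiom in isolation. It assumes, without confirmation from this page, that the upload calls sit inside conditional blocks guarded by their optional inputs. The task `make_report` and the workflow `optional_call_demo` are hypothetical names and are not part of sarscov2_data_release.

```wdl
version 1.0

# Hypothetical task: writes one report file per run.
task make_report {
  input {
    String label
  }
  command <<<
    echo "report for ~{label}" > report.xml
  >>>
  output {
    Array[File] reports_xmls = glob("*.xml")
  }
  runtime {
    docker: "ubuntu"
  }
}

workflow optional_call_demo {
  input {
    String? label   # optional, in the same spirit as ncbi_ftp_config_js
  }

  # The call only runs when the optional input is supplied.
  if (defined(label)) {
    call make_report {
      input:
        label = select_first([label])   # String? -> String
    }
  }

  output {
    # Outside the conditional the call output has type Array[File]?;
    # fall back to an empty array when the call was skipped.
    Array[File] reports = select_first([make_report.reports_xmls, []])
  }
}
```

Run without `label`, this sketch should yield an empty `reports` array rather than fail, which matches the non-optional `Array[File]` types declared for `genbank_response` and `sra_response`.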
Calls
This workflow calls the following tasks or subworkflows:
CALL (TASKS): genbank_upload → ncbi_sftp_upload
Input Mappings (5)
| Input | Value |
|---|---|
| config_js | select_first([ncbi_ftp_config_js]) |
| submission_xml | genbank_xml |
| additional_files | [genbank_zip] |
| target_path | "~{prefix}/genbank" |
| wait_for | "1" |
CALL (TASKS): sra_tsv_to_xml
Input Mappings (4)
| Input | Value |
|---|---|
| meta_submit_tsv | sra_meta_tsv |
| config_js | select_first([ncbi_ftp_config_js]) |
| bioproject | sra_bioproject |
| data_bucket_uri | "~{sra_data_bucket_uri}/~{flowcell_id}" |
CALL (TASKS): sra_upload → ncbi_sftp_upload
Input Mappings (5)
| Input | Value |
|---|---|
| config_js | select_first([ncbi_ftp_config_js]) |
| submission_xml | sra_tsv_to_xml.submission_xml |
| additional_files | [] |
| target_path | "~{prefix}/sra" |
| wait_for | "1" |
CALL (TASKS): gisaid_uploader
Input Mappings (3)
| Input | Value |
|---|---|
| gisaid_sequences_fasta | select_first([gisaid_fasta]) |
| gisaid_meta_csv | select_first([gisaid_csv]) |
| cli_auth_token | select_first([gisaid_auth_token]) |
CALL (TASKS): meta_sanitize → tsv_drop_cols
Input Mappings (3)
| Input | Value |
|---|---|
| in_tsv | select_first([cdc_cumulative_metadata]) |
| drop_cols | ['internal_id', 'collaborator_id', 'matrix_id', 'hl7_message_id'] |
| out_filename | "metadata-cumulative.txt" |
CALL (TASKS): dashboard_delivery → gcs_copy
Input Mappings (2)
| Input | Value |
|---|---|
| infiles | [meta_sanitize.out_tsv] |
| gcs_uri_prefix | select_first([dashboard_bucket_uri]) |
CALL (TASKS): meta_final_csv → tsv_to_csv
Input Mappings (1)
| Input | Value |
|---|---|
| tsv | select_first([cdc_final_metadata]) |
CALL (TASKS): gcs_reporting_dump → gcs_copy
Input Mappings (2)
| Input | Value |
|---|---|
| infiles | [meta_final_csv.csv] |
| gcs_uri_prefix | "~{gcs_out_reporting}/" |
CALL (TASKS): today
Input Mappings (1)
| Input | Value |
|---|---|
| timezone | "America/New_York" |
CALL (TASKS): upload_complete → make_empty_file
Input Mappings (1)
| Input | Value |
|---|---|
| out_filename | "uploadcomplete.txt" |
CALL (TASKS): cumulative_meta_tsv → rename_file
Input Mappings (2)
| Input | Value |
|---|---|
| infile | select_first([cdc_cumulative_metadata]) |
| out_filename | "metadata-cumulative-~{today.date}.txt" |
CALL (TASKS): s3_cdc_dump_cumulative → s3_copy
Input Mappings (3)
| Input | Value |
|---|---|
| infiles | [cumulative_meta_tsv.out] |
| s3_uri_prefix | "~{cdc_s3_uri}/" |
| aws_credentials | select_first([cdc_s3_credentials]) |
CALL (TASKS): s3_cdc_dump_meta → s3_copy
Input Mappings (3)
| Input | Value |
|---|---|
| infiles | select_all([cdc_final_metadata, cdc_passing_fasta]) |
| s3_uri_prefix | "~{s3_prefix}/" |
| aws_credentials | select_first([cdc_s3_credentials]) |
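
The `infiles` mapping above uses `select_all` to build the upload list from two optional inputs. As a minimal sketch of that behaviour, the hypothetical workflow below (not part of sarscov2_data_release) uses `String?` stand-ins instead of `File?` so it can run without any input files; `select_all` keeps only the values that are defined.

```wdl
version 1.0

# Hypothetical demo workflow; the input names are illustrative stand-ins
# for cdc_final_metadata and cdc_passing_fasta.
workflow select_all_demo {
  input {
    String? final_metadata
    String? passing_fasta
  }

  # Undefined optionals are dropped, so the result is a plain Array[String]
  # holding zero, one, or two elements.
  Array[String] present = select_all([final_metadata, passing_fasta])

  output {
    Array[String] to_copy = present
    Int n_files = length(present)
  }
}
```

Supplying only `final_metadata` should produce a one-element `to_copy` array, so a downstream copy task receives just the files that were actually provided.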
CALL (TASKS): s3_cdc_dump_reads → s3_copy
Input Mappings (5)
| Input | Value |
|---|---|
| infiles | cdc_aligned_trimmed_bams |
| s3_uri_prefix | "~{s3_prefix}/rawfiles/" |
| aws_credentials | select_first([cdc_s3_credentials]) |
| disk_gb | 3500 |
| cpus | 16 |
CALL (TASKS): s3_cdc_complete → s3_copy
Input Mappings (4)
| Input | Value |
|---|---|
| infiles | [upload_complete.out] |
| s3_uri_prefix | "~{s3_prefix}/" |
| aws_credentials | select_first([cdc_s3_credentials]) |
| nop_block | write_lines(flatten([s3_cdc_dump_reads.out_uris, s3_cdc_dump_meta.out_uris])) |
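
The `nop_block` mapping in `s3_cdc_complete` references the outputs of `s3_cdc_dump_reads` and `s3_cdc_dump_meta`. A plausible reading, stated here as an assumption, is that the input exists only to create a data dependency, so the completion marker is uploaded after both earlier copies finish. The sketch below shows that ordering technique with hypothetical tasks (`fake_upload`, `write_marker`). It declares the dummy input as `Array[String]?` for simplicity, whereas this page lists `nop_block` as `String?`.

```wdl
version 1.0

# Hypothetical stand-in for an upload task that reports the URIs it wrote.
task fake_upload {
  input {
    String name
  }
  command <<<
    echo "s3://example-bucket/~{name}"
  >>>
  output {
    Array[String] out_uris = read_lines(stdout())
  }
  runtime {
    docker: "ubuntu"
  }
}

# Hypothetical marker task; nop_block is never used in the command.
task write_marker {
  input {
    String out_filename
    Array[String]? nop_block   # ignored; present only to force ordering
  }
  command <<<
    touch "~{out_filename}"
  >>>
  output {
    File out = "~{out_filename}"
  }
  runtime {
    docker: "ubuntu"
  }
}

workflow ordering_demo {
  # The same task is called twice under different aliases.
  call fake_upload as upload_reads { input: name = "reads" }
  call fake_upload as upload_meta  { input: name = "meta" }

  # Referencing both out_uris arrays makes the engine schedule this call
  # only after both uploads have completed.
  call write_marker {
    input:
      out_filename = "uploadcomplete.txt",
      nop_block = flatten([upload_reads.out_uris, upload_meta.out_uris])
  }

  output {
    File marker = write_marker.out
  }
}
```

WDL engines order calls by data dependencies rather than by position in the file, which is why a dummy input of this kind is one way to sequence otherwise independent steps.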
Images
Container images used by tasks in this workflow:
Parameterized Image
Configured via input: docker
Used by 3 tasks:
- genbank_upload
- sra_tsv_to_xml
- sra_upload

gisaid-cli
quay.io/broadinstitute/gisaid-cli:3.0
Used by 1 task:
- gisaid_uploader

Parameterized Image
Configured via input: docker
Used by 1 task:
- meta_sanitize

viral-baseimage
quay.io/broadinstitute/viral-baseimage:0.3.0
Used by 7 tasks:
- dashboard_delivery
- gcs_reporting_dump
- today
- s3_cdc_dump_meta
- s3_cdc_dump_reads
- s3_cdc_complete
- s3_cdc_dump_cumulative

python
python:slim
Used by 1 task:
- meta_final_csv

ubuntu
ubuntu
Used by 2 tasks:
- upload_complete
- cumulative_meta_tsv