Workflow (hybrid) metagenomic assembly and binning + GEMs
Accepts both Illumina and long reads (ONT/PacBio)
- Workflow Illumina Quality: https://workflowhub.eu/workflows/336?version=1
- Workflow LongRead Quality: https://workflowhub.eu/workflows/337
- Kraken2 taxonomic classification of FASTQ reads
- SPAdes/Flye (Assembly)
- QUAST (Assembly quality report)
Workflow binning https://workflowhub.eu/workflows/64?version=11 (optional)
- MetaBAT2/MaxBin2/SemiBin
- DAS Tool
- CheckM
- BUSCO
- GTDB-Tk
Workflow Genome-scale metabolic models https://workflowhub.eu/workflows/372 (optional)
- CarveMe (GEM generation)
- MEMOTE (GEM test suite)
- SMETANA (Species METabolic interaction ANAlysis)
Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default
All tool CWL files and other workflows can be found here:
https://gitlab.com/m-unlock/cwl
How to setup and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html
Inputs
| ID | Name | Description | Type |
|---|---|---|---|
| identifier | Identifier used | Identifier for this dataset used in this workflow | |
| illumina_forward_reads | Forward reads | Forward sequence file path | |
| illumina_reverse_reads | Reverse reads | Reverse sequence file path | |
| pacbio_reads | PacBio reads | Local file with PacBio reads | |
| nanopore_reads | Nanopore reads | Local file with Nanopore reads | |
| filter_references | Contamination reference file | BBMap reference FASTA file paths for contamination filtering | |
| use_reference_mapped_reads | Keep mapped reads | Continue with reads mapped to the given reference | |
| keep_filtered_reads | Keep filtered reads | Keep filtered reads in the final output | |
| deduplicate | Deduplicate reads | Remove exact duplicate reads with fastp | |
| kraken_database | Kraken2 database | Absolute path to the Kraken2 database location | |
| gtdbtk_data | GTDB-Tk data directory | Directory containing the GTDB-Tk repository | |
| busco_data | BUSCO dataset | Path to the BUSCO dataset download location | |
| ont_basecall_model | ONT basecalling model | Basecalling model used with Guppy (default: r941_min_high). Available: r941_trans, r941_flip213, r941_flip235, r941_min_fast, r941_min_high, r941_prom_fast, r941_prom_high. (required) | |
| pilon_fixlist | Pilon fix list | A comma-separated list of categories of issues to try to fix | |
| metagenome | When working with metagenomes | Metagenome option for assemblers | |
| run_spades | Use SPAdes | Run with SPAdes assembler | |
| run_flye | Use Flye | Run with Flye assembler | |
| run_pilon | Use Pilon | Run Pilon assembly polishing with Illumina reads | |
| binning | Run binning workflow | Run with contig binning workflow | |
| run_GEM | Run GEM workflow | Run the community genome-scale metabolic models workflow on bins | |
| run_smetana | Run SMETANA | Run SMETANA (Species METabolic interaction ANAlysis) | |
| threads | Number of threads | Number of threads to use for computational processes | |
| memory | Memory usage (MB) | Maximum memory usage in megabytes | |
| destination | Output Destination (prov only) | Not used in this workflow. Output destination used for cwl-prov reporting only. | |
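The inputs above are normally supplied to a CWL runner as a job file. The following is a minimal sketch, not the UNLOCK project's own tooling: it checks a job dictionary against the input IDs and basecalling models listed in the table and serializes it as JSON, which CWL runners generally accept as a job document. The helper name `build_job`, every path, and the value types in the example (for instance whether read inputs are single files or lists) are assumptions, since the Type column is not available here.

```python
import json

# Input IDs copied from the Inputs table.
KNOWN_INPUTS = {
    "identifier", "illumina_forward_reads", "illumina_reverse_reads",
    "pacbio_reads", "nanopore_reads", "filter_references",
    "use_reference_mapped_reads", "keep_filtered_reads", "deduplicate",
    "kraken_database", "gtdbtk_data", "busco_data", "ont_basecall_model",
    "pilon_fixlist", "metagenome", "run_spades", "run_flye", "run_pilon",
    "binning", "run_GEM", "run_smetana", "threads", "memory", "destination",
}

# Basecalling models listed in the ont_basecall_model description.
ONT_MODELS = {
    "r941_trans", "r941_flip213", "r941_flip235", "r941_min_fast",
    "r941_min_high", "r941_prom_fast", "r941_prom_high",
}


def build_job(values):
    """Check a job dict against the table and return it as a JSON string."""
    unknown = sorted(set(values) - KNOWN_INPUTS)
    if unknown:
        raise ValueError(f"unknown input IDs: {unknown}")
    model = values.get("ont_basecall_model")
    if model is not None and model not in ONT_MODELS:
        raise ValueError(f"unsupported ont_basecall_model: {model}")
    return json.dumps(values, indent=2, sort_keys=True)


# Hypothetical example: all paths, numbers, and value types are illustrative.
example = {
    "identifier": "sample_01",
    "illumina_forward_reads": [{"class": "File", "path": "/data/sample_01_R1.fastq.gz"}],
    "illumina_reverse_reads": [{"class": "File", "path": "/data/sample_01_R2.fastq.gz"}],
    "nanopore_reads": [{"class": "File", "path": "/data/sample_01_ont.fastq.gz"}],
    "ont_basecall_model": "r941_min_high",
    "run_spades": True,
    "run_flye": True,
    "binning": True,
    "threads": 16,
    "memory": 64000,
}

job_json = build_job(example)
print(job_json)
```

Rejecting unknown keys early is a deliberate choice in this sketch: a misspelled input ID would otherwise only surface once the workflow is already running, or be silently ignored.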
Steps
| ID | Name | Description |
|---|---|---|
| workflow_quality_illumina | Quality and filtering workflow | Quality assessment of Illumina reads with rRNA filtering option |
| workflow_quality_nanopore | Nanopore quality and filtering workflow | Quality and filtering workflow for Nanopore reads |
| nanopore_kraken2 | Kraken2 Nanopore | Taxonomic classification of Nanopore FASTQ reads |
| illumina_kraken2 | Kraken2 Illumina | Taxonomic classification of Illumina FASTQ reads |
| kraken2_compress | Compress kraken2 | Compress large kraken2 report file |
| kraken2_krona | Krona Kraken2 | Visualization of kraken2 with Krona |
| spades | SPAdes assembly | Genome assembly using SPAdes with Illumina/PacBio reads |
| compress_spades | SPAdes compressed | Compress the large SPAdes assembly output files |
| flye | Nanopore Flye assembly | De novo assembly of single-molecule reads with Flye |
| medaka | Medaka polishing of assembly | Medaka for polishing of assembled genome |
| metaquast_medaka | Assembly evaluation | Evaluation of polished assembly with metaQUAST |
| workflow_pilon | Pilon workflow | Illumina reads assembly polishing with Pilon |
| metaquast_pilon | Illumina assembly evaluation | Evaluation of Pilon-polished assembly with metaQUAST |
| bbmap | BBmap read mapping | Illumina read mapping using BBmap on assembled contigs |
| sam_to_sorted_bam | SAM conversion to sorted BAM | SAM file conversion to a sorted, indexed BAM file |
| contig_read_counts | Samtools idxstats | Reports alignment summary statistics |
| workflow_binning | Binning workflow | Binning workflow to create bins |
| workflow_GEM | GEM workflow | CarveMe community genome-scale metabolic models workflow from bins |
| keep_readfilter_files_to_folder | Read filtering output folder | Preparation of read filtering output files to a specific output folder |
| readfilter_files_to_folder | Read filtering output folder | Preparation of read filtering output files to a specific output folder |
| kraken2_files_to_folder | Kraken2 output folder | Preparation of Kraken2 output files to a specific output folder |
| spades_files_to_folder | SPAdes output to folder | Preparation of SPAdes output files to a specific output folder |
| flye_files_to_folder | Flye output folder | Preparation of Flye output files to a specific output folder |
| metaquast_medaka_files_to_folder | Nanopore metaQUAST output folder | Preparation of metaQUAST output files to a specific output folder |
| medaka_files_to_folder | Medaka output folder | Preparation of Medaka output files to a specific output folder |
| metaquast_pilon_files_to_folder | Illumina metaQUAST output folder | Preparation of QUAST output files to a specific output folder |
| pilon_files_to_folder | Pilon output folder | Preparation of Pilon output files to a specific output folder |
| assembly_files_to_folder | Assembly output folder | Preparation of assembly output files to a specific output folder |
| binning_files_to_folder | Binning output to folder | Preparation of binning output files and folders to a specific output folder |
| GEM_files_to_folder | GEM workflow output to folder | Preparation of GEM workflow output files and folders to a specific output folder |
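The `contig_read_counts` step runs Samtools idxstats on the sorted BAM file. As an illustration of how that output could be consumed downstream, here is a minimal sketch that parses the standard four-column, tab-separated idxstats format (reference name, sequence length, mapped read-segments, unmapped read-segments). The function name `mapped_reads_per_contig` and the sample values are hypothetical and not taken from the workflow.

```python
import csv
import io


def mapped_reads_per_contig(idxstats_text):
    """Return {contig_name: mapped_read_count} from samtools idxstats text."""
    counts = {}
    for row in csv.reader(io.StringIO(idxstats_text), delimiter="\t"):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        name, _length, mapped, _unmapped = row
        if name == "*":
            continue  # the final "*" line summarizes reads without a reference
        counts[name] = int(mapped)
    return counts


# Illustrative input only; contig names and counts are made up.
sample = (
    "contig_1\t5000\t120\t2\n"
    "contig_2\t1500\t30\t0\n"
    "*\t0\t0\t45\n"
)

counts = mapped_reads_per_contig(sample)
print(counts)
```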
Outputs
| ID | Name | Description | Type |
|---|---|---|---|
| read_filtering_output_keep | Read filtering output | Read filtering stats + filtered reads | |
| read_filtering_output | Read filtering output | Read filtering stats + filtered reads | |
| kraken2_output | Kraken2 reports | Kraken2 taxonomic classification reports | |
| assembly_output | Assembly output | Output from different assembly steps | |
| binning_output | Binning output | Binning output folders | |
| gem_output | Community GEM output | Community GEM output folder | |
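The `kraken2_output` folder holds Kraken2 classification reports. As a hedged illustration of reading one, the sketch below assumes the default six-column, tab-separated Kraken2 report layout (percentage, clade read count, directly assigned read count, rank code, NCBI taxonomy ID, indented name); reports produced with other options carry a different layout and would need a different parser. The function name `top_taxa` and the sample report lines are made up for this example.

```python
def top_taxa(report_text, rank="S", n=3):
    """Return up to n (name, taxid, clade_reads, percent) tuples for a rank."""
    hits = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            continue  # not a default-format report line
        percent, clade_reads, _direct, rank_code, taxid, name = fields
        if rank_code.strip() == rank:
            hits.append((name.strip(), int(taxid), int(clade_reads), float(percent)))
    # Most abundant clades first.
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits[:n]


# Illustrative report lines only; the counts are invented.
sample_report = (
    " 10.00\t100\t100\tU\t0\tunclassified\n"
    " 90.00\t900\t0\tR\t1\troot\n"
    " 30.00\t300\t300\tS\t1423\t        Bacillus subtilis\n"
    " 60.00\t600\t600\tS\t562\t        Escherichia coli\n"
)

species = top_taxa(sample_report)
print(species)
```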
Version History
Version 2 (latest), created 9th Sep 2025 at 13:28 by Bart Nijsse (Frozen, Version-2, d1190f4)
Major changes: This version changes the way read filtering is performed and replaces DAS Tool with Binette.
WFP, created 16th Dec 2024 at 07:46 by Bart Nijsse (Frozen, WFP, 7c7adba)
Workflow version used in analysis: "A metadata managed FAIR end-to-end workflow for microbial community Omics data analysis"
Version 1 (earliest), created 14th Jun 2022 at 09:14 by Bart Nijsse (Frozen, Version-1, 1e42c47)
Initial commit
Created: 14th Jun 2022 at 09:14
Last updated: 9th Sep 2025 at 14:47
https://orcid.org/0000-0001-8172-8981