Workflows
Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023. Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large. Reports assembly stats into a table called metrics.tsv, including selected metrics from Fasta Stats, and read coverage; reports BUSCO versions and dependencies; and displays these tables in the workflow ...
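As a rough illustration of the kind of assembly statistics that end up in a table such as metrics.tsv, the sketch below computes sequence count, total length, longest sequence, and N50 from a FASTA file and writes them as tab-separated rows. It is not the workflow's implementation (the workflow uses Fasta Statistics, Quast, and the other tools listed above). The function names and the metric labels in the output are assumptions made for this example.

```python
import csv
import io


def read_fasta_lengths(handle):
    """Return the length of every sequence in a FASTA file handle."""
    lengths = []
    current = 0
    seen_header = False
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seen_header:
                lengths.append(current)
            current = 0
            seen_header = True
        elif line:
            current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def write_metrics(lengths, out):
    """Write a two-column TSV; the metric labels are hypothetical."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["metric", "value"])
    writer.writerow(["num_sequences", len(lengths)])
    writer.writerow(["total_length", sum(lengths)])
    writer.writerow(["longest", max(lengths, default=0)])
    writer.writerow(["N50", n50(lengths)])


if __name__ == "__main__":
    # Tiny in-memory assembly standing in for assembly.fasta.
    demo = io.StringIO(">a\nACGTACGTAC\n>b\nACGTAC\n>c\nACGT\n")
    buf = io.StringIO()
    write_metrics(read_fasta_lengths(demo), buf)
    print(buf.getvalue())
```

For a real file, open assembly.fasta and pass the handle to `read_fasta_lengths`, then pass an open output file to `write_metrics`.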
gene2phylo
gene2phylo is a Snakemake pipeline for batch phylogenetic analysis of a given set of input genes.
Contents
Setup
The pipeline is written in Snakemake and uses conda to install the necessary tools.
It is strongly recommended to install conda using Mambaforge. See details here ...
The workflow takes a paired-reads collection (such as Illumina WGS or HiC), runs FastQC and SeqKit, trims with Fastp, and creates a MultiQC report. The main outputs are a paired collection of trimmed reads, a report with raw and trimmed read stats, and a table with raw read stats.
skim2rrna
skim2rrna is a Snakemake pipeline for the batch assembly, annotation, and phylogenetic analysis of ribosomal genes from low-coverage genome skims. The pipeline was designed to work with sequence data from museum collections, but it should also work with genome skims from recently collected samples.
Contents
- Setup
- Example data
- Input
- Output
- Filtering contaminants
- [Assembly and annotation ...
skim2mt
skim2mt is a Snakemake pipeline for the batch assembly, annotation, and phylogenetic analysis of mitochondrial genomes from low-coverage genome skims. The pipeline was designed to work with sequence data from museum collections, but it should also work with genome skims from recently collected samples.
Contents
- Setup
- Example data
- Input
- Output
- Filtering contaminants
- [Assembly and annotation ...
dada2 amplicon analysis for paired end data
The workflow has three main outputs:
- the sequence table (output of makeSequenceTable)
- the taxonomy (output of assignTaxonomy)
- the counts, which make it possible to track the number of sequences in each sample through the steps (output of sequence counts)
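To show how a counts table of this kind can be read, the sketch below takes per-sample sequence counts at successive steps and reports the fraction of the starting sequences retained at each step. It is only an illustration: the step names (`input`, `filtered`, `merged`, `nonchim`), the sample names, and the numbers are hypothetical, and the workflow's actual counts output may be laid out differently.

```python
def retained_fractions(track):
    """Return, per sample, the fraction of the first step's count kept at each step.

    `track` maps a sample name to an ordered list of (step, count) pairs.
    """
    result = {}
    for sample, steps in track.items():
        if not steps:
            result[sample] = {}
            continue
        start = steps[0][1]
        result[sample] = {
            # Guard against an empty sample to avoid division by zero.
            step: (count / start if start else 0.0)
            for step, count in steps
        }
    return result


if __name__ == "__main__":
    # Hypothetical counts for two samples; not real data.
    track = {
        "sampleA": [("input", 1000), ("filtered", 900), ("merged", 800), ("nonchim", 750)],
        "sampleB": [("input", 2000), ("filtered", 1000), ("merged", 900), ("nonchim", 880)],
    }
    for sample, fractions in retained_fractions(track).items():
        row = "\t".join(f"{step}={frac:.2f}" for step, frac in fractions.items())
        print(f"{sample}\t{row}")
```

A sharp drop between two adjacent steps points to the step where most sequences were lost for that sample.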
The workflow takes raw ONT reads and trimmed Illumina WGS paired reads collections, the ONT raw stats table (calculated from WF1) and the estimated genome size (calculated from WF1) to run NextDenovo and subsequently polish the assembly with HyPo. It produces collapsed assemblies (unpolished and polished) and runs all the QC analyses (gfastats, BUSCO, and Merqury).
The workflow takes raw ONT reads and trimmed Illumina WGS paired reads collections, and the estimated genome size and Max depth (both calculated from WF1) to run Flye and subsequently polish the assembly with HyPo. It produces collapsed assemblies (unpolished and polished) and runs all the QC analyses (gfastats, BUSCO, and Merqury).
score-assemblies
A Snakemake wrapper for evaluating de novo bacterial genome assemblies, e.g. from Oxford Nanopore (ONT) or Illumina sequencing.
The workflow includes the following programs:
...