Pipeline Stages
Read Processing (SC1): Long-read correction & trimming with Canu
Multi-Assembler (SC2–SC8): LJA, SPAdes, Canu, Flye, Unicycler; consensus merging via MAC2
Refinement (SC9–SC13): Evaluation, circularization, polishing (Arrow/Racon/Medaka + Polypolish), coverage
Quality Control (SC14): CheckM2 completeness & contamination assessment
Taxonomy (SC15–SC16): GTDB-Tk classification & de novo phylogenetic trees
Annotation (SC17–SC20): Bakta, Prokka, DeepFRI (deep-learning GO/EC), MicrobeAnnotator
Functional Screens (SC21–SC29): Plasmids, AMR, CAZymes, genomic islands, phages, CRISPR, IS elements
Integration (SC30): Consolidated results & interactive HTML report
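The stage-to-module mapping above can be encoded as a small lookup, which may help when you need to find the stage a given module ID belongs to. This is a minimal sketch: the function name `stage_for_module` is hypothetical and not part of StrainCascade. Only the SC ranges come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of StrainCascade): map a module ID such as
# "SC12" or "12" to its pipeline stage, using the ranges listed above.
stage_for_module() {
    local n="${1#SC}"   # drop an optional "SC" prefix
    case "$n" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;   # reject non-numeric input
    esac
    if   [ "$n" -lt 1 ];  then echo "unknown"; return 1
    elif [ "$n" -le 1 ];  then echo "Read Processing"
    elif [ "$n" -le 8 ];  then echo "Multi-Assembler"
    elif [ "$n" -le 13 ]; then echo "Refinement"
    elif [ "$n" -le 14 ]; then echo "Quality Control"
    elif [ "$n" -le 16 ]; then echo "Taxonomy"
    elif [ "$n" -le 20 ]; then echo "Annotation"
    elif [ "$n" -le 29 ]; then echo "Functional Screens"
    elif [ "$n" -le 30 ]; then echo "Integration"
    else echo "unknown"; return 1
    fi
}
```

For example, `stage_for_module SC12` prints `Refinement`, and any ID outside SC1–SC30 prints `unknown` with a non-zero exit status.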
At a Glance
Analysis modules: 30
Assembly algorithms: 5
Sequencing platforms: PacBio + ONT
Short-read support: Hybrid
Install
# Clone and install
git clone https://github.com/SBUJordi/StrainCascade.git
cd StrainCascade
find scripts/ -type f -exec chmod +x {} \;
./scripts/StrainCascade_installation.sh
Quick Start
# From long-read sequencing data
straincascade -i reads.fastq.gz -o output/ -s pacbio-hifi -t 32

# Hybrid assembly (long + short reads)
straincascade -i longreads.fastq.gz -sr1 R1.fastq.gz -sr2 R2.fastq.gz -o output/

# From pre-assembled genome
straincascade -a assembly.fasta -o output/ -t 32

# Deterministic mode for full reproducibility
straincascade -i reads.fastq.gz -o output/ --deterministic
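For many isolates, the single-sample command above can be wrapped in a loop. The following is a dry-run sketch that only prints the commands it would run. The function name `build_commands`, the `reads/` and `results/` directory names, the `*.fastq.gz` naming, and the one-output-folder-per-sample layout are assumptions for illustration. Only the flags `-i`, `-o`, `-s`, and `-t` and the value `pacbio-hifi` come from the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print one straincascade command per read file.
# Arguments: reads directory, output root, thread count, sequencing type.
build_commands() {
    local reads_dir="$1" out_root="$2" threads="$3"
    local seq_type="${4:-pacbio-hifi}"   # only value shown in Quick Start
    local reads sample
    for reads in "$reads_dir"/*.fastq.gz; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$reads" ] || continue
        sample="$(basename "$reads" .fastq.gz)"
        # Assumed layout: one output folder per sample under out_root.
        printf 'straincascade -i %s -o %s/%s/ -s %s -t %s\n' \
            "$reads" "$out_root" "$sample" "$seq_type" "$threads"
    done
}
```

Calling `build_commands reads results 32` lists the commands for review. Once they look right, they can be run one by one or piped to a shell.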
Citation
Jordi SBU, Baertschi I, Li J, Fasel N, Misselwitz B, Yilmaz B. StrainCascade: An automated, modular workflow for high-throughput long-read bacterial genome reconstruction and characterization. bioRxiv (2026). doi:10.64898/2026.02.04.698786