How Solu Analyzes Bacterial and Fungal Genomes
At Solu, our guiding principle is to combine scientific rigour with usability. We built the platform to reflect current scientific results, and we update it continually to stay aligned with new discoveries, reference data, and best practices. The goal is for every analysis to be built for accuracy, complemented by in-house tooling, and maintained as the field evolves.
By sourcing from scientifically validated databases, integrating peer-reviewed tools, and maintaining version transparency, we provide professionals with outputs they can trust, without needing to configure or maintain complex pipelines themselves.
The platform runs a variety of genomic characterisation and phylogenetic analyses, designed to cover the majority of research and epidemiological use cases. It supports whole-genome sequencing (WGS) data from bacterial and fungal isolates.
Analysis Pipeline
Step 1: Data Input
The process begins with a simple drag-and-drop upload. Solu accepts:
- FASTQ files (Illumina paired-end or Oxford Nanopore basecalled reads)
- FASTA assemblies (for already assembled genomes)
FASTQ uploads are recommended because they allow the platform to run internal quality control and assembly steps, ensuring maximum accuracy.
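As a rough illustration of how the two accepted input types can be told apart, the sketch below guesses the format from the first non-empty line of a file. It is not Solu's upload code; the function name `sniff_format` is hypothetical, and the check relies only on the general convention that FASTQ records start with `@` and FASTA records start with `>`.

```python
import gzip
from pathlib import Path


def sniff_format(path):
    """Guess 'fastq' or 'fasta' from the first non-empty line of a file.

    Hypothetical helper for illustration; handles plain and gzip-compressed text.
    """
    path = Path(path)
    # gzip files start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip leading blank lines
            if line.startswith("@"):
                return "fastq"  # FASTQ header line
            if line.startswith(">"):
                return "fasta"  # FASTA header line
            break  # first real line is neither header type
    raise ValueError(f"{path} does not look like FASTQ or FASTA")
```

A real upload step would also need to validate the records themselves; this only inspects the first header character.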
Step 2: Quality Control & Assembly
Solu automatically runs QC on uploaded FASTQ files with FastQC, then corrects and trims low-quality reads with fastp (or fastplong for Nanopore data).
- Illumina reads are assembled de novo with Shovill.
- Nanopore reads are assembled and polished with Dragonflye.
Assemblies are checked with QUAST, genome sizes are verified against the NCBI Genome API, and formats are standardized. These steps ensure that downstream analyses start with a clean, well-formed assembly.
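To make the assembly checks more concrete, here is a minimal sketch of two of the quantities involved: total assembly length and N50 (a contiguity metric that QUAST reports), followed by a simple genome-size range check. The function names and the expected size range are illustrative assumptions; in the pipeline the expected size comes from the NCBI Genome API, which is not called here.

```python
def assembly_stats(contig_lengths):
    """Return (total_length, n50) for a list of contig lengths in bp.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    if not contig_lengths:
        raise ValueError("assembly has no contigs")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return total, length


def size_is_plausible(total_length, expected_min, expected_max):
    """Check an assembly length against an expected range (values assumed)."""
    return expected_min <= total_length <= expected_max


# Toy example: four contigs, with a made-up expected range.
total, n50 = assembly_stats([100, 200, 300, 400])
print(total, n50, size_is_plausible(total, 900, 1100))
```

An assembly that falls well outside the expected size range for its species often points to contamination or an incomplete assembly, which is why a check of this kind is useful before downstream analyses.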
Step 3: Species Identification
The assembled genome is classified using BactInspector against an extended reference database that covers bacterial and fungal species, with clade-level references for Candida auris. This step determines which downstream analyses will run, since parameters are tuned automatically to the detected species.
Step 4: Genomic Characterization
Once the species is identified, Solu performs a series of analyses to characterize the genome:
- MLST typing (with PubMLST)
- AMR gene detection (using AMRFinderPlus, with a curated fungal AMR database for Candida auris, C. albicans, and A. fumigatus)
- Virulence factor screening (with ABRicate and VFDB)
- Plasmid detection and typing (using MOB-suite)
- Public genome comparison (with Mash against curated reference collections)
This provides a detailed profile of the isolate’s genotype and potential clinical significance.
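For readers unfamiliar with how Mash compares genomes, the sketch below shows the underlying idea: estimate the Jaccard index of two k-mer sets, then convert it to the Mash distance D = -(1/k) ln(2j / (1 + j)). This is a simplified illustration, not Mash itself: it uses exact k-mer sets on the forward strand only, whereas Mash works on MinHash sketches of canonical k-mers. The function names and the toy sequences are assumptions made for the example.

```python
import math


def kmers(sequence, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Convert a Jaccard index j into a Mash distance for k-mer size k."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: maximum distance
    d = -math.log(2.0 * j / (1.0 + j)) / k
    return min(1.0, max(0.0, d))  # clamp to [0, 1]


# Toy sequences; real genomes are millions of bases long.
k = 5
j = jaccard(kmers("ACGTACGTTGCA", k), kmers("ACGTACGTAGCA", k))
print(j, mash_distance(j, k))
```

A distance near 0 indicates near-identical genomes, which is what makes this kind of comparison useful for finding the closest public genomes to an isolate.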
Step 5: Phylogenetic Analysis
For multiple samples of the same species, Solu builds a SNP-based phylogeny using one of two approaches:
- Reference-based alignment (with Snippy) for common species
- Reference-free alignment (with SKA) when no suitable reference is available
Low-quality SNPs are filtered with in-house scripts, pairwise SNP distances are calculated with snp-dists, and phylogenetic trees are inferred with IQ-TREE (optionally time-scaled with TreeTime). Results include SNP distance matrices, clusters (using a 20-SNP threshold), and publication-ready phylogenetic trees.
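The distance and clustering steps can be sketched as follows. This is a simplified illustration under stated assumptions, not Solu's implementation: the pairwise distance counts only positions where both sequences carry an unambiguous A, C, G, or T, and clusters are formed by single linkage with an inclusive threshold. The text above specifies only the 20-SNP threshold, so the linkage rule, the inclusive comparison, and the function names are all assumptions.

```python
from itertools import combinations

VALID = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count differing positions where both bases are unambiguous A/C/G/T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID and b in VALID
    )


def cluster_by_threshold(alignment, threshold=20):
    """Single-linkage clusters: samples within `threshold` SNPs are joined.

    `alignment` maps sample name -> aligned sequence. Linkage rule is assumed.
    """
    parent = {name: name for name in alignment}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(alignment, 2):
        if snp_distance(alignment[a], alignment[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in alignment:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy alignment with a threshold of 1 so the grouping is easy to follow.
toy = {"s1": "AAAA", "s2": "AAAT", "s3": "AATT", "s4": "CCCC"}
print(cluster_by_threshold(toy, threshold=1))
```

With single linkage, s1 and s3 end up in the same cluster even though they differ by 2 SNPs, because each is within 1 SNP of s2. Other linkage rules would group samples differently, which is why the rule is called out as an assumption here.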
Step 6: Species-Specific Modules
For certain pathogens, additional analyses are automatically run:
- Klebsiella pneumoniae complex - K/O typing and virulence loci detection with Kleborate
- Staphylococcus aureus - SCCmec cassette typing
- Streptococcus pyogenes - emm gene typing
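One simple way to picture this species-driven routing is a lookup table from the detected species label to the extra modules listed above. The sketch below is purely illustrative: the dictionary, the function name `extra_modules`, and the exact label strings are assumptions, and mapping individual members of the Klebsiella pneumoniae complex onto the complex label is left out.

```python
# Hypothetical routing table built from the modules listed above.
SPECIES_MODULES = {
    "Klebsiella pneumoniae complex": [
        "K/O typing and virulence loci detection (Kleborate)"
    ],
    "Staphylococcus aureus": ["SCCmec cassette typing"],
    "Streptococcus pyogenes": ["emm gene typing"],
}


def extra_modules(species_label):
    """Return the extra modules for a species label, or an empty list."""
    # Return a copy so callers cannot mutate the routing table.
    return list(SPECIES_MODULES.get(species_label, []))


print(extra_modules("Staphylococcus aureus"))
print(extra_modules("Escherichia coli"))
```

Species without an entry simply get no additional modules and proceed with the standard analyses.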
Step 7: Transmission and Advanced Analyses
For supported species, Solu can also perform transmission reconstruction using a proprietary algorithm built on TreeTime. It produces dated phylogenies and inferred transmission chains, a powerful tool for outbreak investigations.
Step 8: Results
When the analysis is complete, all results are presented in an interactive dashboard showing:
- Species identification and MLST
- AMR, virulence, and plasmid profiles
- SNP distances, cluster assignments, and phylogenetic trees
- Optional transmission maps and species-specific results
All results are downloadable and ready to be cited in publications.
Fast Turnaround
Despite covering every major step of bacterial and fungal WGS analysis, the pipeline completes in 2–10 minutes per sample, fully automated in the cloud.
In Summary
Solu streamlines the entire WGS workflow, from raw reads to species ID, genomic characterization, phylogeny, and advanced analyses, while automatically tuning parameters based on the detected species. This ensures reproducible, high-quality results without manual setup, helping microbiologists and bioinformaticians get from data to insight quickly and reliably.
Note: Our methodology is updated regularly. This text describes the pipeline as of October 2025.
Explore Further
Get started for free
Create your free Solu Platform account today to start analyzing genomes.