Purpose: This study compared two consecutive pipelines for high-resolution WGS-based molecular typing.
Methods: First, whole genome multilocus sequence typing (wgMLST) was applied to WGS data from all 10,000+ isolates available on NCBI to detect clusters of highly related strains. The clusters defined by wgMLST were then characterized by whole genome single-nucleotide polymorphism (wgSNP) analysis. To maximize resolution, SNP variants were detected by mapping the WGS reads to a reference chosen from within the cluster. As working examples, we identified clusters containing isolates originating from different food sources. Both analysis pipelines were run on the BioNumerics® Calculation Engine, which is fully integrated with the BioNumerics® 7.6 software.
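The two-step logic can be illustrated with a minimal sketch. It is not the BioNumerics® implementation: the function names, the toy allele profiles, the single-linkage rule, the distance threshold of 1, and the choice of the cluster medoid as internal reference are all assumptions made for illustration only.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci called in both profiles whose allele numbers differ."""
    shared = a.keys() & b.keys()
    return sum(1 for locus in shared if a[locus] != b[locus])


def single_linkage_clusters(profiles, max_distance):
    """Group isolates linked by chains of pairs within max_distance (union-find)."""
    parent = {name: name for name in profiles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for x, y in combinations(profiles, 2):
        if allele_distance(profiles[x], profiles[y]) <= max_distance:
            parent[find(x)] = find(y)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


def pick_internal_reference(cluster, profiles):
    """Assumed rule: take the member with the smallest summed distance to the rest."""
    def total(candidate):
        return sum(allele_distance(profiles[candidate], profiles[o]) for o in cluster)
    return min(cluster, key=lambda c: (total(c), c))


# Hypothetical wgMLST profiles: locus name -> allele number.
profiles = {
    "iso1": {"locA": 1, "locB": 1, "locC": 1, "locD": 1},
    "iso2": {"locA": 1, "locB": 1, "locC": 2, "locD": 1},
    "iso3": {"locA": 1, "locB": 1, "locC": 2, "locD": 3},
    "iso4": {"locA": 7, "locB": 8, "locC": 9, "locD": 5},
}

clusters = single_linkage_clusters(profiles, max_distance=1)
reference = pick_internal_reference(clusters[0], profiles)
print(clusters, reference)
```

In this toy data, iso1, iso2 and iso3 form one cluster and iso4 stays separate. In a real workflow the reads of each cluster member would then be mapped to the chosen internal reference for SNP calling, a step this sketch does not cover.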
Results: We demonstrated that wgMLST is suitable for the analysis of very large, growing datasets, making it a practical technique for outbreak surveillance. The added resolution of wgSNP analysis against an internal reference sequence increased confidence in the detected clusters. This supports epidemiologists in their source-tracking efforts and opens opportunities for cost-efficient food safety and public health monitoring programs.
Significance: BioNumerics® 7.6 software offers a powerful platform on which both wgMLST and wgSNP analyses can be performed at high throughput and combined with traditional typing data (MLST, PFGE, etc.) to rapidly provide a robust, portable, and high-resolution picture of molecular typing data.