Purpose: We apply two pipelines for high-resolution, WGS-based molecular typing to the data from this outbreak: whole genome multilocus sequence typing (wgMLST) and whole genome single-nucleotide polymorphism analysis (wgSNP).
Methods: A wgMLST scheme is created from a set of reference sequences: all coding regions are extracted and used to define a set of discernible loci. Two independent allele-calling approaches, one assembly-free and one BLAST-based, are used to determine locus presence and detect allelic variants. The wgSNP algorithm detects SNP variants by mapping the WGS reads to a reference sequence, which can be internal or external to the data set. For both methods, all calculation-intensive data processing steps are performed on the BioNumerics® Calculation Engine, which can be deployed locally or in the cloud.
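To make the two typing ideas concrete, the following is a minimal illustrative sketch in Python. It is not the BioNumerics® implementation: the toy scheme, the exact-match allele calling (standing in for the assembly-free and BLAST-based algorithms), and the function names call_alleles, profile_distance and snp_positions are all assumptions made for illustration. Real pipelines also handle new alleles, partial matches, read mapping and quality filtering, none of which is shown here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy wgMLST scheme: locus name -> {allele sequence: allele number}.
SCHEME: Dict[str, Dict[str, int]] = {
    "locus_A": {"ATGAAATAA": 1, "ATGAAGTAA": 2},
    "locus_B": {"ATGCCCTAA": 1, "ATGCCTTAA": 2},
}

Profile = Dict[str, Optional[int]]


def call_alleles(genome: str, scheme: Dict[str, Dict[str, int]] = SCHEME) -> Profile:
    """Assign one allele number per locus by exact substring match.

    A value of None means that no known allele of the locus was found.
    """
    profile: Profile = {}
    for locus, alleles in scheme.items():
        profile[locus] = next(
            (number for seq, number in alleles.items() if seq in genome), None
        )
    return profile


def profile_distance(p: Profile, q: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    return sum(
        1
        for locus, allele in p.items()
        if allele is not None and q.get(locus) is not None and allele != q[locus]
    )


def snp_positions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Toy wgSNP step: compare an already aligned sample consensus to the reference.

    Positions where either base is 'N' (no reliable call) are skipped.
    Returns (0-based position, reference base, sample base) tuples.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s and "N" not in (r, s)
    ]


if __name__ == "__main__":
    isolate_1 = "GGATGAAATAACCATGCCCTAA"
    isolate_2 = "GGATGAAGTAACCATGCCCTAA"
    print(call_alleles(isolate_1), call_alleles(isolate_2))
    print(profile_distance(call_alleles(isolate_1), call_alleles(isolate_2)))
    print(snp_positions("ACGTACGT", "ACGAACNT"))
```

In this sketch, loci that are absent from either isolate are ignored when counting allelic differences, which is one common convention for comparing wgMLST profiles; other conventions are possible.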
Results: The BioNumerics® 7.6 software and its integrated calculation engine offer a powerful platform on which both wgMLST and wgSNP can be performed and validated against traditional typing data such as MLST or PFGE, rapidly providing a robust, portable and high-resolution picture of the molecular typing data.
Significance: Rapid and automatic processing of WGS data ensures a reliable and easy-to-follow workflow in routine molecular surveillance, reducing the time needed to detect and contain an outbreak and ultimately reducing the cost to public health and food safety.