Purpose: Rapid identification of Shigella and enteroinvasive Escherichia coli (EIEC) is critical to outbreak investigation and response. Traditional methods rely on labor-intensive biochemical and serological testing, which cannot always distinguish Shigella from EIEC.
Methods: Phylogenetic analyses were conducted on whole-genome sequence (WGS) data from 96 Shigella and EIEC isolates (capturing the diversity in the species), 18 Escherichia isolates and 2 Salmonella isolates. The data were produced on an Illumina MiSeq or obtained from public databases. SNP analyses were used to construct a phylogenetic tree and to identify cluster-specific genetic markers. Additional markers were obtained from clustering analysis of gene presence/absence in annotated genomes.
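As an illustration of what a cluster-specific SNP marker means, the following minimal sketch flags core SNP positions where every isolate in a cluster shares one allele that no isolate outside the cluster carries. It is not the pipeline used in this study: the function name `cluster_specific_snps`, the toy isolates and the cluster labels are hypothetical, and the input is assumed to be a gap-free alignment of core SNP alleles with one string per isolate.

```python
from collections import defaultdict


def cluster_specific_snps(alleles, clusters):
    """Find alleles fixed within one cluster and absent from all others.

    alleles:  dict mapping isolate name -> string of core SNP alleles
              (all strings must have the same length).
    clusters: dict mapping isolate name -> cluster label.
    Returns a dict mapping cluster label -> list of (position, base),
    with 0-based positions into the SNP alignment.
    """
    # Group isolate names by their cluster label.
    members = defaultdict(list)
    for isolate, label in clusters.items():
        members[label].append(isolate)

    n_sites = len(next(iter(alleles.values())))
    markers = defaultdict(list)

    for pos in range(n_sites):
        for label, inside in members.items():
            inside_bases = {alleles[i][pos] for i in inside}
            if len(inside_bases) != 1:
                continue  # allele is not fixed within this cluster
            base = next(iter(inside_bases))
            outside_bases = {
                alleles[i][pos] for i in alleles if clusters[i] != label
            }
            if base not in outside_bases:
                markers[label].append((pos, base))

    return dict(markers)


# Hypothetical toy data: four isolates, four core SNP sites, two clusters.
toy_alleles = {
    "iso1": "ACGT",
    "iso2": "ACGA",
    "iso3": "TCGA",
    "iso4": "TCTA",
}
toy_clusters = {"iso1": "A", "iso2": "A", "iso3": "B", "iso4": "B"}

toy_markers = cluster_specific_snps(toy_alleles, toy_clusters)
```

In this toy alignment only the first site separates the two clusters cleanly. The remaining sites are either invariant, variable within a cluster, or shared with isolates outside it, so they would not serve as markers under this strict definition.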
Results: Based on 2,863 core SNPs, the Shigella and EIEC isolates formed 10 phylogenetic clusters, rendering both groups polyphyletic. This suggests that these bacteria have evolved independently multiple times and are closely related to each other and to other pathogenic E. coli. Clustering of gene functions in metabolism, physiology and antibiotic resistance supports the SNP phylogeny. Multiple cluster-specific genetic markers are presented here to assist rapid identification.
Significance: This comprehensive examination of Shigella, EIEC and other E. coli illustrates the very close relationships between these groups and underscores that Shigella should be returned to the E. coli group in the nomenclature to reflect the phylogeny. In the meantime, the availability of markers will enable faster and more discriminatory detection for improved food safety and public health.