P3-16 Microarray SNP Analysis as a Tool for Escherichia coli Subtyping

Tuesday, July 28, 2015
Hall B (Oregon Convention Center)
David Lacher, U.S. Food and Drug Administration, Laurel, MD
Mark Mammel, U.S. Food and Drug Administration, Laurel, MD
Jayanthi Gangiredla, U.S. Food and Drug Administration, Laurel, MD
Isha Patel, U.S. Food and Drug Administration, Laurel, MD
Keith Lampel, U.S. Food and Drug Administration, Laurel, MD
Christopher Elkins, U.S. Food and Drug Administration, Laurel, MD
Introduction: Escherichia coli is a highly diverse species with both pathogenic and non-pathogenic members; the pathogenic members have been the focus of numerous food safety efforts. As a result, a wealth of genomic sequence data is now available for this organism. A rapid, cost-effective assay that makes use of these data would be an invaluable tool for understanding the evolution and diversification of the species.

Purpose: We describe the use of a novel, high-density DNA microarray representing informative single nucleotide polymorphisms (SNPs) mined from approximately 300 E. coli whole genome sequences. The custom FDA-ECID microarray has been designed and manufactured using next-generation Affymetrix PEG-GeneAtlas technology. This array is a rapid resequencing-based genomic tool for E. coli characterization and subtyping.

Methods: Three hundred whole genome sequences were used to identify ~125,000 conserved 25-mers each containing a central SNP.  Of these, ~10,000 informative SNPs were selected for inclusion and are represented on our custom FDA-ECID microarray using a SNP-typing probe strategy.
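The idea of a conserved 25-mer with a central SNP can be illustrated with a small sketch. This is not the pipeline used for the FDA-ECID array, which the abstract does not describe; it is a minimal illustration that assumes a k-mer is "conserved" when its two 12-base flanks occur exactly once in every genome, and that it carries a SNP when the central base differs among genomes. The function name `central_snp_kmers` and the toy sequences are hypothetical, and reverse-complement strands are ignored for brevity.

```python
from collections import defaultdict

K = 25  # probe length from the abstract; the SNP sits at the middle base


def central_snp_kmers(genomes, k=K):
    """Find k-mers whose flanks are shared by all genomes but whose
    central base varies.

    genomes: dict mapping genome name -> DNA string (hypothetical input).
    Returns {(left_flank, right_flank): {genome_name: central_base}}.
    Assumption: only the forward strand is scanned.
    """
    half = k // 2
    # flank pair -> genome name -> list of central bases observed
    seen = defaultdict(lambda: defaultdict(list))
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = (kmer[:half], kmer[half + 1:])
            seen[key][name].append(kmer[half])

    candidates = {}
    for key, per_genome in seen.items():
        # Conserved: the flank pair must be present in every genome...
        if len(per_genome) != len(genomes):
            continue
        # ...and exactly once per genome, so the locus is unambiguous.
        if any(len(bases) != 1 for bases in per_genome.values()):
            continue
        alleles = {bases[0] for bases in per_genome.values()}
        if len(alleles) > 1:  # more than one central allele -> SNP
            candidates[key] = {g: b[0] for g, b in per_genome.items()}
    return candidates


# Toy example: three short "genomes" differing at a single position.
left, right = "ACGTACGTACGT", "TTGCAAGCTAGC"
toy = {
    "isolate_A": "CCCC" + left + "A" + right + "GGGG",
    "isolate_B": "CCCC" + left + "G" + right + "GGGG",
    "isolate_C": "CCCC" + left + "A" + right + "GGGG",
}
snps = central_snp_kmers(toy)
```

In this toy case only the window centered on the variable base passes both filters; windows that contain the SNP off-center have flanks that differ between isolates and are discarded.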

Results: Using our optimized SNP-calling algorithms, we have analyzed data from a large collection of temporally and geographically diverse E. coli isolates. The major phylogenetic lineages within the species were recapitulated using the array SNP data. In addition, the array data were compared with the same SNPs called in silico from whole genome sequence (WGS) data, as well as with a more comprehensive set of chromosomal backbone SNPs mined from the WGS archive. Comparisons of the microarray SNP data with the WGS data show greater than 95% similarity in classifying the isolates examined.
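The abstract does not state how similarity between array and WGS results was measured. One simple way such a comparison could be quantified, shown here only as an assumed illustration, is per-isolate concordance: the fraction of SNP positions called on both platforms at which the two base calls agree. The function name `snp_concordance`, the use of "N" for a no-call, and the example strings are all hypothetical.

```python
def snp_concordance(array_calls, wgs_calls, no_call="N"):
    """Fraction of positions called on both platforms where the bases agree.

    array_calls, wgs_calls: equal-length strings of base calls for one
    isolate, in the same SNP order (hypothetical input format).
    Positions with a no-call on either platform are skipped.
    """
    if len(array_calls) != len(wgs_calls):
        raise ValueError("call strings must cover the same SNP positions")
    compared = 0
    agree = 0
    for a, w in zip(array_calls, wgs_calls):
        if a == no_call or w == no_call:
            continue
        compared += 1
        if a == w:
            agree += 1
    return agree / compared if compared else float("nan")


# Made-up calls for one isolate: 4 comparable positions, 3 in agreement.
example = snp_concordance("ACGTN", "ACGAN")
```

A classification-level comparison (for example, whether each isolate falls in the same phylogenetic lineage under both data sets) would need a different metric; this sketch covers only the per-SNP case.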

Significance: In summary, the FDA-ECID microarray is a powerful tool for molecular subtyping and phylogenetic analysis of E. coli.