Purpose: Adapt a bioinformatics pipeline strategy, EDNA, to generate inclusive pathogen profiles using metagenomic data from complex food samples.
Methods: Sequence-specific electronic queries (e-probes) targeting fecal coliforms, strains of Escherichia coli O157:H7, pathogenic E. coli strains, and Shiga toxin were developed and tested against sixteen mock sample databases (MSDs) containing 10,000 genome segments of both the pathogen and a model plant host, Vitis vinifera (grapevine), using BLASTn with hits parsed at an e-value of 1 × 10^-3. Decoy e-probe sets, designed to determine background positive levels, were developed and queried against the MSDs for statistical analysis. Precision (true positives / (true positives + false positives)) was calculated for all e-probe sets, and statistical confidence in positive calls was assessed via t-test.
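The scoring steps described in the Methods can be illustrated with a minimal Python sketch. It is not the EDNA implementation. The hit dictionary layout, the function names, and the choice of Welch's unequal-variance form of the t-test are assumptions made for illustration; the abstract states only that an e-value of 1 × 10^-3, precision, and a t-test were used, and the quantities compared here (per-probe scores for target versus decoy e-probes) are likewise assumed.

```python
import math
from statistics import mean, variance


def filter_hits(hits, max_evalue=1e-3):
    """Keep hits whose e-value is at or below the cutoff.

    `hits` is a hypothetical list of dicts with an "evalue" key,
    e.g. parsed from BLASTn tabular output.
    """
    return [h for h in hits if h["evalue"] <= max_evalue]


def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP); 0.0 when there are no positive calls."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (assumed test form)."""
    se = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical values, for illustration only.
hits = [{"probe": "p1", "evalue": 1e-10}, {"probe": "p2", "evalue": 0.5}]
kept = filter_hits(hits)              # only p1 passes the 1e-3 cutoff
p = precision(99, 1)                  # 99 true and 1 false positive call
t = welch_t([5, 6, 7], [1, 2, 3])     # target vs. decoy per-probe scores
```

Because the t statistic depends on sample size, a small e-probe set can yield low statistical confidence even when every positive call is correct.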
Results: Optimum e-probe length was established by calculating precision. When microbe abundance in samples exceeded 0.5%, precision of E. coli e-probes of 20 and 40 nucleotides averaged 99% and 99.2%, respectively. The EDNA e-probes (20-40 nt) successfully detected E. coli at abundances above 0.5%, and the Shiga toxin e-probes (80 nt) allowed detection when toxin abundance exceeded 1.0% (precision = 100%). Longer E. coli e-probes also identified the pathogen at abundances above 5%; although their precision was 100%, statistical confidence from the t-test was limited by the total number of e-probes available.
Significance: This bioinformatics approach to microbial detection has the potential for simultaneous detection of all foodborne pathogens present in a food sample.