T4-12 A Unique Workflow Consisting of Metagenomic Sequencing and Bioinformatic Analysis to Routinely Recover High Quality Cyclospora cayetanensis Whole Genome Sequences from Clinical Samples

Monday, July 10, 2017: 4:45 PM
Room 16 (Tampa Convention Center)
Gopal Gopinath, U.S. Food and Drug Administration, Laurel, MD
Hediye Cinar, U.S. Food and Drug Administration–CFSAN, Office of Applied Research and Safety Assessment, Laurel, MD
Helen Murphy, U.S. Food and Drug Administration–CFSAN, Office of Applied Research and Safety Assessment, Laurel, MD
ChaeYoon Lee, U.S. Food and Drug Administration, Laurel, MD
Sonia Almeria, U.S. Food and Drug Administration–CFSAN, Office of Applied Research and Safety Assessment, Laurel, MD
Mauricio Durigan, U.S. Food and Drug Administration–CFSAN, Office of Applied Research and Safety Assessment, Laurel, MD
Alexandre da Silva, U.S. Food and Drug Administration–CFSAN, Office of Applied Research and Safety Assessment, Laurel, MD
Introduction: The foodborne coccidian parasite Cyclospora cayetanensis causes endemic and epidemic cyclosporiasis. Lack of molecular epidemiological tools hampers detection and strain identification of this organism in the food supply.

Purpose: This technical session presents a workflow to generate good-quality assemblies of Cyclospora genomes by deep sequencing of clinical samples.

Methods: Total DNA was extracted from oocysts isolated from a Nepalese fecal sample, NF1. Nextera, Nextera XT, and Ovation kits were used to prepare libraries for metagenomic sequencing on an Illumina MiSeq platform. Bowtie2, CLC Genomics Workbench, Geneious, MEGA7, Perl scripts, AUGUSTUS, MetaPhlAn2, and Companion were used for bioinformatic analysis of the HCNY and HEN01 genome datasets, which were obtained from NCBI. Eimeria necatrix was used as a reference for gene prediction training. Apicomplexan genomes were obtained from NCBI.
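
The read-mapping step can be illustrated with a minimal Python sketch. It assumes the aligner output is plain-text SAM, the format Bowtie2 writes, and keeps only reads whose FLAG field lacks the 0x4 "unmapped" bit. The function name and the toy records are hypothetical and are not taken from the authors' pipeline.

```python
def mapped_read_names(sam_lines):
    """Return the names of reads whose SAM FLAG lacks the 0x4 (unmapped) bit."""
    names = set()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if not flag & 0x4:
            names.add(name)
    return names


# Toy SAM records (hypothetical data): QNAME, FLAG, RNAME, POS, ...
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tcontig_1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTTTTTTT\tIIIIIIIIII",
    "read3\t16\tcontig_2\t5\t30\t10M\t*\t0\t0\tGGGGCCCCAA\tIIIIIIIIII",
]

print(sorted(mapped_read_names(example_sam)))
```

In this sketch, reads such as read2 that did not align to the reference are dropped, so only reads that mapped would be carried forward to trimming and assembly.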

Results: Ovation libraries with insert sizes from 800 to 1,000 bases had higher coverage of Cyclospora reads. The Cyclospora cayetanensis HCNY WGS assembly (44 Mb) was used as the reference for mapping metagenomic reads. MetaPhlAn2 analysis showed negligible bacterial contamination in the mapped reads, which were trimmed and assembled to generate a 42.3 Mb WGS assembly with 1,786 contigs. AUGUSTUS predicted 7,402 proteins from NF1 and 8,037 from HCNY. Approximately 5,000 homologs of Eimeria were identified in both Cyclospora genomes. Complete apicoplast and mitochondrial genomes recovered from these assemblies were identical to the reference genomes. Strain-level differences between the NF1, HCNY, and HEN01 assemblies were identified by allele detection in exons. Evolutionary analysis of NF1 Cyclospora proteins with other Eucoccidioridan apicomplexans confirmed a divergent phylogenetic relationship among these important parasites.
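
Summary figures such as total assembly size and contig count can be computed directly from an assembly FASTA file. The Python sketch below is illustrative only: the function names and the toy FASTA records are hypothetical, and N50 is included as a commonly reported contiguity metric even though the abstract does not report it.

```python
def fasta_contig_lengths(fasta_lines):
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header starts a new contig; store the previous one first.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(contig_lengths):
    """Return contig count, total bases, and N50 for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        # N50: length of the contig at which half of the total is reached.
        if running * 2 >= total:
            n50 = length
            break
    return {"contigs": len(lengths), "total_bp": total, "n50": n50}


# Toy assembly (hypothetical data), not the NF1 assembly itself.
example_fasta = [
    ">contig_1", "ACGTACGTAC", "ACGTA",
    ">contig_2", "GGGGCCCC",
    ">contig_3", "TTAA",
]

print(assembly_stats(fasta_contig_lengths(example_fasta)))
```

Applied to a real assembly, the same two functions would report the contig count and total size in bases, which can then be compared against the reference assembly used for mapping.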

Significance: The presented workflow for recovering WGS assemblies of C. cayetanensis from fecal samples will make more genomes available to the food safety community. Larger allelic datasets from C. cayetanensis genes will facilitate the development of molecular fingerprinting tools for source-tracking of this important foodborne parasite.