T2-08 De Novo Assembly and Comparative Sequence Analysis of Cyclospora cayetanensis Apicoplast Genomes Originating from Diverse Geographical Regions

Monday, August 1, 2016: 10:45 AM
242 (America's Center - St. Louis)
Hediye Nese Cinar, U.S. Food and Drug Administration, Laurel, MD
Yvonne Qvarnstrom, CDC, Atlanta, GA
Yuping Wei-Pridgeon, CDC, Atlanta, GA
Wen Li, CDC, Atlanta, GA
Fernanda Nascimento, CDC, Atlanta, GA
Michael Arrowood, CDC, Atlanta, GA
Helen Murphy, U.S. Food and Drug Administration, Laurel, MD
AhYoung Jang, U.S. Food and Drug Administration, Laurel, MD
Eunje Kim, U.S. Food and Drug Administration, Laurel, MD
RaeYoung Kim, U.S. Food and Drug Administration, Laurel, MD
Alexandre DaSilva, U.S. Food and Drug Administration, Laurel, MD
Gopal Gopinath, U.S. Food and Drug Administration, Laurel, MD
Introduction: Cyclospora cayetanensis is a coccidian parasite that causes the foodborne and waterborne disease cyclosporiasis. Since the 1990s, outbreaks of cyclosporiasis have occurred almost every year in the U.S., and large outbreaks sickened hundreds of people each year from 2013 to 2015. Outbreak investigations are currently hampered because no molecular epidemiological tools are available for traceback analysis.

Purpose: The apicoplast is a non-photosynthetic plastid with an independent genome found in most apicomplexan parasites, including C. cayetanensis. Genetic markers identified in apicoplast genomes of other parasites have been useful for detection and traceback analysis. Distinct differences among the apicoplast genomes of C. cayetanensis could be useful for designing advanced molecular methods for rapid detection, subtyping, and geographical source attribution, applicable to outbreak investigations and surveillance.

Methods: We sequenced C. cayetanensis genomic DNA extracted from stool samples of patients with cyclosporiasis using the Illumina MiSeq platform. The bioinformatic workflow included Mulan, Bowtie2, Geneious, CLC Workbench, RATT, MAKER2, and NCBI BLAST+. The draft genome was manually curated using NGS data from C. cayetanensis in our collections. Raw reads from many samples originating from Nepal, New York, Texas, Indonesia, and elsewhere were mapped to the apicoplast reference. Multiple alignment of the apicoplast genomes with the reference assembly was carried out using MEGA6.
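
The comparison step can be illustrated with a minimal Python sketch. It is not the workflow used in this study (which relied on the tools named above); it only shows, under simple assumptions, how SNPs and small indels might be listed once a sample sequence has been aligned to the reference. The function name `call_variants`, the gap character `-`, and the toy sequences are hypothetical.

```python
from typing import List, Tuple

Snp = Tuple[int, str, str]      # (1-based reference position, ref base, sample base)
Indel = Tuple[int, str, str]    # (1-based reference position, "ins" or "del", bases)


def call_variants(ref_aln: str, sample_aln: str) -> Tuple[List[Snp], List[Indel]]:
    """List SNPs and indels between two rows of the same alignment.

    Both strings must have equal length and use '-' for gaps (an assumption).
    Insertions are reported at the last reference base before the inserted run;
    deletions are reported at the first deleted reference base.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have the same length")

    ref_aln, sample_aln = ref_aln.upper(), sample_aln.upper()
    snps: List[Snp] = []
    indels: List[Indel] = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)

    while i < n:
        r, s = ref_aln[i], sample_aln[i]
        if r == "-" and s == "-":
            # Column is a gap in both rows (caused by other sequences in the alignment).
            i += 1
        elif r == "-":
            # Insertion in the sample: extend over the whole run of reference gaps.
            j = i
            while j < n and ref_aln[j] == "-" and sample_aln[j] != "-":
                j += 1
            indels.append((ref_pos, "ins", sample_aln[i:j]))
            i = j
        elif s == "-":
            # Deletion in the sample: extend over the whole run of sample gaps.
            j = i
            while j < n and sample_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            indels.append((ref_pos + 1, "del", ref_aln[i:j]))
            ref_pos += j - i
            i = j
        else:
            ref_pos += 1
            # Only unambiguous bases are counted as SNPs; N and IUPAC codes are skipped.
            if r != s and r in "ACGT" and s in "ACGT":
                snps.append((ref_pos, r, s))
            i += 1

    return snps, indels


# Toy alignment (made-up sequences, not data from this study).
snps, indels = call_variants("ACGT-ACGTAC", "ACTTGAC--AC")
print(snps)
print(indels)
```

Positions are reported in reference coordinates so that variants from different samples can be compared against the same reference.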

Results: Comparative analysis against the curated and annotated circular 34,146 bp reference genome identified 20+ SNPs and some small indels spanning the reference genome, as well as a 31 bp sequence repeat in the terminal spacer region that was unique to some Nepalese samples. Phylogenetic analysis of apicoplast genomes from C. cayetanensis displayed a familiar pattern of tight clustering with Eimeria.
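
As a rough illustration of how a fixed-length sequence repeat such as the one above might be located, the following Python sketch scans a sequence for back-to-back copies of a unit of a given length. It is an assumption-laden example, not the method used in this study: the function name `find_tandem_repeats` is hypothetical, the toy sequence and its 5 bp unit are made up, and only exact (mismatch-free) tandem copies are detected.

```python
from typing import List, Tuple


def find_tandem_repeats(seq: str, unit_len: int, min_copies: int = 2) -> List[Tuple[int, str, int]]:
    """Return (0-based start, unit, copy count) for exact tandem repeats.

    A hit is a unit of length `unit_len` that is immediately followed by at
    least `min_copies - 1` identical copies of itself.
    """
    if unit_len < 1 or min_copies < 2:
        raise ValueError("unit_len must be >= 1 and min_copies >= 2")

    seq = seq.upper()
    hits: List[Tuple[int, str, int]] = []
    i = 0
    while i + unit_len * min_copies <= len(seq):
        unit = seq[i:i + unit_len]
        copies = 1
        j = i + unit_len
        # Count how many identical copies follow the first unit directly.
        while seq[j:j + unit_len] == unit:
            copies += 1
            j += unit_len
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i = j          # continue after the repeat array
        else:
            i += 1
    return hits


# Toy sequence with three copies of a made-up 5 bp unit; a real search for the
# repeat described above would use unit_len=31 on an assembled sample sequence.
toy = "TTT" + "ACGTA" * 3 + "GG"
print(find_tandem_repeats(toy, unit_len=5))
```

Comparing the copy count at the same locus across samples is one way such a repeat could serve as a marker.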

Significance: This is the first report of an end-sequence curated and annotated complete reference genome for the C. cayetanensis apicoplast. SNPs and sequence repeats from this study can be used as genetic markers for geographic differentiation applicable to traceback investigations.