INVESTIGATING RATES OF HETEROPLASMY IN THE MITOCHONDRIAL DNA CONTROL REGION OF AFRICAN, ASIAN, AND LATINO POPULATION GROUPS USING MASSIVELY PARALLEL SEQUENCING
Open Access
Author:
Demchak, Emily Louise
Area of Honors:
Forensic Science
Degree:
Bachelor of Science
Document Type:
Thesis
Thesis Supervisors:
Dr. Mitchell Mark Holland, Thesis Supervisor
Dr. Mitchell Mark Holland, Thesis Honors Advisor
Jennifer McElhoe, Faculty Reader
Keywords:
DNA, mtDNA, Heteroplasmy, NGS, MPS
Abstract:
Massively parallel sequencing (MPS), a high-throughput form of next generation sequencing, allows increased resolution of mitochondrial (mt) DNA heteroplasmy and is at the forefront of efforts to expand the utility of forensic mtDNA typing. Heteroplasmy is the presence of a heterogeneous collection of sequence variants in the cytoplasm of the cell. It is hypothesized that rates of heteroplasmy may differ among population haplogroups, based on the assumption and empirical observation that the position and rate of heteroplasmy may be linked to the haplotype sequence. The current project used an MPS approach to measure, analyze, and report rates of heteroplasmy on a per-sample and per-nucleotide basis for 377 samples from population groups self-reporting as non-European (NIJ-2016-DN-BX-0171). Buccal cells were collected from unrelated non-European individuals, and MPS analysis was conducted on the control region (CR) of the mtDNA genome using Nextera® XT library preparation and 150x150 paired-end reads on an Illumina MiSeq. Secondary analysis was performed using GeneMarker HTS software to evaluate haplotype and heteroplasmy, and HaploGrep2 to determine haplogroups. Heteroplasmy was observed in the African population at a rate of 33%, in the Asian population at 31.4%, and in the Latino population at 26.4%. When heteroplasmy did occur, the minor variant was most likely to be below 10%, and a single site of heteroplasmy within an individual was the most common occurrence. Position 16093 was a consistent hot spot for heteroplasmy across all populations, with other hot spots at homopolymeric regions and in the range of positions 185-215. This thesis presents partial results of a larger study.
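As an illustration of what per-sample and per-nucleotide rates of heteroplasmy could look like in practice, the following is a minimal Python sketch. It is not the thesis workflow or the GeneMarker HTS output format: the function name `heteroplasmy_summary`, the `(sample_id, position, minor_fraction)` tuple layout, and the toy values are all assumptions made for this example. Only the 10% minor variant cutoff comes from the abstract.

```python
from collections import Counter


def heteroplasmy_summary(calls, n_samples, low_level_cutoff=0.10):
    """Summarize heteroplasmy calls for one population group.

    calls: iterable of (sample_id, position, minor_fraction) tuples, one per
           heteroplasmic site (hypothetical layout, not a real export format).
    n_samples: total number of samples sequenced, with or without heteroplasmy.
    """
    calls = list(calls)  # allow several passes over the data

    # Number of heteroplasmic sites observed in each affected sample.
    sites_per_sample = Counter(sample for sample, _, _ in calls)

    # Per-sample rate: fraction of all samples with at least one site.
    per_sample_rate = len(sites_per_sample) / n_samples

    # Per-nucleotide view: how many samples are heteroplasmic at each position.
    per_position = Counter(position for _, position, _ in calls)

    # How many samples carry exactly 1, 2, ... heteroplasmic sites.
    site_count_distribution = Counter(sites_per_sample.values())

    # Fraction of calls whose minor variant falls below the cutoff.
    low_level = sum(1 for _, _, frac in calls if frac < low_level_cutoff)
    low_level_fraction = low_level / len(calls) if calls else 0.0

    return {
        "per_sample_rate": per_sample_rate,
        "per_position": per_position,
        "site_count_distribution": site_count_distribution,
        "low_level_fraction": low_level_fraction,
    }


# Toy data (invented for illustration; not results from the study).
toy_calls = [
    ("S1", 16093, 0.04),
    ("S2", 16093, 0.22),
    ("S2", 204, 0.06),
]
summary = heteroplasmy_summary(toy_calls, n_samples=4)
print(summary["per_sample_rate"])            # 2 of 4 samples -> 0.5
print(summary["per_position"].most_common(1))
```

In this toy input, two of four samples carry heteroplasmy, and position 16093 is the most frequently affected site. A real analysis would also need read-depth and variant-calling thresholds, which are not modeled here.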