Open Access Pre-Print
The Combined DNA Index System (CODIS) loci comprise a standard microsatellite marker set widely used to distinguish among individuals in forensic DNA identity testing for medicolegal casework in the United States and other countries. In anthropological genetic research, CODIS markers have become an important tool for uses that extend beyond case investigations: to quantify ancestry proportions, reveal patterns of admixture, and trace population histories. These investigations are especially prevalent in studies of Latin American population structure. Nevertheless, the accuracy of ancestry estimates computed from the CODIS loci for highly admixed Latino populations has not been formally tested. Longstanding arguments hold that small ancestry panels, including the CODIS loci specifically, are not suitable for ancestry inference in admixed populations because of their high heterozygosity and the limited number of loci used. Recent studies on ancestry inference using the CODIS loci suggest that these loci carry more information about population-level identifiability than is recognized in forensic genetic scholarship and by the medicolegal community. Here, we formally test the ability of CODIS and CODIS-proxy (e.g., high-heterozygosity and individual-identifiability loci) marker panels to accurately estimate the admixture proportions of individuals, including a sample of Latinos with a wide range of ancestry proportions. Using the same individuals so that outcomes can be compared directly, we produced ancestry estimates from (a) a small CODIS/CODIS-proxy locus panel and (b) a robust and validated microsatellite ancestry-informative panel. We found evidence (e.g., ρ = 0.80–0.88) that supports the use of CODIS/CODIS-proxy loci to capture the general ancestry estimation trends of a sample.
This finding is in line with the results of studies that apply CODIS to Latin American populations: the ancestry estimates generated from CODIS show trends supported by documented population histories (e.g., colonialism and population movements) and microevolutionary events (e.g., gene flow) in Latin America. However, this study also highlights the limitations of CODIS for making individual-level inferences of ancestry: at an acceptable level of statistical confidence (95%), the intervals around the estimates are too broad to support any nuanced inference about an individual’s actual ancestry composition.
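The panel comparison described above can be illustrated with a minimal sketch. It assumes that ρ denotes a Spearman rank correlation between the per-individual ancestry proportions estimated from the two panels; the abstract does not state which coefficient was used. The function names and the example values are hypothetical and are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ancestry proportions for the same five individuals,
# estimated from a small panel and from a larger reference panel.
small_panel = [0.10, 0.35, 0.50, 0.80, 0.65]
reference_panel = [0.15, 0.30, 0.55, 0.60, 0.90]

rho = spearman_rho(small_panel, reference_panel)
print(f"Spearman rho between panels: {rho:.2f}")
```

A high ρ of this kind shows only that the two panels order individuals similarly across the sample. It says nothing about how wide the interval around any single individual's estimate is, which is the limitation noted above.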
Hughes, Cris E.; Algee-Hewitt, Bridget F.B.; and Konigsberg, Lyle W.
"Population Identifiability from Forensic Genetic Markers: Ancestry Variation in Latin America,"
3, Article 1.
Available at: https://digitalcommons.wayne.edu/humbiol/vol90/iss3/1