Open Access Pre-Print
The origin and history of the Ashkenazi Jewish population have long been of great interest, and advances in high-throughput genetic analysis have recently provided a new approach for investigating these topics. We and others have argued on the basis of genome-wide data that the Ashkenazi Jewish population derives its ancestry from a combination of sources tracing to both Europe and the Middle East. It has been claimed, however, through a reanalysis of some of our data, that a large part of the ancestry of the Ashkenazi population originates with the Khazars, a Turkic-speaking group that lived to the north of the Caucasus region ~1,000 years ago. Because the Khazar population has left no obvious modern descendants that could enable a clear test for a contribution to Ashkenazi Jewish ancestry, the Khazar hypothesis has been difficult to examine using genetics. Furthermore, because only limited genetic data have been available from the Caucasus region, and because these data have been concentrated in populations that are genetically close to populations from the Middle East, the attribution of any signal of Ashkenazi-Caucasus genetic similarity to Khazar ancestry rather than to shared Middle Eastern ancestry has been problematic. Here, through integration of genotypes on newly collected samples with data from several of our past studies, we have assembled the largest data set available to date for assessment of Ashkenazi Jewish genetic origins. This data set contains genome-wide single-nucleotide polymorphisms in 1,774 samples from 106 Jewish and non-Jewish populations that span the regions of potential Ashkenazi ancestry: Europe, the Middle East, and the region historically associated with the Khazar Khaganate. The data set includes 261 samples from 15 populations from the Caucasus region and the region directly to its north, samples that have not previously been included alongside Ashkenazi Jewish samples in genomic studies.
Employing a variety of standard techniques for the analysis of population-genetic structure, we find that Ashkenazi Jews share the greatest genetic ancestry with other Jewish populations and, among non-Jewish populations, with groups from Europe and the Middle East. No particular similarity of Ashkenazi Jews with populations from the Caucasus is evident, especially with the populations that most closely represent the Khazar region. Thus, analysis of Ashkenazi Jews together with a large sample from the region of the Khazar Khaganate corroborates the earlier results that Ashkenazi Jews derive their ancestry primarily from populations of the Middle East and Europe, that they possess considerable shared ancestry with other Jewish populations, and that there is no indication of a significant genetic contribution either from within or from north of the Caucasus region.
Behar, Doron M.; Metspalu, Mait; Baran, Yael; Kopelman, Naama M.; Yunusbayev, Bayazit; Gladstein, Ariella; Tzur, Shay; Sahakyan, Hovhannes; Bahmanimehr, Ardeshir; Yepiskoposyan, Levon; Tambets, Kristiina; Khusnutdinova, Elza K.; Kushniarevich, Alena; Balanovsky, Oleg; Balanovsky, Elena; Kovacevic, Lejla; Marjanovic, Damir; Mihailov, Evelin; Kouvatsi, Anastasia; Triantaphyllidis, Costas; King, Roy J.; Semino, Ornella; Torroni, Antonio; Hammer, Michael F.; Metspalu, Ene; Skorecki, Karl; Rosset, Saharon; Halperin, Eran; Villems, Richard; and Rosenberg, Noah A.
"No Evidence from Genome-Wide Data of a Khazar Origin for the Ashkenazi Jews," Human Biology: Vol. 85: Iss. 6, Article 9.
Available at: http://digitalcommons.wayne.edu/humbiol/vol85/iss6/9
Supplemental Table 1.xlsx (23 kB)
Supplemental Table 2.xlsx (10 kB)
Behar Fig2.pdf (2048 kB)
Figure 2: Principal components analysis and spatial ancestry analysis
Behar Fig4.pdf (3298 kB)
Figure 4: Correlation of population-level mean membership proportions
Behar SuppFig1-2.pdf (591 kB)
Supplemental Figures 1-2: Scatter plot of the first, second, and third principal components for all samples included in the study
Behar SuppFig3.pdf (9564 kB)
Supplemental Figure 3: Admixture plots
Behar SuppFig6.pdf (179 kB)
Supplemental Figure 6: CLUMPP scores versus log-likelihood (LL) differences
Behar SuppFig7.pdf (1509 kB)
Supplemental Figure 7: CLUMPP scores (right y-axis) versus log-likelihood (LL) differences for different values of K