DNA typing offers a unique opportunity to identify individuals for medical and forensic purposes. Probabilistic inference regarding the chance occurrence of a match between the DNA type of an evidentiary sample and that of a suspect, however, requires reliable estimation of genotype and allele frequencies in the population. Although population-based data on DNA typing at several hypervariable loci are being accumulated at various laboratories, a rigorous treatment of the sample size needed for such purposes has not been made from population genetic considerations. It is shown here that the loci that are potentially most useful for forensic identification of individuals have the intrinsic property that they involve a large number of segregating alleles, and a great majority of these alleles are rare. As a consequence, because of the large number of possible genotypes at the hypervariable loci that offer the maximum potential for individualization, the sample size needed to observe all possible genotypes in a sample is large. In fact, the size is so large that even if such a huge number of individuals could be sampled, it could not be guaranteed that such a sample was drawn from a single homogeneous population. Therefore, adequate estimation of genotypic probabilities must be based on allele frequencies, and the sample size needed to represent all possible alleles is far more reasonable. Further economization of sample size is possible if one wants to have representation of only the frequent alleles in the sample, so that the rare allele frequencies can be approximated by an upper bound for forensic applications.
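
The contrast between genotype-based and allele-based sample sizes can be illustrated with a rough calculation. The Python sketch below is not the method of the paper; it is a simplified illustration under assumptions introduced here: independent random draws, Hardy-Weinberg proportions for the genotype frequency, and a 95% chance of seeing one particular allele or genotype at least once. Seeing all of them simultaneously would require more draws. The function names and the example values (50 alleles, a rare allele frequency of 0.01) are hypothetical.

```python
import math


def genotype_count(num_alleles: int) -> int:
    """Number of distinct unordered diploid genotypes formed by num_alleles alleles."""
    return num_alleles * (num_alleles + 1) // 2


def min_draws_to_observe(freq: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - freq)**n >= confidence.

    Assumption: draws are independent. For an allele, one draw is one
    chromosome; for a genotype, one draw is one individual.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - freq))


if __name__ == "__main__":
    num_alleles = 50      # hypothetical hypervariable locus
    rare_allele = 0.01    # hypothetical frequency of one rare allele

    print("possible genotypes:", genotype_count(num_alleles))

    # Chromosomes needed to see this allele at least once (2 per individual).
    chromosomes = min_draws_to_observe(rare_allele)
    print("individuals needed for the allele:", math.ceil(chromosomes / 2))

    # Under Hardy-Weinberg proportions (an assumption), the homozygote for
    # this allele has frequency rare_allele ** 2.
    individuals = min_draws_to_observe(rare_allele ** 2)
    print("individuals needed for the homozygote:", individuals)
```

With these illustrative inputs, the sample needed to see the rare homozygote is roughly two hundred times larger than the sample needed to see the allele itself, which matches the qualitative argument for basing genotypic probabilities on allele frequencies.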