The number of random SNP loci needed to correctly classify CHB and JPT from the HapMap data. Boxplots show the statistics of predicted versus known origin for CHB and JPT, estimated with different numbers of SNP loci. For each number of SNPs, loci were randomly sampled 100 times from the 22 autosomal chromosomes. Each box is formed by horizontal lines drawn at the 1st quartile, the median, and the 3rd quartile. A vertical dashed line extends down from the 1st quartile to the most extreme data point within 1.5 times the interquartile range (IQR), and a similar line extends up from the 3rd quartile. The ends of these vertical lines are marked by short horizontal lines, and outliers beyond them are marked by dots. Red diamonds are the mean classification error rates for the whole sample at each number of SNP loci tested, and red arrows indicate the mean ± standard deviation.
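
The boxplot elements described above can be reproduced numerically. The following is a minimal sketch, not the authors' code: the function name `boxplot_stats`, the example error rates, the use of NumPy's default (linear interpolation) quartiles, and the sample standard deviation (`ddof=1`) are all assumptions made for illustration.

```python
import numpy as np

def boxplot_stats(values, whisker=1.5):
    """Summarize one box: quartiles, whisker ends, outliers, mean and SD."""
    x = np.sort(np.asarray(values, dtype=float))
    # Quartile method is an assumption (NumPy default: linear interpolation).
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers end at the most extreme data points within whisker * IQR of the box.
    inside = x[(x >= q1 - whisker * iqr) & (x <= q3 + whisker * iqr)]
    lower, upper = inside.min(), inside.max()
    # Points beyond the whisker ends are the outliers drawn as dots.
    outliers = x[(x < lower) | (x > upper)]
    return {
        "q1": q1, "median": median, "q3": q3,
        "lower_whisker": lower, "upper_whisker": upper,
        "outliers": outliers,
        "mean": x.mean(),
        "sd": x.std(ddof=1),  # sample SD; the caption does not specify which SD
    }

# Hypothetical classification error rates from repeated random SNP samples.
error_rates = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.10, 0.14, 0.12, 0.30]
stats = boxplot_stats(error_rates)
print(stats)
```

In this made-up example the value 0.30 lies above the 3rd quartile by more than 1.5 IQR, so it would be plotted as a dot rather than extending the upper whisker.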