An EM algorithm based on an internal list for estimating haplotype distributions of rare variants from pooled genotype data

Table 6 Induced collapsed data frequencies

Haplotype y (positions of ‘1’s)	f(y) (true)	g(y), k=1	g(y), k=2	g(y), k=3	g(y), k=4
None	0.7995	0.6392	0.4085	0.2611	0.1669
1	0.0509	0.0839	0.1143	0.1169	0.1065
2	0.0034	0.0055	0.0070	0.0067	0.0058
3	0.0436	0.0716	0.0967	0.0980	0.0883
5	0.0034	0.0055	0.0070	0.0067	0.0058
6	0.0034	0.0055	0.0070	0.0067	0.0058
9	0.0073	0.0117	0.0151	0.0146	0.0125
11	0.0034	0.0055	0.0070	0.0067	0.0058
15	0.0034	0.0055	0.0070	0.0067	0.0058
19	0.0068	0.0109	0.0141	0.0136	0.0117
20	0.0068	0.0109	0.0141	0.0136	0.0117
21	0.0034	0.0055	0.0070	0.0067	0.0058
22	0.0102	0.0164	0.0213	0.0206	0.0178
23	0.0034	0.0055	0.0070	0.0067	0.0058
24	0.0102	0.0164	0.0213	0.0206	0.0178
1, 3	0.0040	0.0117	0.0307	0.0482	0.0610
1, 9	0.0029	0.0058	0.0105	0.0135	0.0148
3, 14	0.0034	0.0057	0.0082	0.0088	0.0084
6, 7	0.0204	0.0332	0.0439	0.0435	0.0384
3, 6, 7	0.0034	0.0077	0.0164	0.0231	0.0271
1, 6, 7, 24	0.0034	0.0060	0.0097	0.0119	0.0132
1, 12, 13, 22, 25	0.0034	0.0059	0.0087	0.0097	0.0096
Sum of haplotype probabilities	1.0000	0.9751	0.8822	0.7650	0.6462

True haplotype frequencies f(y) for a 25-locus example and the induced collapsed data frequencies g(y) for pool sizes k = 1 to 4. The last row gives the sum of each column over the haplotypes listed.
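The way g(y) is induced from f(y) can be illustrated with a minimal Python sketch. It rests on an assumption that this table does not state: that a pool of k diploid individuals contributes 2k independent haplotypes drawn from f, and that collapsing records, locus by locus, whether at least one of them carries a '1' (a locus-wise OR). The 'None' row is consistent with this reading, since 0.7995^2 ≈ 0.6392. The names `F_TRUE` and `collapsed_frequencies` are hypothetical and introduced only for illustration. Because the inputs are the rounded values from the table, the results may differ from the tabulated g(y) in the fourth decimal place.

```python
from collections import defaultdict

# True haplotype frequencies f(y) copied from Table 6. Each key is the set of
# loci carrying a '1'; the empty set corresponds to the "None" row.
F_TRUE = {frozenset(pos): p for pos, p in [
    ((), 0.7995), ((1,), 0.0509), ((2,), 0.0034), ((3,), 0.0436),
    ((5,), 0.0034), ((6,), 0.0034), ((9,), 0.0073), ((11,), 0.0034),
    ((15,), 0.0034), ((19,), 0.0068), ((20,), 0.0068), ((21,), 0.0034),
    ((22,), 0.0102), ((23,), 0.0034), ((24,), 0.0102), ((1, 3), 0.0040),
    ((1, 9), 0.0029), ((3, 14), 0.0034), ((6, 7), 0.0204),
    ((3, 6, 7), 0.0034), ((1, 6, 7, 24), 0.0034),
    ((1, 12, 13, 22, 25), 0.0034),
]}


def collapsed_frequencies(f, k):
    """Distribution of the locus-wise OR of 2*k haplotypes drawn i.i.d. from f.

    ASSUMPTION: collapsing a pool means taking the union of the '1' positions
    over its 2*k haplotypes. This is an illustrative reading, not a definition
    taken from the table.
    """
    dist = {frozenset(): 1.0}          # OR of zero haplotypes is "None"
    for _ in range(2 * k):             # add one haplotype at a time
        nxt = defaultdict(float)
        for acc, p in dist.items():
            for hap, q in f.items():
                nxt[acc | hap] += p * q
        dist = dict(nxt)
    return dist


if __name__ == "__main__":
    for k in (1, 2):
        g = collapsed_frequencies(F_TRUE, k)
        on_list = sum(g[y] for y in F_TRUE)   # mass on patterns listed in the table
        print(f"k={k}: g(None)={g[frozenset()]:.4f}, mass on listed patterns={on_list:.4f}")
```

Under this assumption the pooled OR can also produce patterns that are not among the listed haplotypes, which would be consistent with the column sums of g(y) in the last row falling below 1 as k grows. The loop enumerates every reachable pattern, so its cost grows quickly with k; it is meant as a check on small cases rather than as an efficient method.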

ISSN: 2730-6844