Construction and analysis of tag single nucleotide polymorphism maps for six human-mouse orthologous candidate genes in type 1 diabetes

Background: One strategy to help identify susceptibility genes for complex, multifactorial diseases is to map disease loci in a representative animal model of the disorder. The nonobese diabetic (NOD) mouse is a model for human type 1 diabetes. Linkage and congenic strain analyses have identified several NOD mouse Idd (insulin dependent diabetes) loci, which have been mapped to small chromosome intervals, for which the orthologous regions in the human genome can be identified. Here, we have conducted re-sequencing and association analysis of six orthologous genes identified in NOD Idd loci: NRAMP1/SLC11A1 (orthologous to Nramp1/Slc11a1 in Idd5.2), FRAP1 (orthologous to Frap1 in Idd9.2), 4-1BB/CD137/TNFRSF9 (orthologous to 4-1bb/Cd137/Tnfrsf9 in Idd9.3), CD101/IGSF2 (orthologous to Cd101/Igsf2 in Idd10), B2M (orthologous to B2m in Idd13) and VAV3 (orthologous to Vav3 in Idd18). Results: Re-sequencing of a total of 110 kb of DNA from 32 or 96 type 1 diabetes cases yielded 220 single nucleotide polymorphisms (SNPs). Sixty-five SNPs, including 54 informative tag SNPs, and a microsatellite were selected and genotyped in up to 1,632 type 1 diabetes families and 1,709 cases and 1,829 controls. Conclusion: None of the candidate regions showed evidence of association with type 1 diabetes (P values > 0.2), indicating that common variation in these key candidate genes does not play a major role in type 1 diabetes susceptibility in the European ancestry populations studied.


Background
Type 1 diabetes is a common, multifactorial disease believed to be caused in a proportion of cases by an autoimmune destruction of pancreatic β-cells by an inflammatory infiltrate comprising T lymphocytes, dendritic cells and macrophages. This process results from a complex interaction between genetic and environmental risk factors. Genetically, it is under the control of the major histocompatibility complex (MHC) [1] and many other genes of smaller effect and mostly unknown identity.
A murine model of type 1 diabetes, the NOD mouse, spontaneously develops an autoimmune-mediated diabetes that has many similarities to the human disease. It is likely that components of the pathophysiology and genetic predisposition are conserved across species, and indeed two loci have already been shown to affect type 1 diabetes susceptibility in both species, namely the immunoregulatory MHC HLA class II and CTLA-4 genes. The other causative gene(s) in the known Idd regions controlling type 1 diabetes susceptibility in the NOD mouse could also determine susceptibility in humans, although this depends on the frequency of susceptibility alleles in human populations, which affects statistical power, and on whether the correct candidate gene has been chosen from the Idd interval. These Idd intervals might contain many genes, including several involved in the immune response [2]. Nevertheless, in contrast to studies in humans based on linkage, the localisation of a type 1 diabetes locus to a specific chromosome region in the mouse genome using congenic strain breeding defines with certainty a set of genes, one or more of which is a susceptibility gene [3,4].
The central importance of T cell development and function in type 1 diabetes is evident from the susceptibility genes identified so far. The MHC class II genes are important etiologically in two rat models of type 1 diabetes, the Biobreeding (BB) and KDP strains [5,6], the NOD mouse strain [3] and in humans [1], with their essential function not only in T cell activation and expansion but also in T cell repertoire formation in the thymus and clonal deletion of autoreactive cells. The BB rat type 1 diabetes susceptibility locus Ian4/Iddm1 [7] affects T lymphocyte development whereas the Cblb (KDP rat) [8] and CTLA4 [9] (in humans and NOD mice) susceptibility genes highlight the importance of the regulation of T cell activation, expansion and homeostasis in the periphery, and perhaps in the thymus as well.

Results and discussion
A tag SNP approach to test for association was adopted for all genes, except for 4-1BB [16], in order to achieve cost savings in genotyping. A multi-locus test was used to evaluate the association between type 1 diabetes and the tag SNPs due to linkage disequilibrium (LD) with one or more causal variants [17]. Coding and untranslated regions of NRAMP1 (MIM 600266), FRAP1 (MIM 601231), 4-1BB (MIM 602250), CD101 (MIM 604516), B2M (MIM 109700) and VAV3 (MIM 605541) were re-sequenced in 32 or 96 randomly chosen UK white patients with type 1 diabetes to identify SNPs and to select tag SNPs. As LD between 4-1BB SNPs was weak, eight out of nine common SNPs were genotyped (minor allele frequency, MAF ≥ 0.03; one SNP could not be genotyped due to technical difficulties with the assay) and analysed using single-locus tests.
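The idea behind tagging can be illustrated with a much simpler procedure than the multi-locus method of [17]. The sketch below is a greedy pairwise selection, not the authors' algorithm: it repeatedly picks the SNP that is in strong pairwise LD with the largest number of still-untagged SNPs. The function name, the input matrix and the r² threshold of 0.8 are all assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_tag_snps(r2: Sequence[Sequence[float]], threshold: float = 0.8) -> List[int]:
    """Return indices of tag SNPs chosen by a greedy pairwise-r^2 rule.

    r2 is a symmetric matrix of pairwise r^2 values (hypothetical input).
    A SNP counts as tagged if it is itself a tag or has r^2 >= threshold
    with at least one tag. This is NOT the multi-locus method of the paper.
    """
    n = len(r2)
    untagged = set(range(n))
    tags: List[int] = []

    def coverage(i: int) -> int:
        # Number of still-untagged SNPs that SNP i would capture.
        return sum(1 for j in untagged if j == i or r2[i][j] >= threshold)

    while untagged:
        # Pick the SNP covering the most untagged SNPs; ties go to the lowest index.
        best = max(range(n), key=lambda i: (coverage(i), -i))
        tags.append(best)
        untagged -= {j for j in untagged if j == best or r2[best][j] >= threshold}
    return tags


# Hypothetical example: SNPs 0 and 1 are in strong LD, SNP 2 is independent.
example_r2 = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(greedy_tag_snps(example_r2))
```

When LD among the SNPs is weak, as reported for 4-1BB, such a procedure returns nearly every SNP as its own tag, which is why genotyping the common SNPs directly is then the simpler choice.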
A total of 110 kb of re-sequenced regions yielded 220 SNPs, including six deletion/insertion polymorphisms (DIPs) (see Table 2 and Additional files 2, 3, 4, 5, 6 and 7). No coding changes or obvious candidates for variants that could change the function or expression of 4-1BB, FRAP1, or B2M were observed. A synonymous change was detected in exon 3 of NRAMP1 (MAF = 0.32) and a nonsynonymous SNP (nsSNP) in exon 15 (MAF = 0.02), causing a non-conservative amino acid change: Asp543Asn (DIL5202/ss23142243). Interestingly, as in the case of its mouse orthologue [10], several nsSNPs were discovered in exons 3, 4, 5, and 8 of CD101 (see Additional file 5). Re-sequencing of the three alternative transcripts of VAV3, called VAV3 (27 exons), VAV3β (unique exon 1 and exons 4 to 27) and VAV3.1 (unique exon 18 and exons 19 to 27), yielded six exonic SNPs (see Additional file 7). Two SNPs, Pro611Ser (MAF = 0.13) and Gln613His (MAF = 0.13), are located in the SH3 domain of the VAV3 protein and, therefore, could result in VAV3 having altered protein interactions. To simplify the computation involved in selecting tag SNPs, VAV3 was divided into three sections as suggested by the pattern of LD across the gene.
Two common nsSNPs (MAF ≥ 0.05; DIL1521/rs7528153 and DIL3809/ss23142432) from VAV3 and a microsatellite from NRAMP1 were genotyped a priori in the whole family collection (steps 1 and 2) and a single nsSNP from CD101 in step 1 families only (DIL3794/rs3754112). The nsSNP DIL3810/ss23142433 in VAV3 was not tested because it was in fairly strong LD with DIL3809/ss23142432 (R² = 0.64), so only DIL3809/ss23142432 was genotyped. Note that in our tag approach, the two VAV3 nsSNPs (DIL1521/rs7528153 and DIL3809/ss23142432) were chosen deliberately as tag SNPs.
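For readers unfamiliar with the pairwise LD measure quoted above, a minimal sketch of the standard r² calculation for two biallelic loci follows. It assumes that phased haplotype frequencies are available; the function name and the example frequencies are hypothetical and are not taken from the study data.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles at two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: two SNPs with equal allele frequency and a shared haplotype.
print(round(ld_r_squared(p_ab=0.105, p_a=0.13, p_b=0.13), 2))
```

An r² of 1 means that one SNP predicts the other perfectly, while 0 means the alleles are statistically independent; intermediate values such as the one reported for the two VAV3 nsSNPs indicate partial redundancy.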
In a pragmatic, phased genotyping strategy, in step 1, the multi-locus test P values for association between type 1 diabetes and candidate gene tag SNPs all exceeded 0.2, as did the single-locus test P values for 4-1BB SNPs. Consequently, we did not proceed to genotype step 2 samples for any of the candidate genes (Tables 3 and 4). Note that none of the nsSNPs of VAV3 and CD101 or the microsatellite of NRAMP1 showed evidence of association (Table 5). Allele A3 of the NRAMP1 promoter microsatellite (GT)n has previously shown linkage and association with autoimmune disease, and allele A2 with infectious disease susceptibility [18-20]. The relative risks of allele A3 and genotype A3/A3 in our type 1 diabetes samples were 1.05 (95% CI = 0.94-1.17) and 0.90 (95% CI = 0.70-1.16), respectively.
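Interval estimates of this kind are commonly obtained from a Wald calculation on the log scale. The sketch below shows that calculation for an odds ratio from an unmatched 2x2 table of allele counts. It is an illustration under stated assumptions only: the function name and the counts are hypothetical, and the study's own estimates were produced by the authors' Stata routines, not by this code.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases carrying the allele       b: cases not carrying it
    c: controls carrying the allele    d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) by the usual large-sample formula.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical allele counts, chosen only to exercise the function.
estimate, low, high = odds_ratio_ci(a=620, b=1380, c=600, d=1400)
print(f"OR = {estimate:.2f} (95% CI = {low:.2f}-{high:.2f})")
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of the two bounds, which gives a quick consistency check on any reported estimate and interval.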
With regard to our association study in humans, intronic and potential regulatory regions were not sequenced in the candidate genes since these cover large genomic regions, which will have to wait for much more extensive polymorphism maps [21]. For example, for VAV3, which spans almost 400 kb, less than 10% of the genomic region was re-sequenced to identify SNPs. The general importance of intronic and intergenic regulatory sequences as candidates for disease susceptibility is well recognised. Hence, potential unidentified causal variants in introns or flanking regions of the genes may have been missed, and remain a target for future analyses. Despite finding no evidence of association, it remains possible that there exists a common disease variant in one or more of the six candidate genes tested, which either has an effect smaller than would be detected with this study or is in much weaker LD with the tag SNPs than any other SNP known to us [22].
Finally, the possibility of one or more rare disease variants in a locus needs to be considered [23]. The best candidates for rare disease variants in the six genes studied here were thus genotyped in a case-control collection of up to 1,709 type 1 diabetes cases and 1,829 controls: DIL5202/ss23142243 causes a non-conservative change in NRAMP1 (Asp543Asn, MAF = 0.02) and DIL3799/ss23142349 in CD101 (Val839Ile; MAF = 0.03). For both SNPs, P values above 0.05 were obtained (P = 0.19 for DIL5202/ss23142243 and P = 0.80 for DIL3799/ss23142349), making it less likely that these rare variants contribute to susceptibility to type 1 diabetes. Nevertheless, causal variants with MAFs less than 0.01 [24] may well remain undetected in our re-sequencing panels of 32 or 96 case DNAs. However, the re-sequencing of several hundred cases and controls is beyond the scope of the present study, in which we have investigated variants with MAF ≥ 0.03.
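The limitation of small re-sequencing panels can be made concrete with a simple binomial calculation. The sketch below assumes that the 2n chromosomes in a panel are independent draws from the population and that the allele frequency in cases equals the stated MAF; the function name is hypothetical.

```python
def detection_probability(maf: float, n_individuals: int) -> float:
    """Probability of seeing at least one copy of an allele in a panel.

    Assumes 2 * n_individuals independent chromosomes, each carrying the
    allele with probability maf (a simplifying assumption).
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    return 1.0 - (1.0 - maf) ** (2 * n_individuals)


# Panel sizes used in this study, evaluated at a MAF of 0.01.
for panel_size in (32, 96):
    print(panel_size, round(detection_probability(0.01, panel_size), 2))
```

Under these assumptions, a variant with a MAF of 0.01 would be observed at least once in roughly 47% of panels of 32 individuals and roughly 85% of panels of 96, and the probabilities fall further for rarer alleles.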

Conclusion
Taken together, these data make an association between type 1 diabetes and common variation in coding and untranslated regions of the six functional candidate genes in the investigated human-mouse orthologue regions less likely. Several possibilities may account for this. A gene (or several genes) in an Idd interval may account for disease susceptibility in the NOD mouse, but the human orthologous region may lack this susceptibility variant. The scenario, in which candidate genes in the NOD Idd interval may not necessarily be harbouring a functional, causal variant in their human orthologue genes, was discussed previously [25]. It is also possible that the selected candidate gene in the Idd interval may not be the gene causing susceptibility to disease.
The tag SNP maps described here will be useful for association studies of other diseases. They will be integrated into future SNP maps encompassing the entire orthologous regions and all regulatory sequences and genes encoded within them.

Subjects
All family members were white and of European ancestral origin. The type 1 diabetes families comprised two parents and at least one affected child. The 748 type 1 diabetes families used in 'step 1' were as described previously [26]. For 'step 2' genotyping of NRAMP1, the 748 type 1 diabetes families described above were used in addition to 343 multiplex/simplex families from the UK, 159 Norwegian simplex families, 322 Romanian simplex families, and 60 multiplex families from the USA, bringing the combined DNA sets to a total of 1,632 type 1 diabetes families, as described previously [26].

Sequencing
Nested PCR products from DNA from 96 or 32 type 1 diabetes patients were sequenced using an Applied Biosystems (ABI) 3700 capillary sequencer (Foster City, CA), and SNPs identified using the Staden Package [29].

Genotyping
SNPs were genotyped using the Invader® assay (Third Wave Technologies, Inc., Madison, WI) [30] and TaqMan MGB chemistry (ABI) [31]. The NRAMP1 microsatellite was genotyped on an ABI 3700 sequencer using fluorescent primers as previously described [32]. Full details of primers and probes used for genotyping are available upon request. All genotyping data were double-scored independently.

Statistical analysis
The programs used here for the selection of tag SNPs [17] and for association analysis are implemented in the Stata statistical system and may be downloaded from our website [35]. All genotyping data were in Hardy-Weinberg equilibrium (P > 0.05).
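A Hardy-Weinberg check of the kind reported above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The text does not state which test the authors used, so the choice of the Pearson chi-square test, the function name and the example counts are assumptions.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Pearson chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (chi-square statistic, P value on 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts close to Hardy-Weinberg proportions.
statistic, p_value = hwe_chi_square(n_aa=260, n_ab=490, n_bb=250)
print(f"chi2 = {statistic:.3f}, P = {p_value:.3f}")
```

For SNPs with low minor allele frequencies, expected counts of the rare homozygote can be small, and an exact test is often preferred over this large-sample approximation.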