Multilocus and interaction-based genome scan for alcoholism risk factors in Caucasian Americans: the COGA study


In this paper, we applied the nonparametric linkage regression approach to the Caucasian genome scan data from the Collaborative Study on the Genetics of Alcoholism to search for regions of the genome that exhibit evidence for linkage to putative alcoholism-predisposing genes. The multipoint single-locus model identified four regions of the genome with LOD scores greater than one. These regions were on 7p near D7S1790 (LOD = 1.31), two regions on 7q near D7S1870 (LOD = 1.15) and D7S1799 (LOD = 1.13) and 21q near D21S1440 and D21S1446 (LOD = 1.78). Jointly modeling these loci provided stronger evidence for linkage in each of these regions (LOD = 1.58 on 7q11, LOD = 1.61 on 11q23, and LOD = 1.95 on 21q22). The evidence for linkage tended to increase among pedigrees with earlier mean age of onset at 8q23 (p = 0.0016), 14q21 (p = 0.0079), and 18p12 (p = 0.0021) and with later mean age of onset at 4q35 (p = 0.0067) and 9p22 (p = 0.0008).


The Collaborative Study on the Genetics of Alcoholism (COGA) is a study designed to identify the genetic influences of alcoholism. Although alcoholism itself and the corresponding risk factors are heritable, they are strongly believed to be complex genetic traits. Thus, in the search for genes that influence these traits we expect significant genetic heterogeneity, gene × gene, and gene × environment interactions. Statistical methods that have the flexibility to simultaneously consider multiple loci and environmental factors are potentially valuable tools in the search for putative disease-predisposing loci. The purpose of this paper is to examine the evidence for linkage using multilocus nonparametric linkage regression modeling and explore whether the evidence for linkage varies by the age of onset of alcoholism [1, 2].


The genotyped sample provided by COGA to the Genetic Analysis Workshop 14 consists of 102 Caucasian pedigrees (1,078 individuals) and 41 non-Caucasian pedigrees (526 individuals). Given the limited number of pedigrees of non-Caucasian ethnicity, this paper focuses on the self-reported Caucasian sample, genotyped on 315 microsatellite markers and 15,406 autosomal single-nucleotide polymorphisms (SNPs). The alcohol dependence diagnosis required that an individual meet both the DSM-III-R criteria for alcohol dependence and the Feighner criteria for definite alcoholism. This yielded four affection status classifications: 1) unaffected, 2) never drank, 3) unaffected with some symptoms, and 4) affected. The primary analyses treat only individuals with affection status 4 as affected.

The initial genome scan linkage analyses were computed using the nonparametric linkage (NPL) (pairs) and NPL (all) statistics under 1) the exponential allele-sharing model implemented in GENEHUNTER PLUS [3] and 2) a conditional logistic regression parameterization denoted NPL regression [1, 2]. This regression-based approach provides a one degree of freedom test of the evidence for linkage conditional on the evidence for linkage at the other loci in the model. Model building was performed using step-wise regression techniques. To test for an interaction between two loci, we included the two loci and their statistical interaction in the model and computed the one degree of freedom test of the interaction coefficient. In addition, we tested for interactions between the degree of sharing (identity by descent (IBD)) at a locus and the pedigree-specific mean age at alcoholism diagnosis. The p-value should be interpreted as a point-wise p-value and was not adjusted for the number of comparisons across the genome. All analyses are based on multipoint IBD estimates.
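As a reading aid for the LOD scores and point-wise p-values reported in this paper, the following is a minimal sketch of the standard conversion between a LOD score and a one degree of freedom chi-square statistic, χ² = 2 ln(10) × LOD. The function names are hypothetical, and the one-sided option (halving the chi-square tail) is an assumption that is commonly made for allele-sharing tests; the paper does not state which convention its software used.

```python
import math

LN10 = math.log(10.0)


def lod_to_chi2(lod):
    """Convert a LOD score to the equivalent likelihood-ratio chi-square."""
    return 2.0 * LN10 * lod


def chi2_1df_upper_tail(x):
    """Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def lod_to_pointwise_p(lod, one_sided=True):
    """Approximate point-wise p-value for a positive LOD score.

    Assumption: with one_sided=True the chi-square tail is halved, which
    corresponds to testing only for excess allele sharing.
    """
    if lod <= 0:
        raise ValueError("lod must be positive")
    tail = chi2_1df_upper_tail(lod_to_chi2(lod))
    return 0.5 * tail if one_sided else tail


if __name__ == "__main__":
    # LOD scores quoted in the text, converted under the stated assumptions.
    for lod in (1.13, 1.31, 1.78):
        print(lod, lod_to_pointwise_p(lod))
```

These values are point-wise and, like the p-values in the text, carry no adjustment for the number of positions tested across the genome.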

Ordered subset analyses (OSAs) [4] were computed to investigate the influence of a pedigree's mean age at alcoholism diagnosis on the evidence for linkage. Analyses were conducted with families ranked by mean age of onset in ascending and then in descending order. Linkage analyses were computed on contiguous subsets of pedigrees based on the mean age of onset ranking. The statistical significance of the change in the LOD score was evaluated by a permutation test under the null hypothesis that the ranking of the covariate is independent of the LOD score of the family on the target chromosome. Thus, the families were randomly permuted with respect to the covariate ranking, and the analysis proceeded as above for each permutation of these data. The resulting empirical distribution of the change in the LOD score yielded a chromosome-wide p-value (Δp).

MERLIN [5] was also used to perform a genome scan and was subsequently used on the SNP data. Because of computation time, only the chromosomes that showed linkage with the microsatellites were run through MERLIN for linkage analysis. Cox et al. [6] examined the decay of linkage disequilibrium (LD) across the genome and found little evidence that adjacent markers exhibited significant LD, thus supporting the use of the SNP data for linkage analysis. This matters because, in the absence of parental genotype data, LD between markers can inflate the type 1 error rate in linkage analysis. The allele frequencies in the MERLIN analyses were computed in MERLIN using founders. As above, multipoint IBD estimates were computed and the NPL regression analysis was computed based on the NPL (pairs) and NPL (all) statistics.
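The ordered subset permutation test described in this section can be sketched as follows. This is an illustration only, not the OSA software of reference [4]: it assumes that each family contributes a LOD curve over a common grid of chromosome positions and that family LOD curves are simply additive, which need not hold exactly for the allele-sharing models used in the paper. The function names and the input layout are hypothetical.

```python
import random


def max_subset_lod(ranked_family_lods):
    """Largest chromosome-wide LOD over the nested subsets 1..k of families.

    ranked_family_lods: list of per-family LOD curves (one value per
    position), already sorted by the covariate ranking.
    """
    running = [0.0] * len(ranked_family_lods[0])
    best = float("-inf")
    for lods in ranked_family_lods:
        running = [r + l for r, l in zip(running, lods)]
        best = max(best, max(running))
    return best


def osa_delta_p(family_lods, covariate, n_perm=1000, descending=False, seed=0):
    """Return (change in max LOD, empirical chromosome-wide p-value)."""
    rng = random.Random(seed)
    order = sorted(range(len(covariate)), key=lambda i: covariate[i],
                   reverse=descending)
    ranked = [family_lods[i] for i in order]
    n_pos = len(ranked[0])
    # Maximum LOD using all families, the baseline for the change in LOD.
    overall = max(sum(f[j] for f in ranked) for j in range(n_pos))
    observed = max_subset_lod(ranked) - overall
    exceed = 0
    for _ in range(n_perm):
        shuffled = ranked[:]
        rng.shuffle(shuffled)  # break any link between covariate and LOD
        if max_subset_lod(shuffled) - overall >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up data: four families, two positions, mean age of onset in years.
    lods = [[1.0, 0.2], [0.8, 0.1], [-0.9, 0.0], [-0.7, -0.1]]
    ages = [20, 25, 30, 35]
    print(osa_delta_p(lods, ages, n_perm=200))
```

Setting `descending=True` corresponds to ranking families from latest to earliest mean age of onset. The `(exceed + 1) / (n_perm + 1)` form is one common way to avoid reporting an empirical p-value of exactly zero.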

Using the strictest criteria of affection status, there were 643 affected individuals from 102 families. Among these families there were 656 relative pairs including: 404 full-sib pairs, 9 half-sib pairs, 8 grandparent-grandchild pairs, 178 avuncular pairs and 19 other relative pairs. The families consisted of pedigrees with two (n = 5), three (n = 30), four (n = 32), five (n = 21), six (n = 8) and seven or greater (n = 6) individuals diagnosed with alcoholism.


Single-locus models

Two chromosomes had maximum LOD scores greater than 1.0. Ordered by the magnitude of the LOD score, these regions were chromosome 21q22 (LOD = 1.78, 58 cM near D21S1440 and D21S1446), 7p21 (LOD = 1.31, 17 cM near D7S1790), 7q11 (LOD = 1.15, 112 cM nearest D7S1870), and 7q22 (LOD = 1.13, 145 cM near D7S1799). Chromosome 11q23 near D11S1998 (120 cM) provided modest evidence for linkage in the single-locus model (LOD = 0.81). Figure 1 displays the LOD score curves for the single-locus and multilocus analyses of chromosomes 7, 11, and 21. The linkage analysis of the SNP data using the NPL (pairs) statistic from MERLIN and NPL regression continued to provide evidence for linkage in the chromosome 7p21 (LOD = 1.78) and 7q21 (LOD = 1.51) regions. However, the wide gap between microsatellite markers on chromosome 21 versus the high number of SNPs in this region led to differing results. Specifically, chromosome 21 no longer provided evidence for linkage to alcoholism. The average information content across chromosome 21 rose from 60% with the microsatellites to 94% with the SNPs. Chromosome 7 also had a marked increase in information content when moving from the microsatellite data to the SNP data, though the peaks remained comparably significant for this chromosome.

Figure 1 NPL single-locus and multiple-locus regression.

Multilocus models

The results of the multilocus NPL regression model building using the microsatellite data are summarized in Table 1. The three peaks on chromosomes 7q, 11q, and 21q continued to provide evidence for linkage. For all three positions in the model, the conditional LOD score was larger and the interval of interest, as defined by the LOD-1 interval, was smaller than the corresponding single-locus LOD score and LOD-1 interval. Figure 1 displays the LOD score curves for each of these models. Interestingly, after adjusting for the evidence of linkage at chromosome 7 at 112 cM, the remaining regions of chromosome 7 no longer exhibited statistically significant evidence for linkage. There was no strong evidence of an interaction among these loci.
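The LOD-1 interval used here to compare single-locus and multilocus results is the stretch of the chromosome around the peak where the LOD score stays within one unit of its maximum. A minimal sketch of that calculation on a gridded LOD curve is given below; the function name is hypothetical, the example values are made up and are not taken from Table 1, and the interval is reported at grid points without interpolation.

```python
def lod_minus_one_interval(positions_cm, lods):
    """Return (left, peak, right) positions of the LOD-1 support interval.

    The interval is grown outward from the peak for as long as neighbouring
    grid points stay at or above (peak LOD - 1).
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    threshold = lods[peak] - 1.0
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions_cm[left], positions_cm[peak], positions_cm[right]


if __name__ == "__main__":
    # Hypothetical LOD curve on a 5 cM grid.
    positions = [100, 105, 110, 115, 120, 125]
    curve = [0.2, 0.7, 1.4, 1.6, 0.9, 0.3]
    print(lod_minus_one_interval(positions, curve))
```

A narrower interval at a higher peak, as reported for the conditional models, means fewer grid points remain above the threshold on either side of the maximum.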

Table 1 Single- and multiple-locus NPL regression results

Linkage and age of onset interaction analysis

Table 2 summarizes the results of the NPL regression interaction analysis with age of onset of alcoholism. The evidence for linkage tended to be stronger among pedigrees with an earlier mean age of onset of alcoholism at 8q23 (p = 0.0016), 14q21 (p = 0.0079), and 18q12 (p = 0.0021). Conversely, the evidence for linkage tended to be greater among those pedigrees with later age of onset of alcoholism at 4q35 (p = 0.0067) and 9p22 (p = 0.0008). The difference in mean age of onset tended to be about two to three years among those pedigrees that linked to these regions versus those that did not link (Table 2). None of the positions identified in the initial genome scan are among those regions that exhibited linkage evidence that varied by age of onset, and the positions identified in the interaction analysis with age of onset did not show significant evidence for linkage in the initial genome scan. The only possible exception was at 7q21, where pedigrees that had a greater mean age of onset tended to have increased evidence for linkage (p = 0.0350). A similar result was also found with OSA: subsetting on the families with the later age of onset of alcoholism increased the LOD score on 7q21 (LOD = 2.62, Δp = 0.11). Subsetting on later age of onset yielded a significant increase in the LOD score on 17q23 (LOD = 2.01, Δp = 0.0409). Subsetting on earlier mean age of onset only significantly increased the evidence for linkage at 9q33 (LOD = 1.72, Δp = 0.0409). The result on 7p21 was within 10 cM of a similar result for earlier age of onset of alcoholism found with the NPL regression interaction analysis (p = 0.02586).

Table 2 NPL regression interaction analyses for age at diagnosis


A review of two previously published genome scans of alcohol dependence showed that two of our regions had been identified in those studies. The strongest evidence of linkage when looking across both the microsatellites and SNPs in this set of COGA data was on chromosome 7. The region on 7p was also identified in an American Indian population [8] within 10 cM of our peak with a nominal regression p-value of 0.009. The 7q peak is a series of peaks from about 100 cM to 160 cM; this region was found in the original COGA analyses [7] as well. The 11q peak was also replicated in the American Indian population [8] within 10 cM of our peak with a nominal regression p-value of 0.02. The 21q22 result appeared in both the allele-sharing model analysis and the NPL regression analysis, but disappeared completely in the SNP analysis.

Alcoholism is a genetically complex disease and therefore requires sophisticated consideration of multigenic and phenotypic influences. In this study, methods that consider genetic heterogeneity, gene × gene interactions, gene × age-of-onset interactions, and joint modeling of multiple loci increased the evidence for linkage at three chromosomal locations, two of which had been previously identified as being associated with alcohol dependence. These methods reduced the linkage support interval at all three loci. In addition, testing for a dependence of the evidence for linkage on age of onset identified five additional regions of interest. These results suggest the potential utility of incorporating characteristics of complex genetic traits in the analysis.



COGA: Collaborative Study on the Genetics of Alcoholism

IBD: Identity by descent

LD: Linkage disequilibrium

NPL: Nonparametric linkage

OSA: Ordered subset analysis

SNP: Single-nucleotide polymorphism


  1. Davis CC, Brown WM, Lange EM, Rich SS, Langefeld CD: Nonparametric linkage regression II: Identification of influential pedigrees in test for linkage. Genet Epidemiol. 2001, 21 (Suppl 1): S123-S129.

  2. Langefeld CD, Davis CC, Brown WM: Nonparametric linkage regression I: Combined Caucasian CSGA and German genome scans for asthma. Genet Epidemiol. 2001, 21 (S1): S136-S141.

  3. Kong A, Cox NJ: Allele-sharing models: LOD scores and accurate linkage tests. Am J Hum Genet. 1997, 61: 1179-1188. 10.1086/301592.

  4. Hauser ER, Watanabe RM, Duren WL, Langefeld CD, Boehnke M: Ordered subset analysis in genetic linkage mapping of complex traits. Genet Epidemiol. 2004, 27: 53-63. 10.1002/gepi.20000.

  5. Abecasis GR, Cherny SS, Cookson WO, Cardon LR: Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002, 30: 97-101. 10.1038/ng786.

  6. Cox ME, Campbell JK, Langefeld CD: An exploration of sex-specific linkage disequilibrium on chromosome X in Caucasians from the COGA study. BMC Genet. 2005, 6 (Suppl 1): S81. 10.1186/1471-2156-6-S1-S81.

  7. Reich T, Edenberg HJ, Goate A, Williams JT, Rice JP, Van Eerdewegh P, Foroud T, Hesselbrock V, Schuckit MA, Bucholz K, Porjesz B, Li TK, Conneally PM, Nurnberger JI, Tischfield JA, Crowe RR, Cloninger CR, Wu W, Shears S, Carr K, Crose C, Willig C, Begleiter H: Genome-wide search for genes affecting the risk for alcohol dependence. Am J Med Genet. 1998, 81: 207-215. 10.1002/(SICI)1096-8628(19980508)81:3<207::AID-AJMG1>3.0.CO;2-T.

  8. Long JC, Knowler WC, Hanson RL, Robin RW, Urbanek M, Moore E, Bennett PH, Goldman D: Evidence for genetic linkage to alcohol dependence on chromosome 4 and 11 from an autosome-wide scan in an American Indian population. Am J Med Genet. 1998, 81: 216-221. 10.1002/(SICI)1096-8628(19980508)81:3<216::AID-AJMG2>3.0.CO;2-U.



The authors acknowledge Stephanie R. Beck and Joel K. Campbell for their assistance in preparing the data.

Corresponding author

Correspondence to Adrienne H Williams.

Authors' contributions

AHW, WMB and CDL contributed to the identification of the question and wrote the manuscript. AHW conducted all analyses.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this article

Williams, A.H., Brown, W.M. & Langefeld, C.D. Multilocus and interaction-based genome scan for alcoholism risk factors in Caucasian Americans: the COGA study. BMC Genet 6 (Suppl 1), S37 (2005).
