Data
The Problem I data consist of 2885 individuals from 330 pedigrees. Of these, 1213 were from Cohort 1. Cohort 1 individuals were followed every 2 years for a total of 21 measurements. Cohort 2 individuals are offspring of Cohort 1 individuals and were followed every 4 years for a total of 5 measurements. At each follow-up, an extensive amount of medical information was obtained, including systolic blood pressure (SBP), age, height, weight, and high blood pressure treatment information.
Longitudinal analysis
We are interested in finding genes that increase the risk for cardiovascular (CV) disease. We used SBP as a surrogate for CV disease. We first analyzed the longitudinal data following the methods of Levy et al., denoted as Method 1. Specifically, we calculated the mean SBP, $\overline{\mathrm{SBP}}_i$, for the ith individual. We then used linear regression to regress $\overline{\mathrm{SBP}}_i$ on ($\overline{\mathrm{age}}_i - \overline{\mathrm{age}}$) and ($\overline{\mathrm{BMI}}_i - \overline{\mathrm{BMI}}$), where $\overline{\mathrm{age}}_i$ is the mean age of an individual, $\overline{\mathrm{BMI}}_i$ is the mean body mass index (BMI) of an individual, $\overline{\mathrm{age}}$ is the sample mean age, and $\overline{\mathrm{BMI}}$ is the sample mean BMI. The residuals from this regression analysis were then used as the quantitative phenotype for the linkage analysis.
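The Method 1 computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function name `method1_residuals` and the input layout (a mapping from person to a list of per-exam `(sbp, age, bmi)` tuples) are hypothetical, and the sample means are taken over the per-person means.

```python
from statistics import fmean


def method1_residuals(visits):
    """Return {person_id: residual} for the Method 1 regression.

    `visits` maps a person id to a list of (sbp, age, bmi) tuples, one per
    exam. This input layout is an assumption made for the sketch.
    """
    ids = list(visits)

    # Per-person means over that person's available exams.
    sbp = [fmean(v[0] for v in visits[p]) for p in ids]
    age = [fmean(v[1] for v in visits[p]) for p in ids]
    bmi = [fmean(v[2] for v in visits[p]) for p in ids]

    # Center on the sample means, as described in the text.
    age_bar, bmi_bar, sbp_bar = fmean(age), fmean(bmi), fmean(sbp)
    x = [a - age_bar for a in age]
    z = [b - bmi_bar for b in bmi]
    y = [s - sbp_bar for s in sbp]

    # With centered covariates the intercept equals the mean response, and
    # the two slopes solve a 2x2 system of normal equations (Cramer's rule).
    sxx = sum(xi * xi for xi in x)
    szz = sum(zi * zi for zi in z)
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = sxx * szz - sxz * sxz
    if det == 0:
        raise ValueError("mean age and mean BMI are collinear")
    b_age = (sxy * szz - szy * sxz) / det
    b_bmi = (szy * sxx - sxy * sxz) / det

    # Residual = observed mean SBP minus fitted value.
    return {
        p: yi - b_age * xi - b_bmi * zi
        for p, xi, zi, yi in zip(ids, x, z, y)
    }
```

Each person contributes one residual regardless of how many exams they attended, so individuals with few measurements carry the same weight in the regression as those with many.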
We also analyzed the longitudinal data with an alternative approach (Method 2). We first obtained a residual for each individual at each time point and then averaged these residuals over all time points for each person; this average was used as the phenotype in the linkage analysis. To calculate the residuals, we used generalized estimating equations (GEE) [4], a linear-model approach that accounts for the dependency among repeated measurements on the same individual. We used an exchangeable working correlation matrix and regressed SBP at each time point on age and BMI measured at that time point.
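A rough sketch of Method 2 is given below, assuming a Gaussian GEE with identity link and NumPy available. It is hand-rolled for illustration only; the source does not say which GEE software was used, and a packaged routine would normally be preferred. The function names, the moment estimator for the common correlation, and the clipping of that correlation to [0, 0.95] are assumptions of this sketch.

```python
import numpy as np


def gee_exchangeable(ids, X, y, max_iter=50, tol=1e-8):
    """Fit a Gaussian, identity-link GEE with exchangeable working correlation.

    ids: person label per observation; X: design matrix including an
    intercept column; y: SBP per observation.
    Returns (beta, alpha, per-observation residuals).
    """
    ids = np.asarray(ids)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, p = X.shape
    clusters = [np.flatnonzero(ids == g) for g in np.unique(ids)]
    n_pairs = sum(len(c) * (len(c) - 1) // 2 for c in clusters)

    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # independence fit as start
    alpha = 0.0
    for _ in range(max_iter):
        resid = y - X @ beta
        phi = resid @ resid / (n_obs - p)  # scale (residual variance)
        # Sum of within-person residual cross-products over pairs j < k.
        cross = sum(
            (resid[c].sum() ** 2 - (resid[c] ** 2).sum()) / 2.0
            for c in clusters
        )
        if phi > 0 and n_pairs > p:
            # Moment estimate of the common correlation, clipped so the
            # working matrix stays positive definite (a simplification).
            alpha = float(np.clip(cross / (phi * (n_pairs - p)), 0.0, 0.95))

        # Weighted least-squares update with V = (1 - alpha) I + alpha J.
        A = np.zeros((p, p))
        b = np.zeros(p)
        for c in clusters:
            n = len(c)
            V = (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))
            Vinv_X = np.linalg.solve(V, X[c])
            A += X[c].T @ Vinv_X
            b += Vinv_X.T @ y[c]
        new_beta = np.linalg.solve(A, b)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, alpha, y - X @ beta


def mean_residual_per_person(ids, resid):
    """Average the per-time-point residuals within each person."""
    ids = np.asarray(ids)
    return {g: float(resid[ids == g].mean()) for g in np.unique(ids).tolist()}
```

The dictionary returned by `mean_residual_per_person` plays the role of the Method 2 phenotype: one averaged residual per individual.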
For both approaches, we analyzed the phenotype data separately for males and females and for Cohorts 1 and 2, resulting in four longitudinal analyses per method. This allowed the rates of change of SBP with age and BMI to differ across the four sex-by-cohort combinations. The residuals from these four analyses were combined into one set of residuals, which was then used in the linkage analysis. The correlation between the Method 1 and Method 2 phenotypes used in the linkage analysis was 0.97.
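The stratify-then-pool step might look like the sketch below. The record layout, the function names, and the toy `center_on_stratum_mean` adjustment are all hypothetical; in the study, the per-stratum step would be the Method 1 or Method 2 regression rather than simple centering.

```python
from collections import defaultdict
from statistics import fmean


def pooled_residuals(people, residual_fn):
    """Run `residual_fn` within each sex-by-cohort stratum and pool the results.

    people: {person_id: {"sex": ..., "cohort": ..., ...}} (assumed layout).
    residual_fn: takes the members of one stratum, returns {person_id: residual}.
    """
    strata = defaultdict(dict)
    for pid, record in people.items():
        strata[(record["sex"], record["cohort"])][pid] = record

    pooled = {}
    for members in strata.values():
        # Each stratum gets its own fit, so covariate effects may differ.
        pooled.update(residual_fn(members))
    return pooled


def center_on_stratum_mean(members):
    """Toy stand-in for a per-stratum regression: subtract the stratum mean SBP."""
    mean_sbp = fmean(r["sbp"] for r in members.values())
    return {pid: r["sbp"] - mean_sbp for pid, r in members.items()}
```

Because every person belongs to exactly one stratum, the pooled dictionary holds one residual per person and can be passed to the linkage step as a single phenotype.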
Linkage analysis
Multipoint linkage analysis of the residuals was performed using a variance-component approach, which tests for linkage by testing whether the variance component associated with a particular chromosomal location is significantly greater than zero. The analyses were performed using SOLAR [5]. Because we adjusted for age and BMI in the longitudinal analysis and ran separate longitudinal analyses for each sex and cohort combination, these effects were not modeled in the linkage analysis.
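For orientation, the test described above is commonly written in the following generic form. This is the standard textbook formulation of variance-component linkage, not an equation taken from the source, and all symbols are notation assumed here: $y_j$ is the residual phenotype of individual $j$, $\hat{\pi}_{jk}$ is the estimated proportion of alleles shared identical by descent by relatives $j$ and $k$ at the tested location, $\phi_{jk}$ is their kinship coefficient, and $\delta_{jk}$ equals 1 when $j = k$ and 0 otherwise.

$$
\mathrm{Cov}(y_j, y_k) = \hat{\pi}_{jk}\,\sigma_q^2 + 2\phi_{jk}\,\sigma_g^2 + \delta_{jk}\,\sigma_e^2
$$

$$
H_0\colon \sigma_q^2 = 0 \quad \text{versus} \quad H_1\colon \sigma_q^2 > 0, \qquad
\mathrm{LOD} = \log_{10} \frac{L(\hat{\sigma}_q^2, \hat{\sigma}_g^2, \hat{\sigma}_e^2)}{L(\sigma_q^2 = 0, \tilde{\sigma}_g^2, \tilde{\sigma}_e^2)}
$$

Here $\sigma_q^2$ is the variance attributed to the tested location, $\sigma_g^2$ the residual additive genetic variance, and $\sigma_e^2$ the individual-specific environmental variance; hats denote maximum-likelihood estimates under $H_1$ and tildes those under $H_0$.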