Data
The Framingham Heart Study data include 330 pedigrees originally selected for a genome-scan analysis. The pedigrees consisted of 4692 subjects, of whom 2885 have participated in the Framingham Heart Study. Longitudinal SBP data were analyzed for 25,263 examinations on 2662 individuals. Height, weight, gender, age, and hypertensive treatment information were required; if height was missing, the most recent measurement was imputed. Because there might be important variation in individual SBP measurements among younger and older subjects, we also restricted the sample to individuals aged between 25 and 75 years, as in Levy et al. [2]. The following selection criteria were also applied: 1) there had to be at least 10 years between a subject's initial and final examinations within the age range; 2) at least four examinations within the age range were required for original cohort participants and at least three for offspring cohort participants [2]. Data from 24,840 examinations on 2530 individuals were available in the selected sample. For the genome-wide scan analysis, 1702 genotyped individuals were included (394 from Cohort 1 and 1308 from Cohort 2).
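The selection criteria above can be sketched as a small filter. This is only an illustration under stated assumptions: the input layout (tuples of subject, cohort, and age at exam), the function name `select_subjects`, and the treatment of the 25 and 75 year bounds as inclusive are hypothetical choices, not details taken from the study.

```python
# Hypothetical sketch of the sample selection rules; names and data layout are assumptions.
AGE_MIN, AGE_MAX = 25, 75        # age window (bounds assumed inclusive)
MIN_SPAN_YEARS = 10              # minimum years between first and last in-range exam
MIN_EXAMS = {1: 4, 2: 3}         # cohort -> minimum number of in-range exams


def select_subjects(exams):
    """Return {subject_id: sorted in-range ages} for subjects meeting the criteria.

    `exams` is an iterable of (subject_id, cohort, age_at_exam) tuples,
    where cohort is 1 (original cohort) or 2 (offspring cohort).
    """
    in_range = {}
    for subject_id, cohort, age in exams:
        # Keep only examinations that fall inside the age window.
        if AGE_MIN <= age <= AGE_MAX:
            in_range.setdefault((subject_id, cohort), []).append(age)

    selected = {}
    for (subject_id, cohort), ages in in_range.items():
        ages.sort()
        enough_exams = len(ages) >= MIN_EXAMS[cohort]
        long_enough = ages[-1] - ages[0] >= MIN_SPAN_YEARS
        if enough_exams and long_enough:
            selected[subject_id] = ages
    return selected


if __name__ == "__main__":
    toy = [("a", 1, 30), ("a", 1, 34), ("a", 1, 38), ("a", 1, 42),
           ("b", 2, 40), ("b", 2, 44), ("b", 2, 48)]
    print(sorted(select_subjects(toy)))
```

In the toy data, subject "a" is kept (four exams spanning 12 years), while subject "b" is dropped because its exams span only 8 years.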
Multilevel analysis of the longitudinal SBP model
Let the random variable $Y_{ijk}$ denote the SBP measurement at the ith examination for the jth individual in pedigree k. We then assume that $Y_{ijk}$ satisfies the following general multilevel model:
Within-subject model – Level 1
where i = 1,...,21 for Cohort 1 subjects and i ∈ {11, 15, 17, 19, 21} for Cohort 2 subjects. $\mathrm{Age}_{ijk}$, $\mathrm{BMI}_{ijk}$, and $\mathrm{Treat}_{ijk}$ are the age, body mass index, and hypertension treatment status (1 for treated subjects, 0 for untreated subjects) at the ith exam for the jth individual in pedigree k; $\overline{\mathrm{Age}}_{jk}$ and $\overline{\mathrm{BMI}}_{jk}$ are the mean values across all exams for the jth individual; and the $\varepsilon_{ijk}$ are the error components that account for the within-individual variability. The $\varepsilon_{ijk}$ are assumed to be normally distributed with mean vector zero and variance-covariance matrix Σ defined by a first-order autoregressive structure. The intercept $b_{0jk}$ represents the average SBP for an untreated subject of average age and BMI across all of the subject's examinations. The regression coefficient $b_{1jk}$ models the linear variation of SBP with age. We found that every individual profile could be well approximated by a quadratic function of time, measured by the age at examination. We also tested a cubic effect, but it was not significant when we allowed the individual's linear time trend to differ in each treatment group (interaction between age and treatment). Random effects were added to reflect the natural heterogeneity in the population. In this model, both the intercept and the linear effect for age were allowed to vary across individuals, and the individual-specific regression coefficients (random effects) were defined at the second level:
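For orientation, one way to write a Level 1 model that matches the description above is sketched here. Only $b_{0jk}$ and $b_{1jk}$ are named in the text; the fixed-effect names $b_2$ to $b_5$, the centering of each covariate on the subject's own mean, and the parameters $\sigma^2$ and $\rho$ of the autoregressive structure are assumptions made for illustration.

```latex
% Sketch only: b_2..b_5, the centering, and (sigma^2, rho) are assumed names/forms.
\[
\begin{aligned}
Y_{ijk} ={}& b_{0jk}
  + b_{1jk}\,(\mathrm{Age}_{ijk} - \overline{\mathrm{Age}}_{jk})
  + b_{2}\,(\mathrm{Age}_{ijk} - \overline{\mathrm{Age}}_{jk})^{2} \\
 &+ b_{3}\,(\mathrm{BMI}_{ijk} - \overline{\mathrm{BMI}}_{jk})
  + b_{4}\,\mathrm{Treat}_{ijk}
  + b_{5}\,\mathrm{Treat}_{ijk}\,(\mathrm{Age}_{ijk} - \overline{\mathrm{Age}}_{jk})
  + \varepsilon_{ijk},
\end{aligned}
\]
% Standard first-order autoregressive form, with the lag taken on the exam index (assumed):
\[
\operatorname{Cov}(\varepsilon_{ijk}, \varepsilon_{i'jk}) = \sigma^{2}\,\rho^{\,|i - i'|}.
\]
```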
Subject random-intercept model – Level 2
Subject random-slope model – Level 2
where $\overline{\mathrm{Age}}$ and $\overline{\mathrm{BMI}}$ are the sample means for age and body mass index, and Sex and Cohort are two indicator variables: Sex is coded 1 for males and 0 for females, and Cohort is coded 0 for Cohort 1 subjects and 1 for Cohort 2 subjects. The random components $u_{0jk}$ and $u_{1jk}$ measure the variation of each individual's mean SBP and slope from their averages in pedigree k. The intercept $b_{00k}$ represents the average SBP in pedigree k for males in Cohort 1 with average age and BMI, and the intercept $b_{10k}$ represents the average slope in pedigree k for males in Cohort 1 with average BMI. To account for the correlation of individuals within a pedigree, these two intercepts were allowed to vary between pedigrees. The random effects at different levels of the model are assumed independent.
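A hedged sketch of Level 2 equations that would fit this description is given below. The coefficient names $\beta_{01}, \ldots, \beta_{04}$ and $\beta_{11}, \ldots, \beta_{13}$ are hypothetical, and the covariates included in each equation are inferred from the interpretation given for the two intercepts (age and BMI for the mean level, BMI only for the slope).

```latex
% Sketch only: beta names and the covariate sets are assumptions inferred from the text.
\[
\begin{aligned}
b_{0jk} &= b_{00k}
  + \beta_{01}\,(\overline{\mathrm{Age}}_{jk} - \overline{\mathrm{Age}})
  + \beta_{02}\,(\overline{\mathrm{BMI}}_{jk} - \overline{\mathrm{BMI}})
  + \beta_{03}\,\mathrm{Sex}_{jk}
  + \beta_{04}\,\mathrm{Cohort}_{jk}
  + u_{0jk}, \\
b_{1jk} &= b_{10k}
  + \beta_{11}\,(\overline{\mathrm{BMI}}_{jk} - \overline{\mathrm{BMI}})
  + \beta_{12}\,\mathrm{Sex}_{jk}
  + \beta_{13}\,\mathrm{Cohort}_{jk}
  + u_{1jk}.
\end{aligned}
\]
```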
Pedigree random-intercept model – Level 3
$b_{00k} = \beta_{000} + v_{00k}$, k = 1,...,N
Pedigree random-slope model – Level 3
The random components $v_{00k}$ and $v_{01k}$ measure the variation of each pedigree's mean SBP and mean slope from their averages in the whole sample.
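By analogy with the pedigree random-intercept equation, the pedigree random-slope equation would plausibly take the following form. The symbol $\beta_{100}$ for the sample-wide average slope is an assumed name; $v_{01k}$ follows the notation used in the text.

```latex
% Sketch only: beta_{100} is an assumed name for the sample-wide average slope.
\[
b_{10k} = \beta_{100} + v_{01k}, \qquad k = 1, \ldots, N.
\]
```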
Statistical tests in the multilevel model
Analyses were conducted in both the unselected and selected samples, with and without adjustment for BMI. Multilevel models were fitted using SAS PROC MIXED [3]. Parameter estimates were obtained by restricted maximum likelihood (REML) estimation. An F-statistic was used to test the significance of the fixed effects, with the number of degrees of freedom computed using the containment method [4]. The likelihood ratio statistic based on REML likelihoods was used to test the significance of the random effects. The null distribution of this statistic is a mixture of $\chi^2_q$ and $\chi^2_{q+1}$ with equal weights 0.5, where q and q + 1 are the numbers of random effects estimated under H0 and H1, respectively.
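The p-value implied by this equal-weight mixture can be computed directly. The sketch below is illustrative only: the function names `chi2_sf` and `mixture_pvalue` are hypothetical, and the chi-square tail probability is obtained from a closed-form recurrence for integer degrees of freedom rather than from a statistics package. A chi-square with 0 degrees of freedom is treated as a point mass at zero.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df >= 0.

    Uses Q(k + 2; x) = Q(k; x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1),
    starting from Q(0; x) = 0 (point mass at zero) or Q(1; x) = erfc(sqrt(x/2)).
    """
    if df < 0:
        raise ValueError("df must be a non-negative integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, q = 0, 0.0
    else:
        k, q = 1, math.erfc(math.sqrt(half))
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def mixture_pvalue(lrt, q):
    """P-value of a likelihood ratio statistic under 0.5*chi2_q + 0.5*chi2_{q+1}."""
    return 0.5 * chi2_sf(lrt, q) + 0.5 * chi2_sf(lrt, q + 1)


if __name__ == "__main__":
    # Testing one added variance component when none is present under H0 (q = 0).
    print(mixture_pvalue(3.84, 0))
```

With q = 0, the mixture p-value is half the usual one-degree-of-freedom p-value, so referring the statistic to $\chi^2_1$ alone would be conservative.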
Genome-wide linkage analysis
We used the estimates of the random effects at the subject and pedigree levels to define two new outcomes that were used in the genome-wide linkage analysis. The two outcomes were defined as $\hat{u}_{0jk} + \hat{v}_{00k}$ and $\hat{u}_{1jk} + \hat{v}_{01k}$, which measure the random variation of each individual's SBP mean and slope, respectively, from the sample average after adjustment for the fixed effects. A third outcome was also defined using the residuals from a sample-wide regression in which each individual's mean SBP (across all exams) was regressed on that individual's mean age (centered), mean BMI (centered), gender, and cohort, as in Levy et al. [2]. Estimation of heritability and two-point linkage analyses were performed on the pedigree data using the variance component models implemented in the SOLAR package [5].
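As a rough illustration of how the first two outcomes could be assembled from the fitted model, the sketch below adds each subject's estimated random effects to those of the subject's pedigree. It assumes the outcomes are simple sums of the subject-level and pedigree-level estimates; the function name `linkage_outcomes` and the dictionary layout are hypothetical.

```python
# Hypothetical sketch: combine subject-level and pedigree-level random-effect estimates.
def linkage_outcomes(subject_effects, pedigree_effects):
    """Return {subject_id: (mean_outcome, slope_outcome)}.

    subject_effects:  {subject_id: (pedigree_id, u0_hat, u1_hat)}
    pedigree_effects: {pedigree_id: (v00_hat, v01_hat)}
    """
    outcomes = {}
    for subject_id, (pedigree_id, u0_hat, u1_hat) in subject_effects.items():
        v00_hat, v01_hat = pedigree_effects[pedigree_id]
        # Deviation from the sample average = pedigree deviation + subject deviation.
        outcomes[subject_id] = (u0_hat + v00_hat, u1_hat + v01_hat)
    return outcomes


if __name__ == "__main__":
    subjects = {"s1": ("p1", 1.5, 0.25), "s2": ("p1", -2.0, 0.5)}
    pedigrees = {"p1": (0.5, -0.25)}
    print(linkage_outcomes(subjects, pedigrees))
```

The resulting pairs would then serve as quantitative traits in the variance component analysis.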