Data
We analyzed the repeated measurements of systolic blood pressure (SBP) in the first replicate of the simulated GAW13 data with missing values, including all 4692 individuals in all 330 pedigrees. Analyses were done blind to the answers.
Each missing value of the covariates 'drink' (number of drinks per week) and 'CPD' (number of cigarettes per day) was replaced with the mean of the covariate for that individual in the two observations surrounding the missing observation, enabling us to increase the number of useful observations without excluding important covariates from the model. The phenotypic component of the model included 25,055 observations for 2851 individuals. These observations all had non-missing values for SBP, height, drink, and CPD.
Model
The models fitted in this analysis are extensions of our previously described variance components models for extended pedigrees with normal, binary, or censored survival phenotypes [1, 2]. The original models are GLMMs with components of variance due to additive polygenic effects (σ2A), common family environment (σ2C), common sibling environment (σ2Cs), and residual error (σ2E). The complex familial correlation structure is modelled through the use of shared random effects for each variance component. The model for the jth individual in the ith pedigree can be represented as:
μij = β0 + βTXij + Aij + Cij + Csij
yij ~ N(μij, σ2E),
where yij is the observed phenotype, Xij is a matrix of covariates,
A
i
~ N(0, VA), C
i
~ N(0, VC), Cs
i
~ N(0, VCs),
and the random effects for each family are sampled from appropriate multivariate normal distributions with variance-covariance matrices (VA, VC, and VCs) parameterized by σ2A, σ2C, and σ2Cs, respectively, and structured as described previously [1]. Individuals without phenotypic data were used in generating the random effects but not in the main model.
The extensions not only allow for repeated measures on individuals, but use the extra information available to assess evidence for genetic effects on the rate of change of the outcome over time. At the kth observation for the jth individual in the ith pedigree, the model can be expressed as
μijk = β0 + βTXijk + Aij + Cij + Csij + ageijk × (βage + A.timeij + C.timeij + Cs.timeij + E.timeij)
yijk ~ N(μijk, σ2E),
where yijk is the observed phenotype, Xijk is a matrix of covariates, βage is the mean gradient of age (for all individuals), A
i
~ N(0, VA), C
i
~ N(0, VC), Cs
i
~ N(0, VCs), A.time
i
~ N(0, VA.time), C.time
i
~ N(0, VC.time), Cs.time
i
~ N(0, VCs.time), and E.timeij ~ N(0, σ2E.time).
In this model time was measured using the covariate 'age'. The random effects for the gradient of age, A.timeij, C.timeij, Cs.timeij, and E.timeij, have a covariance structure equivalent to that described previously [1, 2], with each variance component replaced by its counterpart for the gradient, so (for example) in this part of the model σ2A is represented by σ2A.time. However, the variance components for the intercept (σ2A, σ2C, σ2Cs, and σ2E) are independent of the variance components for the gradient (σ2A.time, σ2C.time, σ2Cs.time, and σ2E.time), and the estimated random effects for each variance component for the intercept are different from the estimated random effects for each variance component for the gradient. All random effects are constant across observations on the same individual; they do not change over time. Just as the size of σ2A can indicate whether additive genetic effects influence the outcome, the size of σ2A.time can indicate whether additive genetic effects influence the rate of change of the outcome with time.
All covariates were centered before inclusion in the model. The main model included the covariates male sex, cohort, height, drink, CPD, and some models also included BMI ((weight in kg)/(height in m)2) as a continuous variable. To allow for a nonlinear increase in SBP with age, quadratic and cubic age terms were also included.
The models were fitted using Gibbs sampling in WinBUGS v1.3 [3]. Vague normal N(0, 1000) priors were specified for fixed regression coefficients. Vague Pareto (1,0.001) priors were specified on the precisions (inverses of the variances) of all random effects; these were equivalent to specifying uniform priors on the corresponding variances. The priors for the intercept variance components (and associated random effects) were independent of the priors for the gradient variance components (and associated random effects). Models were run for 20,000 iterations after a burn-in of 5000 iterations. Although the model is Bayesian, the effect of each covariate was also assessed using a pseudo-likelihood approach.
Blood pressure adjustment
The problem of how to model continuous SBP when some individuals are on blood pressure treatment is complex. As Cui et al. state, "the usual regression techniques used to adjust for the effects of antihypertensive treatment are inappropriate because they result in treated levels having an average of zero residuals rather than the extreme residuals they deserve, given their pretreatment pressures" [4]. Since complex modelling of SBP was not the main aim of this particular analysis, we chose a simple method of adjustment, which involved adding a constant (10 mm Hg) to each phenotype in which the individual was on treatment, to reflect the 'true' SBP that might have been observed if the individual had not been on treatment [4].
Linkage analysis
As described previously [1, 5] the random effects due to σ2A (Aij, the sigma-squared A residuals or SSARs) can be used as a continuous phenotype in a linkage analysis. The SSARs for the gradient (A.timeij, the sigma-squared A time residuals, or SSATRs) may also be used in such a way. The estimated SSARs and SSATRs were used as continuous phenotypes in a variance components linkage analysis that was performed using MERLIN [6]. All 399 markers were included.