The RCR model is a two-stage model which admits both individual-level and population-level effects. It is known to be robust against data that are not missing completely at random [1].
Let $y$ be a trait measured at $n$ time points in $m$ individuals, yielding $m$ time series of $n$ measurements each: $y_{ij}$ ($i = 1, \ldots, n$; $j = 1, \ldots, m$). Note that some $y_{ij}$ may be missing. Let us denote by $y_j$ the time series of observations for the $j$th participant. Population level effects, i.e., covariates which are assumed to affect the trait $y$ in the same manner for all subjects, are modeled by $C$, an $n \times p$ matrix of regressors. The $j$th participant's values for these $p$ covariates are given by a $p \times 1$ data vector, $\xi_j$ ($j = 1, \ldots, m$).
Individual-level effects, that is, the effects of covariates which may impact each subject differently, are modeled by a family of $q \times 1$ data vectors, $\zeta_j$ ($j = 1, \ldots, m$), consisting of the subjects' observed values for the $q$ individual-level covariates, and $n \times q$ matrices of regressors, $B_j$. Thus, each individual has its own matrix of regressors, $B_j$. An example of an individual-level effect would be a linear dependence of a trait on age with slope and intercept values that vary from individual to individual. The conditional expected value of the time series is then a sum of the population-level and individual-level effects:
$$E(y_j \mid \xi_j, \zeta_j) = C\,\xi_j + B_j\,\zeta_j.$$
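As a minimal sketch of how the conditional mean above is assembled for one individual, the following NumPy snippet evaluates the sum of the population-level and individual-level terms. The function name `conditional_mean`, the dimensions, and all numeric values are hypothetical and chosen only for illustration; in particular, letting $B_j$ carry an intercept column and elapsed years while $\zeta_j$ carries an intercept and slope is an assumption about how the linear age example might be encoded.

```python
import numpy as np

def conditional_mean(C, xi_j, B_j, zeta_j):
    """Return C @ xi_j + B_j @ zeta_j for a single individual j."""
    C = np.asarray(C, dtype=float)            # n x p, shared by all individuals
    B_j = np.asarray(B_j, dtype=float)        # n x q, specific to individual j
    xi_j = np.asarray(xi_j, dtype=float)      # length p
    zeta_j = np.asarray(zeta_j, dtype=float)  # length q
    if C.shape[0] != B_j.shape[0]:
        raise ValueError("C and B_j must have the same number of rows (time points)")
    return C @ xi_j + B_j @ zeta_j

# Hypothetical example with n = 3 time points, p = 2, q = 2.
C = [[2.0, 1.0], [2.0, 1.0], [2.0, 1.0]]    # illustrative population-level matrix
xi_j = [1.0, 3.0]                           # illustrative population-level vector
B_j = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]  # assumed: ones and years since first exam
zeta_j = [80.0, 1.5]                        # assumed: intercept and slope for individual j

mean_j = conditional_mean(C, xi_j, B_j, zeta_j)
print(mean_j)
```

Missing observations are not handled here; one simple option would be to drop the corresponding rows of `C` and `B_j` before the call.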
We make a homoscedasticity assumption: the conditional variance of the time series, $V = \mathrm{Var}(y_j \mid \xi_j, \zeta_j)$, depends on the individual only through the number of, and time between, observations. We further assume that observations, while correlated within individuals, are independent between individuals, and that the conditional distribution of the time series is multivariate normal.
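Taken together, these assumptions imply that the conditional likelihood factorizes over individuals. As a sketch, write $\mu_j$ for the conditional mean, $V_j$ for the conditional variance of individual $j$, and $n_j \le n$ for the number of non-missing observations (this notation is introduced here for illustration only):

$$y_j \mid \xi_j, \zeta_j \sim N(\mu_j, V_j), \qquad \mu_j = C\,\xi_j + B_j\,\zeta_j,$$

$$\log L = \sum_{j=1}^{m} \left[ -\frac{n_j}{2}\log(2\pi) - \frac{1}{2}\log\lvert V_j\rvert - \frac{1}{2}(y_j - \mu_j)^{\top} V_j^{-1} (y_j - \mu_j) \right],$$

where $y_j$, $\mu_j$, and $V_j$ are restricted to the time points actually observed for individual $j$. This is the conditional form only; it is not necessarily the exact objective maximized by the fitting software.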
For the GAW13 simulated data, many possible RCR models were tested on the time series of observations of the serum glucose phenotype. The RCR modeling framework assumes that the individual level parameters, i.e., those coefficients of the individual level regressor matrices $B_j$ which are variable, are drawn from a joint multivariate normal distribution. Individual level covariance structures tested included linear, quadratic, and exponential models of age dependence. We shall hereafter refer to the estimates of the regression coefficient(s) in the individual level effects that correspond to age dependence as the growth or slope parameter(s).
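To illustrate what it means for individual-level coefficients to be drawn from a joint multivariate normal distribution, the sketch below simulates intercepts and slopes for a handful of individuals under a linear model of age dependence. The mean vector, covariance matrix, number of individuals, and exam spacing are all made-up values, not estimates from the GAW13 data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

m = 5                              # hypothetical number of individuals
years = np.array([0.0, 2.0, 4.0])  # hypothetical exam times (years since first exam)

# Assumed population distribution of (intercept, slope); values are illustrative.
mean_coef = np.array([90.0, 0.8])
cov_coef = np.array([[25.0, 0.5],
                     [0.5, 0.04]])

# One (intercept, slope) pair per individual, drawn jointly.
coefs = rng.multivariate_normal(mean_coef, cov_coef, size=m)  # shape (m, 2)

# Linear age dependence: design matrix with a column of ones and a time column.
design = np.column_stack([np.ones_like(years), years])        # shape (n, 2)

# Noise-free expected trajectory for each individual.
trajectories = coefs @ design.T                                # shape (m, n)
print(trajectories.round(2))
```

The second column of `coefs` plays the role of the growth or slope parameter in this linear case; a quadratic or exponential model would change the columns of `design` accordingly.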
Population-level covariance structures tested included those with and without body mass index (BMI) and sex effects. Although these models differ in the number of covariates and in the structure assigned to the individual-level and population-level effects, they all take the same basic form within the RCR framework. Models were selected on the basis of fit, as quantified by the Akaike Information Criterion (AIC). Modeling and fitting were performed in SAS using PROC MIXED.
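AIC-based selection can be sketched as follows, using the standard definition AIC = 2k - 2 log L, where k is the number of estimated parameters and smaller values indicate a better trade-off between fit and complexity. The candidate names, log-likelihoods, and parameter counts below are invented placeholders, not results from this analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion in the smaller-is-better form."""
    return 2.0 * n_params - 2.0 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "linear age": (-1520.4, 6),
    "quadratic age": (-1515.9, 9),
    "linear age + BMI + sex": (-1498.2, 8),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}

# The preferred model is the one with the smallest AIC.
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("selected:", best)
```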
For comparison, a two time-point subset of the sample was selected by taking the first and third observations in the first cohort (constituting a four-year interval) and the first and second observations in the second cohort (constituting a five-year interval). The slope between these observations was then used as an alternative growth parameter phenotype. In addition, the glucose levels at each of the two time points in this subset were separately analyzed as phenotypes. This cross-sectional approach serves as a contrast with the linkage analysis using the growth parameter(s).
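The two-point slope phenotype amounts to a difference quotient over the interval between the selected exams. A minimal sketch, assuming glucose values are stored as floats with `None` marking a missing measurement (the function name and the example values are hypothetical; the four-year and five-year intervals are those stated above):

```python
def two_point_slope(y_first, y_second, years_between):
    """Change in the trait per year between two observations.

    Returns None when either observation is missing.
    """
    if y_first is None or y_second is None:
        return None
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (y_second - y_first) / years_between

# First cohort: first and third observations, four years apart.
slope_cohort1 = two_point_slope(90.0, 98.0, 4.0)

# Second cohort: first and second observations, five years apart.
slope_cohort2 = two_point_slope(100.0, 110.0, 5.0)

# An individual with a missing second measurement has no two-point slope.
slope_missing = two_point_slope(95.0, None, 4.0)

print(slope_cohort1, slope_cohort2, slope_missing)
```

Unlike the RCR slope parameter, this quantity uses only two observations per individual and is undefined whenever either one is missing.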
All four phenotypes (Time 1 and 2 glucose levels, two-point slope, and the slope parameter(s) from the RCR model) were used as input phenotypes for a linkage analysis in SEGPATH. SEGPATH performs variance-components linkage analysis on complex, extended pedigrees [3, 4].