### Subjects and phenotypes

We analyzed both real and simulated (replicate 59) data sets. Each data set comprised two cohorts: the "original cohort" (Cohort 1), enrolled in 1948, was examined every two years for a total of 21 visits; the "offspring cohort" (Cohort 2), enrolled in 1971, was examined every four years (following an initial 8-year interval) for a total of 5 visits. We analyzed the phenotype systolic blood pressure (SBP), together with the covariates cohort, age at exam, sex, hypertension treatment (HRX), and body mass index (BMI). All subjects with phenotype data on at least three visits were included in this analysis (*N* = 2583 for the simulated data, S, and *N* = 2686 for the real data, R, both cohorts combined). In Cohort 1, the average number of intermittent missing SBP observations was 0.88 (SD = 1.97, ranging from 0 to 17) for the simulated data and 0.68 (SD = 1.70, ranging from 0 to 12) for the real data. For Cohort 2, the means were about 0.20 and 0.06 (both SD = 0.7, ranging from 0 to 2). At the last visit, only 41.3% (R) and 21.5% (S) of Cohort 1 had SBP values, whereas most of Cohort 2 returned (90.9% in both data sets).

Since exams were scheduled at regular intervals and height and weight did not vary much from visit to visit, we imputed intermittent missing values of age, height, and weight using deterministic rules (missing data due to premature truncation were not imputed). For Cohort 1, missing ages were imputed by adding two years to the age at the previous visit; for Cohort 2, four years were added to the age at the previous visit, except for the second visit, when eight years were added to the age at the first visit. If the imputed value was greater than or equal to the age at the next visit, three years were subtracted from the age at the following visit. Missing heights and weights were replaced by the most recent value. A few subjects had no height information at all (three in the real data and one in the simulated data), so we estimated their heights by single imputation, based on a regression of height on weight among subjects with available data.
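The deterministic rules for age, height, and weight can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `impute_ages` and `carry_forward` are hypothetical, each subject is represented as a list indexed by visit with `None` for a missing value, and the consistency check against the age at the next visit is left out because the text does not pin down exactly how it was applied.

```python
from typing import List, Optional


def impute_ages(ages: List[Optional[float]], cohort: int) -> List[Optional[float]]:
    """Fill intermittent gaps in one subject's ages (index 0 = first visit).

    Hypothetical helper: values after the last recorded age are treated as
    premature truncation and are left missing.
    """
    out = list(ages)
    seen = [i for i, a in enumerate(ages) if a is not None]
    if not seen:
        return out
    first, last = seen[0], seen[-1]
    for i in range(first + 1, last):
        if out[i] is None:
            if cohort == 1:
                step = 2                   # Cohort 1: exams every two years
            else:
                step = 8 if i == 1 else 4  # Cohort 2: 8 years before visit 2, then 4
            out[i] = out[i - 1] + step
    return out


def carry_forward(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace intermittent missing heights or weights by the most recent value."""
    out = list(values)
    seen = [i for i, v in enumerate(values) if v is not None]
    if not seen:
        return out
    for i in range(seen[0] + 1, seen[-1]):
        if out[i] is None:
            out[i] = out[i - 1]
    return out
```

For example, `impute_ages([20, None, 32, None, 40], cohort=2)` fills the second visit with 28 and the fourth with 36, while trailing missing values are never filled.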

### Multiple imputation

We assumed that SBP values were missing at random (MAR) and multivariate normal. Multiple imputation inference assumes that the model used to analyze the imputed data (the analysis model) is the same as the model used to impute missing values (the imputation model) [7]. However, this assumption is not critical as long as the variables that appear in one model but not in the other are unrelated to the dependent variable; moreover, variables that are not needed in the analysis model can still be used to improve the imputation [5]. In our analyses, age, BMI, HRX, and SBP were included in the imputation model, but only sex, BMI, and SBP were included in the analysis model. Hypertension treatment was addressed separately before the analysis, as described below.

Two imputation methods, the propensity score method [8, 9] and the regression method [3, 10], were compared to address the problem of potentially informative missingness. We also compared both imputation methods with case-wise deletion for the real data and with complete-data analysis for the simulated data. The propensity score and regression methods require a monotone missing data pattern. Since these data had an arbitrary missingness pattern (for example, HRX was missing when SBP was observed and/or SBP at visit 4 was missing when SBP at visit 5 was observed), we applied these methods in a "time-wise" two-stage manner to data sets with intermittent missing values, handling missing HRX and SBP visit by visit in chronological order. At each visit *t*, HRX was imputed first, using age and BMI at visit *t* and SBP from all previous visits $t^{-}$, since HRX at visit *t* depends only on HRX and SBP at the previous visits $t^{-}$. SBP at visit *t* was then imputed, conditional on the recorded or imputed HRX at visit *t*. Only the intermittent missing data (for subjects who returned for a subsequent visit) were imputed; missing data due to premature truncation were not imputed.

The propensity score method is a semiparametric approach, based on the following steps. First, for each variable with missing values, a logistic model is fitted for the probability of missingness (the "propensity score") as a function of all previous variables in the data set. The observations are then grouped based on these propensity scores, and an approximate Bayesian bootstrap imputation is applied within each group. (Within a group, a sample is first drawn with replacement from the nonmissing observations; the missing values are then filled in by sampling with replacement from this resampled set.)
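The grouping and approximate Bayesian bootstrap steps can be illustrated with a short Python sketch. It assumes the propensity scores have already been estimated by the logistic model and are passed in as `scores`; the function names `abb_draw` and `impute_by_propensity` and the choice of equal-sized groups formed by ranking the scores are assumptions made for illustration, not details given in the text.

```python
import random
from typing import List, Optional, Sequence


def abb_draw(observed: Sequence[float], n_missing: int, rng: random.Random) -> List[float]:
    """Approximate Bayesian bootstrap for one group of observations."""
    # Step 1: resample the observed values with replacement.
    donors = rng.choices(observed, k=len(observed))
    # Step 2: draw the imputed values with replacement from that resample.
    return rng.choices(donors, k=n_missing)


def impute_by_propensity(
    values: Sequence[Optional[float]],
    scores: Sequence[float],
    n_groups: int,
    rng: random.Random,
) -> List[Optional[float]]:
    """Impute missing entries (None) within groups of similar propensity score."""
    order = sorted(range(len(values)), key=lambda i: scores[i])
    out = list(values)
    n = len(order)
    for g in range(n_groups):
        # Assumed grouping rule: near-equal-sized groups by score rank.
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        observed = [values[i] for i in members if values[i] is not None]
        missing = [i for i in members if values[i] is None]
        if not observed or not missing:
            continue  # nothing to impute, or no donors in this group
        for i, v in zip(missing, abb_draw(observed, len(missing), rng)):
            out[i] = v
    return out
```

Calling the function repeatedly with different random states gives the multiple imputed data sets; each imputed value is always one of the observed values from the same propensity group.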

The regression method is a parametric approach, in which a regression model is fitted for each variable with missing values, using the previous observations as covariates. Based on the fitted regression coefficients, a new regression model is simulated from the posterior predictive distribution of the parameters and is used to impute the missing values for each variable. The process is repeated sequentially for the variables with missing values. The regression method yielded a continuous value for imputed HRX, which was converted to a binary variable as follows: imputed values less than or equal to 0 were set to 0, and imputed values greater than or equal to 1 were set to 1. If the imputed value lay strictly between 0 and 1, the subject was assigned to treatment with the corresponding probability. For subjects who were either known or imputed to have received hypertension treatment at a given observation, SBP was further adjusted with Levy's algorithm to estimate their untreated SBP [11].
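A minimal Python sketch of these two steps follows, assuming a single fully observed predictor rather than the full set of previous observations. The parameter draw follows the usual normal-theory recipe (residual variance drawn as the residual sum of squares divided by a chi-square variate, then coefficients drawn around their estimates); whether the software used here follows exactly this recipe is an assumption. The names `regression_impute` and `to_binary_treatment` are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence


def regression_impute(
    x: Sequence[float], y: Sequence[Optional[float]], rng: random.Random
) -> List[Optional[float]]:
    """Impute missing y (None) from one predictor x; needs >= 3 observed pairs."""
    obs = [i for i, v in enumerate(y) if v is not None]
    n = len(obs)
    x_bar = sum(x[i] for i in obs) / n
    y_bar = sum(y[i] for i in obs) / n
    sxx = sum((x[i] - x_bar) ** 2 for i in obs)
    slope = sum((x[i] - x_bar) * (y[i] - y_bar) for i in obs) / sxx
    sse = sum((y[i] - y_bar - slope * (x[i] - x_bar)) ** 2 for i in obs)
    # Draw the residual SD: sigma*^2 = SSE / chi-square(n - 2).
    chi2 = rng.gammavariate((n - 2) / 2.0, 2.0)
    sigma = math.sqrt(sse / chi2)
    # With x centred, intercept and slope estimates are independent,
    # so each can be perturbed separately.
    intercept_star = y_bar + sigma * rng.gauss(0.0, 1.0) / math.sqrt(n)
    slope_star = slope + sigma * rng.gauss(0.0, 1.0) / math.sqrt(sxx)
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            noise = sigma * rng.gauss(0.0, 1.0)
            out[i] = intercept_star + slope_star * (x[i] - x_bar) + noise
    return out


def to_binary_treatment(value: float, rng: random.Random) -> int:
    """Convert a continuous imputed HRX value to 0/1 as described in the text."""
    if value <= 0:
        return 0
    if value >= 1:
        return 1
    # Strictly between 0 and 1: treated with probability equal to the value.
    return 1 if rng.random() < value else 0
```

Because the parameters are redrawn on every call, repeated calls give imputations that reflect uncertainty in the fitted regression as well as residual noise.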

For comparison purposes, we also used the imputed and adjusted SBP values to form age-interval-specific residuals as in the GAW13 contribution from Kraft et al. [12]. We first averaged each subject's imputed SBP and BMI measurements over the age interval 35-50. Then the average imputed SBP was regressed on gender and average BMI. We used the residuals from this regression in a variance-components analysis with a fixed mean effect, a random additive polygenic effect, and an independent random error. Thus there were three parameters to estimate: the mean $\mu$, the polygenic variance $\sigma_{a}^{2}$, and the random error variance $\sigma_{e}^{2}$.
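Written out, a variance-components model of this kind usually takes the following form. This is a sketch in conventional notation, not an equation from the original text: $y_i$ (the residual for subject $i$), $g_i$ (the polygenic effect), $e_i$ (the random error), $\phi_{ij}$ (the kinship coefficient between subjects $i$ and $j$), and $\delta_{ij}$ (1 if $i = j$, 0 otherwise) are symbols introduced here for illustration.

$$
y_i = \mu + g_i + e_i, \qquad \operatorname{Cov}(y_i, y_j) = 2\phi_{ij}\,\sigma_{a}^{2} + \delta_{ij}\,\sigma_{e}^{2}
$$

Under this form, relatives share phenotypic covariance only through the additive polygenic term, and unrelated subjects are uncorrelated.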

Ten imputed data sets were generated for each of the three methods for the real and simulated missing data, and each was analyzed using the variance components method. The results from the multiple analyses were then combined into a final summary. The parameter estimates are the simple averages of the estimates over all imputations. The within-imputation variance $\bar{U}$ (the mean of the sampling variance estimates from each imputation) and the between-imputation variance *B* (the sample variance of the estimates across imputed data sets) were calculated, and the total variance is given by

$$T = \bar{U} + \left(1 + \frac{1}{m}\right) B,$$

where *m* is the number of imputations. The relative increase in variance due to nonresponse [3] is calculated by

$$r = \frac{\left(1 + 1/m\right) B}{\bar{U}}.$$
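The combining step can be sketched in Python using the standard multiple-imputation combining rules: the total variance is the within-imputation variance plus (1 + 1/*m*) times the between-imputation variance, and the relative increase in variance is the ratio of that second term to the within-imputation variance. The function name `combine_imputations` and the dictionary keys are hypothetical, chosen for illustration.

```python
from statistics import mean, variance
from typing import Dict, Sequence


def combine_imputations(
    estimates: Sequence[float], variances: Sequence[float]
) -> Dict[str, float]:
    """Combine one parameter's estimates and sampling variances over m imputations."""
    m = len(estimates)
    estimate = mean(estimates)    # simple average over imputations
    within = mean(variances)      # mean of the sampling variance estimates
    between = variance(estimates) # sample variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    rel_increase = (1 + 1 / m) * between / within
    return {
        "estimate": estimate,
        "within": within,
        "between": between,
        "total": total,
        "relative_increase": rel_increase,
    }
```

In this study the function would be applied once per parameter, with `estimates` and `variances` each holding the ten values obtained from the ten imputed data sets.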