We used the GAW13 simulated complete data set for Replicate 1, consisting of 2860 genotyped individuals in 330 families. The data set was examined using three genetic analysis programs: SIBLINK, Ordered Subset Analysis (OSA), and Sequential Oligogenic Linkage Analysis Routines (SOLAR). We chose to analyze chromosomes 5, 7, 13, 15, and 21 because they contained genes that determine blood pressure, acting either at baseline or on change over time. We also analyzed four null chromosomes (2, 4, 6, and 10) to gauge the number of type I errors produced by OSA based on the mean covariate values in affected/unaffected individuals. The false-positive rate was consistent with the nominal significance level (data not shown).
Subject-specific slopes reflecting individual change in SBP over time were computed with SAS PROC MIXED using three different models: unadjusted slope, adjusted slope, and "true" slope. The unadjusted slopes were obtained by fitting a line to the observed longitudinal BP values. The adjusted slopes were obtained from a model that included the individual environmental variables we considered important in the absence of knowledge about the true simulation model. Fixed effects were fit for age, gender, smoking, drinking, and cohort, and both fixed and random effects were fit for hypertension treatment. We also computed slope values that were as close as possible to the "true" values by using the simulating model for SBP provided in the GAW13 answers, including the correct transformations of the variables involved. To account for the different numbers of time points contributing to each subject's slope estimate, we calculated normalized slope values (slope divided by the standard deviation of the estimate).
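The normalization step can be illustrated with a minimal Python sketch. It is not the SAS PROC MIXED analysis described above: as a simplifying assumption, it fits an ordinary least-squares line to one subject's measurements and divides the slope by its standard error. The function name `normalized_slope` and the example values are hypothetical.

```python
import math


def normalized_slope(times, values):
    """Fit values = a + b * times by least squares for one subject.

    Returns (slope, standard error of the slope, slope / standard error).
    Assumption: a per-subject OLS fit stands in for the mixed model.
    """
    n = len(times)
    if n != len(values):
        raise ValueError("times and values must have the same length")
    if n < 3:
        raise ValueError("need at least three time points for a standard error")

    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))

    slope = sxy / sxx
    intercept = v_mean - slope * t_mean

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        raise ValueError("perfect fit: the standard error is zero")

    return slope, se, slope / se


# Hypothetical subject with four exams (time in arbitrary units, SBP in mmHg).
slope, se, z = normalized_slope([0, 1, 2, 3], [120, 123, 124, 129])
```

Subjects with few time points have a larger standard error, so their normalized slopes shrink toward zero relative to subjects with many exams.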
Variance components analysis (SOLAR) was used to perform multipoint linkage analysis with the normalized individual slope values for SBP (unadjusted, adjusted, and true) as the quantitative trait values. All families were included in this analysis. A common constant was added to make all estimated slope values positive; the data were then log-transformed and outliers were eliminated to approximate a normal trait distribution. Heritability was estimated from the best-fitting polygenic model, starting with a model that included age (at onset of HBP for affected individuals, at exam for unaffected individuals) and average values (over all time points) of cholesterol, glucose, HDL, triglycerides, height, weight, and body mass index (BMI). Once this model had been obtained, multipoint variance components linkage analysis was carried out in SOLAR.
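The shift, log-transform, and outlier-removal sequence can be sketched as follows. The text specifies neither the constant nor the outlier criterion, so both are assumptions here: the constant is chosen so that the smallest shifted value equals a hypothetical `margin`, and outliers are defined as values more than a hypothetical `sd_cutoff` standard deviations from the mean on the log scale.

```python
import math
import statistics


def shift_log_trim(values, margin=1.0, sd_cutoff=3.0):
    """Shift values to be positive, log-transform, and drop outliers.

    Assumptions (not taken from the paper): the common constant makes the
    minimum equal to `margin`, and outliers lie more than `sd_cutoff`
    sample standard deviations from the mean of the logged values.
    Returns (shift constant, list of retained log-transformed values).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    if margin <= 0:
        raise ValueError("margin must be positive so that the log is defined")

    shift = margin - min(values)  # the same constant is added to every value
    logged = [math.log(v + shift) for v in values]

    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    kept = [x for x in logged if abs(x - mean) <= sd_cutoff * sd]
    return shift, kept
```

Because the same constant is added to every individual, the ordering of the slope values is preserved; only the shape of the distribution changes.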
SIBLINK v. 3.0 was used to perform multipoint affected sibling pair (ASP) linkage analysis, using HBP as indicated in the data set as a dichotomous disease trait. There were 171 families with at least one ASP included in our SIBLINK analysis, for a total of 575 ASPs with hypertension. SIBLINK was used to calculate family-specific multipoint LOD scores across each chromosome, based on estimated identity by descent (IBD) status among ASPs. The OSA program then used the multipoint LOD scores and covariate values from each family to identify homogeneous subsets of families that show increased evidence for linkage. In theory, examining the data from a homogeneous subset of linked families yields a more accurate estimate of disease gene location because the location estimate is not influenced by unlinked families. The OSA program takes as input any multipoint set of additive, family-specific linkage scores, such as LOD scores from SIBLINK or nonparametric LOD scores from GENEHUNTER Plus (Kong and Cox). The OSA program ranked families by mean values of the covariates: the calculated slope values for SBP and the additional covariates age (age at onset of HBP for affected individuals, age at exam for unaffected individuals), cholesterol, glucose, HDL, triglycerides, height, weight, and BMI. Family-specific means were calculated (i) for all affected individuals and (ii) for all family members regardless of affection status (individual values were averages over all time points). Using one covariate at a time, family-specific multipoint LOD scores were added in the covariate-based rank order, and the maximum LOD score over all ordered subsets of families was determined for that covariate. The significance of the increase in the subset-based LOD score over the global LOD score from all families was assessed with an empirical p-value based on randomly permuting the order in which families were added.
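The ordered-subset procedure described above can be sketched in Python. This is an illustration of the ranking, cumulative-sum, and permutation logic only, not the OSA program itself. The function names, the input layout (one covariate mean and one list of per-position LOD scores per family), and the (hits + 1) / (permutations + 1) form of the empirical p-value are assumptions. For simplicity, the permutation test compares maximum subset LOD scores directly.

```python
import random


def max_subset_lod(lod_profiles):
    """Add family LOD profiles in the given order and track the best subset.

    lod_profiles: list of per-family lists, one LOD score per map position.
    Returns (maximum LOD, number of families in that subset, position index).
    """
    running = [0.0] * len(lod_profiles[0])
    best_lod, best_size, best_pos = float("-inf"), 0, 0
    for size, profile in enumerate(lod_profiles, start=1):
        # LOD scores are additive across families at each position.
        running = [r + x for r, x in zip(running, profile)]
        peak = max(running)
        if peak > best_lod:
            best_lod, best_size, best_pos = peak, size, running.index(peak)
    return best_lod, best_size, best_pos


def ordered_subset_analysis(families, n_perm=1000, descending=False, seed=0):
    """families: list of (covariate_mean, lod_profile) pairs, one per family.

    Ranks families by covariate, finds the maximum ordered-subset LOD, and
    estimates an empirical p-value by shuffling the family order.
    """
    ranked = sorted(families, key=lambda fam: fam[0], reverse=descending)
    profiles = [profile for _, profile in ranked]
    observed = max_subset_lod(profiles)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = profiles[:]
        rng.shuffle(shuffled)
        if max_subset_lod(shuffled)[0] >= observed[0]:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)  # assumed estimator, avoids p = 0
    return observed, p_value


# Hypothetical toy input: four families, two map positions.
families = [
    (1.0, [0.5, 1.0]),
    (2.0, [0.2, 0.8]),
    (3.0, [-0.9, -1.5]),
    (4.0, [-0.1, -0.6]),
]
(best_lod, n_families, position), p_value = ordered_subset_analysis(families, n_perm=200)
```

In this toy input the two families with the lowest covariate means carry all the linkage evidence, so the maximum is reached after two families have been added and falls once the remaining families are included.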