- Research article
- Open Access
- Published:

# Bayesian estimation of genetic parameters for multivariate threshold and continuous phenotypes and molecular genetic data in simulated horse populations using Gibbs sampling

*BMC Genetics*
**volumeÂ 8**, ArticleÂ number:Â 19 (2007)

## Abstract

### Background

Requirements for successful implementation of multivariate animal threshold models including phenotypic and genotypic information are not known yet. Here simulated horse data were used to investigate the properties of multivariate estimators of genetic parameters for categorical, continuous and molecular genetic data in the context of important radiological health traits using mixed linear-threshold animal models via Gibbs sampling. The simulated pedigree comprised 7 generations and 40000 animals per generation. Additive genetic values, residuals and fixed effects for one continuous trait and liabilities of four binary traits were simulated, resembling situations encountered in the Warmblood horse. Quantitative trait locus (QTL) effects and genetic marker information were simulated for one of the liabilities. Different scenarios with respect to recombination rate between genetic markers and QTL and polymorphism information content of genetic markers were studied. For each scenario ten replicates were sampled from the simulated population, and within each replicate six different datasets differing in number and distribution of animals with trait records and availability of genetic marker information were generated. (Co)Variance components were estimated using a Bayesian mixed linear-threshold animal model via Gibbs sampling. Residual variances were fixed to zero and a proper prior was used for the genetic covariance matrix.

### Results

Effective sample sizes (ESS) and biases of genetic parameters differed significantly between datasets. Bias of heritability estimates was -6% to +6% for the continuous trait, -6% to +10% for the binary traits of moderate heritability, and -21% to +25% for the binary traits of low heritability. Additive genetic correlations were mostly underestimated between the continuous trait and binary traits of low heritability, under- or overestimated between the continuous trait and binary traits of moderate heritability, and overestimated between two binary traits. Use of trait information on two subsequent generations of animals increased ESS and reduced bias of parameter estimates more than mere increase of the number of informative animals from one generation. Consideration of genotype information as a fixed effect in the model resulted in overestimation of polygenic heritability of the QTL trait, but increased accuracy of estimated additive genetic correlations of the QTL trait.

### Conclusion

Combined use of phenotype and genotype information on parents and offspring will help to identify agonistic and antagonistic genetic correlations between traits of interests, facilitating design of effective multiple trait selection schemes.

## Background

Use of linear models for the estimation of genetic parameters for categorical traits violates basic assumptions of mixed linear model methodology. Algorithms have been developed to transform linear model estimates to the underlying liability scale in order to compensate for the estimation bias caused by analysis of non-linear traits in linear models, but transformed genetic parameter estimates might still be significantly biased [e.g., [1â€“4]]. Use of threshold models for estimation of genetic parameters directly accounts for the non-linear nature of categorical traits, and threshold model estimates should be more reliable than linear model estimates or transformed linear model estimates [e.g., [5â€“7]].

Markov chain Monte Carlo (MCMC) methods such as Gibbs sampling (GS) make it feasible to implement multivariate threshold models or multivariate mixed linear-threshold models. Animal models fully use the available pedigree information, but accuracy of genetic variance and covariance estimates and convergence of GS chains might be a problem in the case of low trait prevalences and few observations per individual [e.g., [5, 8â€“10]]. Therefore, in practical situations implementation of multivariate animal threshold models is not always straightforward. The inclusion of continuous traits, i.e. use of a multivariate linear-threshold model, is expected to increase the reliability of genetic parameter estimates [e.g., [2, 11â€“13]].

In animal breeding, health data are often recorded using discrete categories, whilst most performance traits are continuous. In the horse, binary coding has been used for investigations on radiographic findings in the equine limbs, and high prevalences of radiologically visible alterations, mostly in the range of 10â€“25%, have been determined in the limbs of young horses [e.g. [14â€“17]]. This promoted the search for preventive rather than curative measures. Because strength and soundness of the equine locomotory system is of great importance in all sectors of the horse industry, inclusion of radiological health traits in current breeding schemes of the Warmblood riding horse has been suggested [18, 19]. Reliably estimated genetic parameters for these categorical traits provide the basis to do so. For the analyzed radiographic health data, applicability of transformation factors according to Dempster and Lerner [20] and Vinson et al. [21] to linear animal model estimates obtained with residual maximum likelihood (REML) has been proven via simulation [22]. However, rate of over- or underestimation, which is caused by analysis of binary traits in linear models and has to be compensated via transformation, depends on data structure. Re-evaluation of transformation procedure will therefore be required for analyses of data of different structure with respect to distribution and kind of available information. Implementation of a Bayesian multivariate animal threshold or mixed linear-threshold model for (co)variance component estimation may provide a worthwhile alternative.

Quantitative trait loci (QTL), i.e. genome regions which include genes that influence the phenotype of an individual with respect to a particular trait, have been identified for production and health traits in different species [23]. Increasing knowledge on genetic determination of radiological health traits [24, 25] implies further opportunities for improvement of genetic evaluation and selection schemes in the horse. Conditions for efficient use of marker-assisted selection have been described [26], but the effects of combined use of phenotypic and genotypic data on the accuracy of genetic parameter estimation for categorical traits has not been studied yet. Requirements for successful implementation of multivariate animal threshold models including phenotypic and genotypic information are unknown.

The aim of this study was to characterize the properties of multivariate estimation of genetic parameters for categorical, continuous and molecular genetic data using linear-threshold animal models and Gibbs sampling. Impact of data structure and quality of molecular genetic data on the accuracy of genetic parameter estimates was studied in the context of important radiological health traits in Warmblood horses.

## Results

### Gibbs chains

The number of rounds of Gibbs sampling that had to be discarded as burn-in ranged between 5000 and 53000. Mean lengths of burn-in ranged between 8100 and 13866 rounds in the analyses of the different datasets. Mean, minimum and maximum length of burn-in were lowest in the analyses of dataset A2 and highest in the analyses of dataset B1.

Mean, minimum and maximum ESS of heritabilities and additive genetic correlations by dataset and quality of genotype information on T2 are given in Tables 1 and 2. For heritability of the continuous trait (T1) ESS ranged between 286.9 and 858.4 in the analyses of data on animals from one generation (datasets A1, B1 and C1) and between 1222.7 and 1949.0 in the analyses of data on animals from two subsequent generations (datasets A2, B2 and C2). For heritability of the binary traits ESS ranged between 23.0 and 139.3 for T2, between 24.1 and 147.5 for T3, between 15.4 and 115.2 for T4, and between 50.6 and 289.6 for T5 in the analyses of all datasets. Mean ESS was 810.7 to 1325.4 for the additive genetic correlation between T1 and T5, 175.7 to 500.7 for the additive genetic correlations of T1 with T2, T3 and T4, and 86.4 to 376.2 for the additive genetic correlation between traits T2, T3, T4 and T5. Significant influence on ESS was determined for the analyzed dataset (P < 0.001), but not for the quality of genetic marker information. There were no significant differences between ESS for all heritabilities and all additive genetic correlations in analyses of datasets B1 and C1 and between ESS for all heritabilities and all additive genetic correlations but r_{g12} in analyses of datasets B2 and C2.

Mean MCE was 0.016 to 0.017 for heritability of T1 in analyses of datasets B1 and C1, and 0.002 to 0.010 for heritability of T1 in analyses of datasets A1, A2, B2 and C2 and heritabilities of the binary traits (T2 to T5) in analyses of all datasets. Mean MCE for additive genetic correlations ranged between 0.001 and 0.004 in all analyses.

### Parameter estimates

Genetic parameter estimates differed little between scenarios r0p9, r1p9 and r0p7. In the analyses of the six datasets ranges of heritability estimates were h^{2} = 0.487â€“0.667 for T1, h^{2} = 0.058â€“0.171 for T2, h^{2} = 0.193â€“0.343 for T3, h^{2} = 0.081â€“0.170 for T4, and h^{2} = 0.210â€“0.354 for T5. Ranges of estimates for the simulated additive genetic correlations were r_{g12} = 0.009â€“0.327, r_{g13} = 0.102â€“0.316, r_{g14} = -0.295 to -0.034, and r_{g45} = -0.506 to 0.150.

Mean, minimum and maximum bias of heritabilities and additive genetic correlations by dataset and quality of genotype information on T2 are given in Tables 3 and 4. Bias of heritability estimates for T1 ranged between -0.075 and 0.221 in the analyses of datasets A1, B1 and C1, and between -0.089 and -0.021 in the analyses of datasets A2, B2 and C2. Mean bias of heritability estimates for T2 was -0.134 to 0.205 in the analyses of datasets A1, A2, B1 and B2, 0.94 to 0.99 in analyses of dataset C1, and 0.47 to 0.52 in analyses of dataset C2. Mean bias of heritability estimates was -0.057 to 0.024 for T3, 0.065 to 0.231 for T4, and -0.018 to 0.095 for T5 in all analyses. Extremes of negative and positive bias of heritability estimates were -0.414 and 1.811 for T2, -0.233 and 0.316 for T3, -0.201 and 0.649 for T4, and -0.156 and 0.423 for T5. Additive genetic correlations between T1 and T2, T1 and T3, and T1 and T4 had a mean bias of -0.503 to 0.177 and a bias range from -0.958 to 0.637. Additive genetic correlation between T4 and T5 had a mean bias of 0.127 to 0.376 and a bias range from -1.748 to 1.532. The analyzed dataset had a significant influence on the bias of heritability estimates for T1, T2, T4 and T5 (P < 0.01) and of estimated additive genetic correlations between T1 and T2, T1 and T3, and T1 and T4 (P < 0.001). Bias of heritability estimate for T2 was further significantly dependent on the quality of genetic marker information (P < 0.01), with lower means in scenario r0p7 than in scenarios r0p9 and r1p9.

## Discussion

### Mixed linear-threshold model analyses via Gibbs sampling

Simulated categorical, continuous and molecular genetic data were used for multivariate estimation of genetic parameters in mixed linear-threshold animal models via Gibbs sampling. General statements on the properties of multivariate analyses in mixed linear-threshold models require validation using independent sample datasets. For this study, simulation of multiple populations per study scenario and random sampling of one set of datasets per population was straightforward, but computationally expensive. In a preliminary study we therefore tested an alternative approach, with simulation of one population per study scenario and repeated random sampling of sets of datasets from one population. Because results regarding effective sample sizes and biases were not statistically different under these two approaches (results not shown), the computationally less expensive approach was chosen for the main study, i.e. replicate datasets were generated by repeated random sampling from single multi-generation populations.

Datasets used for the analyses differed in the number of animals with trait information, the distribution of animals with trait information and the availability of marker genotype information. The simulated pedigree structure resembled that of the Hanoverian Warmblood horse, and simulated data resembled the distribution of radiographic findings in the limbs of young Warmblood riding horses [18]. The traits of interest were binary, but one continuous trait was included in the multivariate analyses in order to improve mixing and convergence of the Gibbs sampler. Even in the analyses of the smallest datasets burn-in periods of 5000 to 15000 rounds were mostly sufficient for the Gibbs chains to converge. However, effective sample size of heritability of the continuous trait was considerably larger than effective sample sizes of heritabilities and additive genetic correlations of the binary traits. Effective sample sizes mentioned in literature for heritabilities of binary traits and uni- or bivariate animal threshold models were often in the same order despite larger numbers of informative animals and longer chain lengths [27â€“30]. In a previous simulation study inclusion of a continuous trait in animal threshold model analyses of binary traits had a positive effect on convergence, but not on effective sample size [8]. We found significant dependence of effective sample sizes on amount and distribution of available information. Given a certain number of animals with trait records, effective sample sizes for heritabilities were larger when animals were from two subsequent generations instead of one generation. There was no such clear effect on effective sample sizes for additive genetic correlations. Despite fixation of residual covariances to zero, effective sample sizes for additive genetic covariances and correlations were in many cases larger than effective sample sizes for additive genetic variances and heritabilities of the binary traits. Results from literature are not consistent in this respect [31].

### Genetic parameter estimates

Bias of genetic parameter estimates is the major argument against transformation of linear model estimates and in favor of threshold model analysis of binary traits [e.g. [1, 2]]. Heritabilities of the continuous trait and the moderately heritable binary traits (h^{2} = 0.25) were over- or underestimated by maximally 10 percent. If heritability of the binary trait was lower (h^{2} = 0.10) and only phenotype information was considered, mean bias increased to up to 25 percent over- or underestimation. These values are in agreement with literature [1, 10, 32] and clearly smaller than [2] or similar to [22] reported bias of transformed linear model estimates. Inclusion of fixed genotype effect in the model resulted in marked overestimation of heritability of the QTL trait. If trait information was available on animals in only one generation, upward bias was larger than 90 percent, indicating failure to reliably estimate heritability of the polygenic component. Heritability estimates were close to simulated overall heritabilities. If trait information was available on animals in two subsequent generations, i.e. to parents and offspring, upward bias of polygenic heritability estimates was reduced to about 50 percent.

Bias of additive genetic correlations between continuous and binary traits was dependent on heritability of the binary trait, but largely independent of the sign of the simulated correlation. Low heritability of the binary trait (h^{2} = 0.10) resulted in downward bias of continuous-binary correlation estimates by up to 50 percent, i.e. estimates for both positive and negative additive genetic correlations were closer to zero than the simulated values. Estimates for the positive additive genetic correlation between the continuous and the moderately heritable binary trait (h^{2} = 0.25) were on average less biased, with upward bias of up to 18 percent in the small datasets and downward bias of up to 8 percent in the larger datasets. Ranges of biases were similar for all continuous-binary correlation estimates. Estimates for the positive additive genetic correlation between two of the binary traits showed considerable variation of bias and were on average overestimated by 13 to 38 percent. Variation of bias was larger if information on animals from only one generation was considered, and overestimation was larger if information on animals from two subsequent generations was considered. Underestimation of positive additive genetic correlation between one continuous trait of moderate heritability (h^{2} = 0.25) and binary traits of low heritability (h^{2} = 0.05 or 0.10) and low prevalence (0.05 or 0.15) has been reported previously [33]. However, in this and a similar study estimates for positive additive genetic correlation between the two binary traits were biased downward as well [2, 33]. Differences in simulation parameters and method of analysis may be responsible for the different results. In contrast with our study, simulated additive genetic correlations were all 0.5, residual correlations of 0.3 or 0.2 or -0.2 were simulated, and analyses were performed in linear animal model with residual maximum likelihood (REML). Theoretically, estimates for additive genetic correlations should not be affected by violation of assumption of normality when using linear models for analysis of binary data [21, 34]. However, opposite directions of biases seem to arise from use of linear and threshold models for estimation of genetic correlations between binary traits.

### Amount and structure of data

Amount and structure of available information had little influence on accuracy of heritability estimates for the continuous trait and the binary traits of moderate heritability. Increase of the amount of available information increased accuracy of heritability estimates for the binary traits of low heritability and of additive genetic correlation estimates between continuous and binary traits, but also increased the upward bias of additive genetic correlation estimates between two binary traits. Influence of quality of genetic marker information on accuracy of heritability estimates for the binary QTL trait and additive genetic correlation estimates between the continuous trait and the binary QTL trait was less distinct. Given high PIC of genetic markers, estimates were largely unaffected by presence or absence of recombination between markers and QTL. Decrease of PIC resulted in lower heritability estimates, i.e. increased negative bias or decreased positive bias of heritability estimates, and larger range of bias of heritability estimates for the QTL trait. Estimates of additive genetic correlations between the continuous trait and the binary QTL trait were least biased when phenotype and genotype information was considered, genotype information included highly informative markers, and trait information referred to animals from two subsequent generations. There are no reports of similar investigations on influences on accuracy of genetic parameter estimates. Simulated heritabilities and additive genetic correlations were aligned with results of previous real data analyses and kept constant in this study. Furthermore, we simulated only four of the ten additive genetic correlations and no residual correlations in the multivariate setting of one continuous and four binary traits. Correlations between the binary QTL trait and another binary trait were not simulated. Impact of quality of genetic marker information on accuracy of additive genetic correlation estimates between two binary traits, one of which is influenced by QTL, requires further investigation. Simulation can be extended by modification of heritability levels, closeness and sign of genetic correlations and inclusion of more than one QTL trait. This study documents general feasibility of combined use of phenotype and genotype data for estimation of genetic parameters in the horse using multivariate mixed linear-threshold animal models and Gibbs sampling.

## Conclusion

It is feasible to perform multivariate estimation of genetic parameters for categorical, continuous and molecular genetic data in mixed linear-threshold animal models via Gibbs sampling using data and pedigree structures similar to those encountered in the Hanoverian Warmblood horse. Use of trait information on two subsequent generations of animals can increase accuracy of parameter estimates more than merely increasing the number of informative animals from one generation. Impact of different quality of genetic marker information with respect to recombination rate and PIC was minor in the scenarios studied. Consideration of marker genotype information as a fixed effect in the model is likely to result in overestimation of polygenic heritability of a QTL trait, but may be advantageous for quantification of additive genetic correlations between traits of interest. Improved identification of agonistic and antagonistic genetic correlations between traits of interests will facilitate design of effective multiple trait selection schemes.

## Methods

### Data simulation

Simulated data, which resembled the situation encountered in the population of the Hanoverian Warmblood horse, were used for this study. Characteristics of this population include a high percentage of artificial insemination and coexistence of many small breeding farms and few large studs. The number of horses used for breeding is rather low when compared to the total number of horses, but data collection for genetic analyses is not limited to broodmares and sires. We simulated fixed effects, residual and additive genetic variances for one continuous trait (T1) and liabilities of four categorical traits (T2 to T5), and QTL effects for the liability of one of the categorical traits (T2). Heritabilities were set to 0.50 (T1), 0.25 (T3, T5) and 0.10 (T2, T4). Simulated additive genetic correlations were positive between T1 and T2 and between T1 and T3 (r_{g} = 0.20), and negative between T1 and T4 and between T4 and T5 (r_{g} = -0.20). For T2 two QTL and two flanking markers per QTL with five randomly distributed and equally prevalent alleles each were simulated. Linkage between markers and corresponding QTL was assumed, with one of the marker alleles being associated with the unfavorable QTL allele, therewith indicating increased probability of affection with regard to T2. We assumed that the closely linked markers and the unfavorable allele of each of the markers had been identified in previous association studies, allowing the distinction between individuals homozygous positive, heterozygous or homozygous negative for the unfavorable allele of the respective marker. Only additive effects and no dominance effects were simulated, and total QTL variance was set equal to the polygenic variance. In order to study the effects of different quality of QTL information on the estimation of genetic parameters, three scenarios with respect to recombination rate between markers and QTL (r) and polymorphism information content (PIC) of markers were simulated: r = 0.00 and PIC = 0.9 for all markers (r0p9); r = 0.01 and PIC = 0.9 for all markers (r1p9); r = 0.00 and PIC = 0.7 for all markers (r0p7). In this context, a recombination rate of 0.01 between a marker and the QTL will mean that given the unfavorable marker allele, the QTL allele will be unfavorable with probability p = 0.99. A recombination rate of 0.00 between a marker and the QTL will mean that given the unfavorable marker allele, the QTL allele will be unfavorable with probability p = 1.00. This case is identical with the situation in which the causal mutation is definitely known. For the categorical traits, simulation of liabilities on the linear scale was followed by dichotomization in order to obtain trait prevalences of 25% (T_{2}, T_{5}) or 10% (T_{3}, T_{4}). Details on the simulation procedure are summarized in Table 5.

Each of the three simulated populations included 7 generations and 40000 animals per generation. Samples of 10000 animals were drawn at random from the fourth generation, and the pedigree of these animals was traced back over three generations. Within each of ten replicates which were generated this way, six different datasets were created for the genetic analyses. Dataset A1 included all 10000 animals with records for the continuous trait and the four binary traits, information on the fixed effects of sex and contemporary group, and pedigree information over three generations. Dataset B1 included 5000 animals, randomly chosen from the animals included in dataset A1, with respective information on traits, sex, contemporary group and pedigree. Dataset C1 included the same animals and the same information as dataset B1, and additional information on the marker genotype of the animals. Dataset B2 included the same animals as datasets B1 and C1 plus their parents with information on traits, sex, contemporary group and pedigree. Dataset C2 included the same animals and the same information as dataset B2, and additional information on the marker genotype of the animals. Dataset A2 included the same animals as datasets B2 and C2 plus the additional 5000 animals which were included in dataset A1 with information on traits, sex, contemporary group and pedigree. The average size of paternal halfsib groups ranged between 16.30 and 29.28, and the average size of maternal halfsib groups ranged between 1.23 and 1.55 among the animals with trait records in the six datasets. Distribution of trait and pedigree information in the different datasets used for the genetic analyses is summarized in Table 6.

### Statistical analyses

Genetic parameters were estimated using Gibbs sampling with the threshold version of the Multiple Trait Gibbs Sampler for Animal Models (MTGSAM) [35], a software which supports multivariate genetic analyses of any combination of continuous and categorical traits. Random and residual effects are assumed to be normally distributed, and flat priors are used for the fixed effects. The user can specify starting values and priors for additive genetic and residual (co)variance matrices. For our analyses, a starting value of one was chosen for all additive genetic variances, a starting value of zero was chosen for all additive genetic covariances, and the residual covariances between all traits were fixed to zero. In uni- and multivariate binary threshold models identifiability of the model is ensured by fixing the values for the thresholds and the residual variances to values of zero and one, respectively [36]. Methods for effective sampling of the residual covariance matrix subject to the restriction of diagonal elements fixed at one are still under development. However, fixation of the residual covariances for this study was justified by the results of previous real data analyses [14] and fit our simulated data. Because residual covariances were negligible in the real data and were accordingly set to zero in the data simulation, possible gain of further information seemed to be disproportionate to the additional costs of sampling of the residual covariances. Residual covariances were therefore fixed to zero in this study. For the genetic covariance matrix a proper prior using an inverse Wishart distribution with minimum shape parameter (i.e. Î½_{IW} = 7) was adopted in order to ensure posterior propriety. The fixed effects of sex and contemporary group were considered in all analyses. The fixed effect of marker genotype was considered in the analyses of datasets C1 and C2 only, distinguishing between individuals homozygous negative for the unfavorable alleles of all genetic markers, individuals heterozygous for the unfavorable allele of at least one of the genetic markers, and individuals homozygous for the unfavorable allele of at least one of the four genetic markers.

*y*_{
ijlm
}= *Î¼* + *SEX*_{
i
}+ *CONT*_{
j
}+ *a*_{
l
}+ *e*_{
ijlm
}(datasets A1, A2, B1 and B2), and

*y*_{
ijklm
}= *Î¼* + *SEX*_{
i
}+ *CONT*_{
j
}+ *QTL*_{
k
}+ *a*_{
l
}+ *e*_{
ijklm
}(datasets C1 and C2),

with *y*_{
ijlm
}(*y*_{
ijklm
}) = observation on trait T1 (continuous) or on trait T2, T3, T4 or T5 (binary) for the l^{th} animal,

*Î¼* = model constant,

*SEX*_{
i
}= fixed effect of the sex of the animal (i = 1â€“2),

*CONT*_{
j
}= fixed effect of the contemporary group (j = 1â€“5 in analyses of datasets A1, B1 and C1; j = 1â€“8 in analyses of datasets A2, B2 and C2),

*QTL*_{
k
}= fixed effect of the QTL marker genotype (n = 1â€“3),

*a*_{
l
}= random additive genetic effect of the l^{th} animal (l = 1â€“30533 to 30766 for dataset A1, l = 1â€“37028 to 37292 for dataset A2, l = 1â€“19791 to 20050 for datasets B1 and C1, and l = 1â€“26253 to 26664 for datasets B2 and C2), and

*e*_{
ijlm
}(*e*_{
ijklm
}) = random residual.

The total length of the Gibbs chain was set to 205000 in all analyses, and all samples after 5000 rounds of burn-in were saved. Convergence of the Gibbs chain and the need for additional rounds of burn-in to be discarded was checked by visual inspection of sample plots. Effective sample size (ESS) and Monte Carlo error (MCE) was calculated for all (co)variance estimates by the times series method implemented in the post-Gibbs analysis program POSTGIBBSF90 [37] with a thinning rate of ten. Un-thinned chains were used to calculate posterior means of additive genetic (co)variance, heritability and additive genetic correlation estimates. Posterior means rather than modes were chosen as point estimates, because in preliminary analyses the means were in most cases closer to the true, i.e. simulated, values than the modes, a finding which is in agreement with previous studies [8, 38]. Bias of heritability and additive genetic correlation estimates was calculated as the mean relative deviation of the estimated values (par_{est}) from the true, i.e. simulated, values (par_{true}).

*bias* = (*par*_{
est
}- *par*_{
true
})/*par*_{
true
}

The influence of data structure and quality of genetic marker information on ESS and bias was tested via analysis of variance using the procedure GLM of Statistical Analysis Systems, (SAS), version 9.1.3 (SAS Institute, Cary, NC, USA, 2005). Effective sample size or bias were considered as dependent variable, and dataset (A1, A2, B1, B2, C1, C2) and quality of genetic marker information (r0p9, r1p9, r0p7) were considered as fixed effect.

## References

Abdel-Azim GA, Berger PJ: Properties of threshold model predictions. Journal of Anim Sci. 1999, 77: 582-590.

MÃ¤ntysaari EA, Quaas RL, GrÃ¶hn YT: Simulation study on covariance component estimation for two binary traits in an underlying continuous scale. J Dairy Sci. 1991, 74: 580-591.

Van Vleck LD: Estimation of heritability of threshold characters. J Dairy Sci. 1972, 55: 218-225.

Van Vleck LD, Gregory KE: Multiple-trait restricted maximum likelihood for simulated measures of ovulation rate with underlying multivariate normal distributions. J Anim Sci. 1992, 70: 57-61.

Hoeschele I, Tier B: Estimation of variance components of threshold characters by marginal posterior modes and means via Gibbs Sampling. Genet Sel Evol. 1995, 27: 519-540.

Matos CAP, Thomas DL, Gianola D, Tempelman RJ, Young LD: Genetic analysis of discrete reproductive traits in sheep using linear and nonlinear models. I. Estimation of genetic parameters. J Anim Sci. 1997, 75: 76-87.

Meijering A, Gianola D: Linear versus nonlinear methods of sire evaluation for categorical traits: a simulation study. Gen Sel Evol. 1985, 17: 115-132.

Luo MF, Boettcher PJ, Schaeffer LR, Dekkers JCM: Bayesian inference for categorical traits with an application to variance component estimation. J Dairy Sci. 2001, 84: 694-704.

Misztal I, Gianola D, Foulley JL: Computing aspects of a nonlinear method of sire evaluation for categorical data. J Dairy Sci. 1989, 72: 1557-1568.

Moreno C, Sorensen D, GarcÃa-CortÃ©s LA, Varona L, Altarriba J: On biased inferences about variance components in the binary threshold model. Gen Sel Evol. 1997, 29: 145-160.

Jensen J: Bayesian analysis of bivariate mixed models with one continuous and one binary trait using the Gibbs Sampler. Proceedings of the 5th World Congress on Genetics Applied to Livestock Production: 7â€“12. 1994, 18: 33-336. August

Simianer H, Schaeffer LR: Estimation of covariance components between one continuous and one binary trait. Gen Sel Evol. 1989, 21: 303-315.

Wang CS, Quaas RL, Pollak EJ: Bayesian analysis of calving ease scores and birth weights. Gen Sel Evol. 1997, 29: 117-143.

Stock KF, Distl O: Genetic correlations between osseous fragments in fetlock and hock joints, deforming arthropathy in hock joints and pathologic changes in the navicular bones of Warmblood riding horses. Livest Sci. 2006, 105: 35-43. 10.1016/j.livsci.2006.04.027.

Stock KF, Distl O: Genetic analyses of radiographic appearance of the navicular bones in the Warmblood horse. Am J Vet Res. 2006, 67: 1013-1019. 10.2460/ajvr.67.6.1013.

Willms F, RÃ¶he R, Kalm E: Genetische Analyse von Merkmalskomplexen in der Reitpferdezucht unter BerÃ¼cksichtigung von GliedmaÃŸenverÃ¤nderungen. 1. Mitteilung: ZÃ¼chterische Bedeutung von GliedmaÃŸenverÃ¤nderungen. ZÃ¼chtungskunde. 1999, 71: 330-345.

Winter D, Bruns E, Glodek P, Hertsch B: Genetische Disposition von GliedmaÃŸenerkrankungen bei Reitpferden. ZÃ¼chtungskunde. 1996, 68: 92-108.

Stock KF, Distl O: Prediction of breeding values for osseous fragments in fetlock and hock joints, deforming arthropathy in hock joints and pathologic changes in navicular bones of Hanoverian Warmblood horses. Livest Prod Sci. 2005, 92: 77-94. 10.1016/j.livprodsci.2004.08.009.

Stock KF, Distl O: Expected response to selection when accounting for orthopedic health traits in a population of Warmblood riding horses. Am J Vet Res. 2005, 66: 1371-1379. 10.2460/ajvr.2005.66.1371.

Dempster ER, Lerner IM: Heritability of threshold characters. Genetics. 1950, 35: 212-235.

Vinson WE, White JM, Kliewer RH: Overall classification as a selection criterion for improving categorically scored components of type in Holsteins. J Dairy Sci. 1976, 59: 2104-2114.

Stock KF, Hamann H, Distl O: Estimation of genetic parameters for the prevalence of osseous fragments in limb joints of Hanoverian Warmblood horses. J Anim Breed Genet. 2005, 122: 271-280. 10.1111/j.1439-0388.2005.00527.x.

Dekkers JCM: Commercial application of marker- and gene-assisted selection in livestock: strategies and lessons. J Anim Sci (E Suppl). 2004, 82: E313-E328.

BÃ¶neker C, LÃ¶hring K, Wittwer C, Distl O: Genome-wide search for markers associated with osteochondrosis in Hanoverian Warmblood horses.

Dierks C, Distl O: Identification of a quantitative gene locus on equine chromosome 4 responsible for osteochondrosis in hock joints of Hanoverian Warmblood horses.

Lande R, Thompson R: Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics. 1990, 124: 743-756.

Brka M, Reinsch N, Kalm E: Frequency and heritability of supernumerary teats in German Simmental and German Brown Swiss cows. J Dairy Sci. 2002, 85: 1881-1886.

Hansen M, Lund MS, Pedersen J, Christensen LG: Genetic parameters for stillbirth in Danish Holstein cows using a Bayesian threshold model. J Dairy Sci. 2004, 87: 706-716.

Lund MS, Jensen J, Petersen PH: Estimation of genetic and phenotypic parameters for clinical mastitis, somatic cell production deviance, and protein yield in dairy cattle using Gibbs sampling. J Dairy Sci. 1999, 82: 1045-1051.

Piles M, Rafel O, Ramon J, Varona L: Genetic parameters of fertility in two lines of rabbit of different reproductive potential. J Anim Sci. 2005, 83: 340-343.

Janssens S, Vandepitte W, Bodin L: Genetic parameters for litter size in sheep: natural versus hormone-induced oestrus. Gen Sel Evol. 2004, 36: 543-562. 10.1051/gse:2004016.

Hoeschele I, Gianola D, Foulley JL: Estimation of variance components with quasi-continuous data using Bayesian methods. J Anim Breed Genet. 1987, 104: 334-339.

Gates P, Johansson K, Danell B: "Quasi-REML" correlation estimates between production and health traits in the presence of selection and confounding: a simulation study. J Anim Sci. 1999, 77: 558-568.

Gianola D: Theory and analysis of threshold characters. J Anim Sci. 1982, 54: 1079-1096.

Van Tassell CP, Van Vleck LD: Multiple-trait Gibbs sampler for animal models: flexible programs for Bayesian and likelihood-based (co)variance component inference. J Anim Sci. 1996, 74: 2586-2597. [http://www.aipl.arsusda.gov/curtvt/mtgsam.html]

Harville DA, Mee RW: A mixed-model procedure for analyzing ordered categorical. Biometrics. 1984, 40: 393-408. 10.2307/2531393.

Tsurata S: POSTGIBBSF90. [http://nce.ads.uga.edu/~ignacy/newprograms.html]

Van Tassell CP, Casella G, Pollak EJ: Effects of selection on estimates of variance components using Gibbs sampling and Restricted Maximum Likelihood. J Dairy Sci. 1995, 78: 678-692.

## Acknowledgements

This work was supported by a fellowship of KFS within the Postdoc-program of the German Academic Exchange Service (Deutscher Akademischer Austauschdienst, DAAD). IH acknowledges financial support from the US National Institute of Health (GM66103-01).

## Author information

### Authors and Affiliations

### Corresponding author

## Additional information

### Authors' contributions

KFS designed and carried out the simulation study, and drafted the manuscript. OD conceived of the study, participated in its design, and revised the manuscript. IH participated in conception and design of the study, coordinated it, and revised the manuscript. All authors read and approved the final manuscript.

## Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Stock, K.F., Distl, O. & Hoeschele, I. Bayesian estimation of genetic parameters for multivariate threshold and continuous phenotypes and molecular genetic data in simulated horse populations using Gibbs sampling.
*BMC Genet* **8**, 19 (2007). https://doi.org/10.1186/1471-2156-8-19

Received:

Accepted:

Published:

DOI: https://doi.org/10.1186/1471-2156-8-19

### Keywords

- Quantitative Trait Locus
- Heritability Estimate
- Polymorphism Information Content
- Effective Sample Size
- Quantitative Trait Locus Allele