Skip to main content
  • Research article
  • Open access
  • Published:

Effects of single nucleotide polymorphism marker density on degree of genetic variance explained and genomic evaluation for carcass traits in Japanese Black beef cattle



Japanese Black cattle are a beef breed whose meat is well known to excel in meat quality, especially in marbling, and whose effective population size is relatively low in Japan. Unlike dairy cattle, the accuracy of genomic evaluation (GE) for carcass traits in beef cattle, including this breed, has been poorly studied. For carcass weight and marbling score in the breed, as well as the extent of whole genome linkage disequilibrium (LD), the effects of equally-spaced single nucleotide polymorphisms (SNPs) density on genomic relationship matrix (G matrix), genetic variance explained and GE were investigated using the genotype data of about 40,000 SNPs and two statistical models.


Using all pairs of two adjacent SNPs in the whole SNP set, the means of LD (r2) at ranges 0–0.1, 0.1–0.2, 0.2–0.5 and 0.5–1 Mb were 0.22, 0.13, 0.10 and 0.08, respectively, and 25.7, 13.9, 10.4 and 6.4% of the r2 values exceeded 0.3, respectively. While about 90% of the genetic variance for carcass weight estimated using all available SNPs was explained using 4,000–6,000 SNPs, the corresponding percentage for marbling score was consistently lower. With the conventional linear model incorporating the G matrix, correlation between the genomic estimated breeding values (GEBVs) obtained using 4,000 SNPs and all available SNPs was 0.99 for carcass weight and 0.98 for marbling score, with an underestimation of the former GEBVs, especially for marbling score.


The Japanese Black is likely to be in a breed group with a relatively high extent of whole genome LD. The results indicated that the degree of marbling is controlled by only QTLs with relatively small effects, compared with carcass weight, and that using at least 4,000 equally-spaced SNPs, there is a possibility of ranking animals genetically for these carcass traits in this breed.


Most economically important traits in beef cattle, including carcass traits, are controlled by many quantitative trait loci (QTLs), which usually have relatively small individual effects. For such traits, genomic evaluation (GE) and selection (GS), as proposed by Meuwissen et al.[1], is expected to chase the QTLs simultaneously using single nucleotide polymorphism (SNP) markers, given that at least one SNP is in linkage disequilibrium (LD) with each QTL. In concept, successful GS is expected to accelerate genetic improvement by reducing the generation interval and increasing the accuracy of genetic evaluation [2].

The recent development of various SNP chips has enabled high-throughput genotyping and allowed animal breeders to study and conduct GE and GS. By simulating 50,000 genome-wide high-density biallelic markers like SNPs, VanRaden [3] showed better performance of best linear unbiased prediction (BLUP) using a genomic relationship matrix (G matrix) relative to that using an additive relationship matrix (A matrix), based on pedigree information [4]. In dairy cattle, GS has already been adopted in some countries and is an effective method for increasing the rate of genetic improvement. In beef cattle, on the other hand, its adoption has been slower, because the accuracy of the genomic estimated breeding value (GEBV) is much lower because of less availability of sires with highly accurate results in progeny tests.

Habier et al.[5] proposed the use of lower-density and equally-spaced SNP panels for effective GE, irrespective of trait. If such SNPs can explain substantial proportions of genetic variations in carcass traits and be almost as effective as higher-density panels in evaluating GEBVs, their lower cost would make them useful, especially in beef breeding females. Traits that are measured after slaughter, as well as those that are difficult or expensive to record, are also traits for which GS could substantially improve genetic gain. However, for carcass traits, including degree of marbling in beef cattle, the effects of differing densities of SNPs used to estimate genetic variance and GE have been poorly studied.

Japanese Black cattle are the primary breed of Wagyu, which are the modern native beef cattle in Japan, and are well known for meat qualities such as marbling. This breed has also been distributed for beef production in North and South America and Australia. In Japan, native Japanese cattle were crossed with British and Continental breeds during an approximately 10 year period in the early 1900s, and then, under a completely closed breeding system, the four breeds of Wagyu, including the Japanese Black, were fixed through strict selection over many years [6]. Moreover, since the relaxation of beef import restrictions in Japan in 1991, beef quality traits such as marbling have received more emphasis in the domestic production of the Japanese Black. In the same year, genetic evaluation of carcass traits using the mixed model methodology began [7]. These factors have led to intensive use of a low number of sires with high predicted breeding values for meat quality, and consequently a sharp decline in effective population size [8].

The accuracy of GE depends on the extent of LD between SNP markers and QTLs, the number of animals with phenotypes and genotypes in the reference population, the heritability, and the distribution of QTL effects for the trait [9]. The first of these factors is closely related to effective population size, and the density of SNP markers used that can be under the control of animal breeders. In this study, effects of density of equally spaced genome-wide SNPs on genetic variance explained and GE were investigated for carcass traits in beef cattle, using Japanese Black data and assuming two statistical models.

Results and discussion

Extent of linkage disequilibrium

For the extent of LD, summary statistics of the squared correlation (r2) and the distance (d) for all pairs of two adjacent SNPs in each SNP set are presented in Table 1. Figure 1 depicts the changes in means of r2 and d, together with all values of r2. With all available SNPs, the means of r2 and d were 0.204 and 0.07 Mb, respectively. When the number of SNPs used was decreased to 20,000, 10,000, 8,000, 6,000 and 4,000, the average r2 values became 0.144, 0.096, 0.086, 0.077 and 0.066, respectively, and the corresponding means of d were 0.13, 0.26, 0.33, 0.44 and 0.65 Mb in turn. With all the SNPs, the means of r2 at ranges 0–0.1, 0.1–0.2, 0.2–0.5 and 0.5–1 Mb was 0.22, 0.13, 0.10 and 0.08, respectively, and 25.7, 13.9, 10.4 and 6.4% of the r2 values exceeded 0.3, respectively.

Figure 1
figure 1

Change in mean of r 2 against mean of d (black circles), together with all values of r 2 (gray circles).

Table 1 Extent of linkage disequilibrium and distance between two adjacent SNPs and correlations for elements of G matrices

Investigating the overall average of r2 using 2,670 SNPs for eight breeds, including Japanese Black cattle, McKay et al.[10] reported that the average for 65 Japanese Black cattle was approximately 0.58 for all SNP pairs ≤ 1 kb apart, and 0.07 for all SNP pairs ≤ 2 Mb apart. In the current analysis, the mean r2 for pairs of two adjacent SNPs, ≤ 1 kb–≤ 2 Mb apart, was 0.81–0.20, and most of the average r2 values obtained using all the SNPs in each given distance range were higher than those reported for the eight breeds (data not shown). Furthermore, in Dutch and Australian Holstein-Friesian, Australian Angus and New Zealand Friesian and Jersey cattle, using about 3,000–7,000 SNPs, the average r2 of 0.35 for inter-marker distances of 0–0.01 Mb declined to 0.22 for 0.02–0.04 Mb and 0.14 for 0.04–0.1 Mb [11]. As shown in Figure 1, a largely similar pattern of decreasing LD was observed with the current data for Japanese Black cattle. However, it should be noted that most samples used in these previous studies were from a subpopulation, especially in representative dairy and beef breeds, or many small-scale families of the breed, including sires in some cases, which would be a factor responsible for constructed haplotype blocks in the population. In contrast to this, the samples used in the current study were collected at two large-scale meat markets in Japan, to which fattened Japanese Black animals are sent from all over Japan. Therefore, samples used in this study are considered to reflect the effective size and LD extent of the national Japanese Black population.

Using the genotype data from 18,098 SNPs with minor allele frequencies (MAF) greater than 10% for 25 artificial insemination (AI) sires of Brazilian Gyr dairy cattle, Silva et al.[12] found that means of r2 and d for two adjacent SNPs ranged from 0.24–0.17 and from 0.12–0.18 Mb, respectively, at the autosome-wide level. In the current study of Japanese Black cattle, also with a relatively low effective population size, the mean r2 was nearly the same, but the mean d was about half that at the autosome-wide level (data not shown). Silva et al.[13] also observed that at ranges 0–0.1, 0–0.2, 0–0.5 and 0–1 Mb, mean r2 was 0.20, 0.18, 0.14 and 0.11, respectively, and that the proportion of SNP pairs exhibiting r2 higher than 0.3 was 22.9, 19.7, 14.1 and 9.5% for the same ranges, respectively. In this study, for the same ranges, mean r2 was 0.22, 0.21, 0.20 and 0.20, respectively, and the proportion of SNP pairs was 25.7, 24.1, 23.6 and 23.6%, respectively. From the current results, it is therefore likely that the extent of LD between more distant SNPs is relatively higher in Japanese Black cattle.

In addition, calculating the r2 of all possible SNP pairs by chromosome, from more than 30,000 SNPs distributed genome-wide, the extent of LD and the structure of haplotype blocks were examined for 19 breeds, including Indicus, African and the composite cattle, in addition to some dairy and beef breeds [13], and for Angus, Charolais and crossbred beef cattle [14]. Also, for Nellore cattle, whole genome LD was investigated using about 450,000 SNPs [15]. Yan et al.[16], using 632 maize lines genotyped for 1,229 SNP markers, demonstrated an increase in r2 values between the markers, especially between closer SNP pairs, with an increasing MAF threshold and an increase, particularly between more distant pairs, with decreasing sample size. The MAF threshold we used was smaller than those in previous studies [1015], and sample size in the current study was larger than those for most of used in these studies. In this study, we only calculated the r2 of all pairs of two adjacent SNPs, avoiding a heavy computational burden. When our limited results were compared with the results of the previous studies, we found that, while the extent of whole genome LD in Zebu cattle, such as the Nellore, was relatively low, whole genome LD in the Japanese Black was likely to be higher than, or equal to, the whole genome LD in Angus, which was higher than in Charolais.

Change in the genomic relationship matrix

Table 1 also shows correlations between the diagonal (r D ), upper triangular (r N ) and all the elements (r A ) of a given G matrix, and the corresponding elements of the G matrix constructed using all available SNPs. The r N was 0.51, 0.79, 0.88, 0.94, 0.97 and 0.99 using 100, 500, 1,000, 2,000, 4,000 and 8,000 SNPs, respectively. The changes in r D , r N and r A with increasing SNP density are depicted in Figure 2. A correlation of 0.73 was observed between r N and mean r2, showing a very high linear relationship especially for SNP sets with smaller numbers. Analysing data from the 50K chip for 1,707 AI sires, along with the records of 698 steers of the Angus breed, Rolf et al.[17] showed that the average correlation of upper triangular elements between G matrices constructed from all available SNPs, and from its subset of SNPs selected randomly, reached nearly 0.8 using 1,000 SNPs, and exceeded 0.9 using 2,500 SNPs, suggesting that 2,500–10,000 SNPs distributed throughout the genome are required to robustly estimate a G matrix for feed efficiency traits with heritability ranging from 0.09–0.14. The changing patterns in Figure 2 are similar to those of [18], although we used a scheme of equally-spaced selection in the number of SNPs. Compared with [17], however, the correlation (r N ) in the current study reached 0.9 using a lower number of SNPs and 0.99 using 8,000 SNPs, which would, at least in part, be due to a smaller effective size of the Japanese Black population.

Figure 2
figure 2

Changes in r D , r N and r A with increasing density of SNPs used to construct G matrix. Circles: r D ; triangles: r N ; squares: r A .

Genetic variance explained

Results of variance component estimation for carcass weight and marbling score, using all available SNPs or subsets, by the conventional linear model (model 1) are presented in Tables 2 and 3, respectively. Figure 3 depicts the changes in proportions of estimated genetic variances for both traits. Genetic and residual variances, or σ g 2 and σ e 2 , estimated with the G matrix using all available SNPs, were 1096.3 and 928.1 kg2 for carcass weight and 8.30 and 3.81 score2 for marbling score, respectively, which resulted in heritability estimates of 0.54 and 0.68, respectively. These estimates of heritability were similar to those previously estimated in the Japanese Black population using pedigree information [18], although we note that our estimate for marbling score might be somewhat overestimated because of the distribution of the records used. Heritability of human height was estimated to be 0.45 using 565,040 autosomal SNPs from over 10,000 unrelated individuals [19], which is lower than the estimates of 0.8–0.9 reported in previous family and twin studies [20]. The effective population size of humans was estimated to be 10,000 [21], and therefore, for human polygenic traits like height, many more SNPs for a much higher LD level with causative variations would be needed to capture the total genetic variation. In contrast, the effective population sizes of cattle breeds would be much smaller, usually 100 or lower. In the case of Japanese Black cattle, where the effective size is only about 30 [8], it is likely that a large part of the genetic variance for the carcass traits studied here could be captured by using all available SNPs within the 50K chip.

Figure 3
figure 3

Changes in proportions of estimated genetic variances in model 1, with increasing density of SNPs used to construct G matrix. Circles: carcass weight; triangle: marbling score.

Table 2 Variance components estimated with model 1 for carcass weight
Table 3 Variance components estimated with model 1 for marbling score

As expected, as the number of SNPs used became higher, estimated residual and genetic variances gradually decreased and increased, respectively. This is mainly because the higher the SNP marker density, the higher the LD levels between SNP markers and true QTL regions. For instance, in the case of carcass weight, correlations between mean r2 and the estimates of σ e 2 and σ g 2 in model 1 were −0.79 and 0.80, respectively. For both carcass traits, considering standard errors, a largely constant value of phenotypic variance σ p 2 was obtained, even with the different numbers of SNPs used. It was also observed that the value of genetic variance per SNP became larger when fewer SNPs were used (data not shown), which would be partly due to the additional variance explained by the correlated effect of SNPs around those used to construct the G matrix. However, the proportion of genetic variance explained by SNPs decreased slightly with an increase from 4,000 to 6,000 and from 6,000 to 8,000 SNPs, for carcass weight and marbling score, respectively (Figure 3). This could be interpreted partly as a reflection of the genetic background and architecture, or the distribution of real QTL regions and their effects relevant to each trait, in Japanese Black cattle.

For carcass weight, approximately 90 and 97% of the genetic variance estimated with the G matrix using all available SNPs was obtained using 4,000–6,000 and 10,000 SNPs, respectively. For marbling score, the proportion of the genetic variance accounted for by a given number of SNPs was consistently low when compared with carcass weight, particularly when a relatively small number of SNPs were used. This finding may indicate that the degree of marbling is controlled by only QTLs with relatively small effects, compared with the carcass weight. In fact, three QTLs for carcass weight, called CW-1, -2 and -3, have been identified in genome-wide association studies (GWAS), in which their allele substitution effects were relatively large [2225], whereas no such QTLs have been detected for the degree of marbling until now. Using 10,000 SNPs, however, as much as 92% of genetic variance in marbling score was accounted for in this study.

Accuracy of genomic estimated breeding value

Correlations and linear regressions on GEBVs obtained with the different densities of SNPs used are shown in Table 4. When 4,000 and 10,000 SNPs were used in model 1, the correlations between the GEBVs and those obtained using all available SNPs were both 0.99 for carcass weight and 0.98 and 0.99 for marbling score, respectively, with the corresponding linear regression coefficients of 0.94 and 0.98 for the former trait and 0.82 and 0.94 for the latter trait. This showed a trend of underestimation of GEBVs with a lower number of SNPs used, particularly for the latter trait. The different levels of underestimation of GEBVs could be because of different genetic architectures of the two traits. As stated previously, while three QTLs with relatively large effects on carcass weight in Japanese Black cattle have been found [2225], no such QTLs have been found for degree of marbling. Thus, considering the results of the estimated genetic variances, the lower underestimation of GEBVs observed for carcass weight relative to marbling score might reflect the observation that relatively larger effects of SNPs linked to the carcass weight QTLs could be better captured, even with a lower number of SNPs.

Table 4 Correlation between and linear regression of GEBVs obtained with model 1 using a given SNP set and all available SNPs

Use of imputed genotype information

Accuracy of imputation, expressed as the percentage of correctly predicted genotypes, was 93.4 ± 2.5 and 97.4 ± 1.2% (average ± standard deviation) for 38,502 genotypes imputed from 4,000 and 10,000 SNPs, respectively. Variance components estimated using the imputed genotype data are shown in Tables 2 and 3. Scatter plots of the GEBVs obtained for carcass weight and marbling score against those obtained using all the available SNPs without imputation are shown in Figure 4. Use of the imputed data resulted in a similar level of estimated variances as the level obtained using all the SNPs without imputation. Correlations between the GEBVs obtained with imputation and those obtained from all the SNPs without imputation were higher than 0.99 for both the traits. Imputation of SNP genotypes from low density to high density is now a standard procedure for using low-density marker panels in GS schemes [5, 26]. Our results using the imputed SNP information support the use of the imputation from low-density marker panels.

Figure 4
figure 4

Scatter plots for GEBVs obtained with model 1 using 4,000 (top panels) or 10,000 (bottom panels) SNPs and those using all available SNPs, for carcass weight (left panels) and marbling score (right panels) with and without imputation. Black circles: with imputation; gray circles: without imputation.

Estimation using a threshold model

The proportions of genetic variances to phenotypic variances in the underlying scale estimated for marbling score using the threshold model (model 2) are presented in Table 5. For both the binary and more categorical treatments, the estimations were successful only when relatively small numbers of SNPs were used. The changes in the estimated proportions, relative to the proportions estimated using model 1, are depicted in Figure 5. The values of some correlations between GEBVs obtained with models 1 and 2 are listed in Table 6. It has been noted that generalized linear animal models are plagued by extremely slow mixing in implementations of Markov chain Monte Carlo methods [27]. For both the successful and unsuccessful estimations, a single chain of 10,000,000 samples was run with the first 3,000,000 samples being discarded. The results showed that the estimates presented in Table 5 were not substantially different from those obtained, while there was still no convergence for any of the unsuccessful cases. The failures of the estimations using the larger numbers of SNPs may be attributed largely to the limited number of animals used in this study.

Table 5 Proportion of genetic to phenotypic variances estimated with model 2 for marbling score
Figure 5
figure 5

Changes in proportions of phenotypic variances explained with model 1 and 2 for marbling score, with increasing density of SNPs used to construct G matrix. Circles: model 1; triangle: model 2 (binary); square: model 2 (categorical with 11 categories).

Table 6 Correlations between GEBVs obtained with models 1 and 2 using a given SNP set and all available SNPs for marbling score

The proportions of the genetic to the phenotypic variances estimated with model 2 were observed to be consistently larger than the corresponding estimates with model 1. The estimates of heritability for marbling score of Japanese Black cattle reported in the literature [18] indicate that the genetic variances in the underlying scale obtained with model 2 in this study may be somewhat inflated; for instance, 0.66 with 2,000 SNPs for the binary case and 0.82 with 10,000 SNPs for the case of 11 categories . However, for a given set of selected SNPs, the correlations between GEBVs obtained with model 2 and GEBVs obtained with model 1 using all the SNPs were found to be similar to the corresponding values in the analyses with model 1. The correlations between the GEBVs obtained with both the models using a given set of SNPs were very high overall. These correlations between the GEBVs would generally support the validity of the results for marbling score obtained with model 1.

Overall discussion

Equally-spaced panels with various densities are already used in many situations. Such panels have the advantage of being applicable irrespective of trait and population, for which the density of SNPs plays an important role in GE and GS according to the extent of LD between SNP markers and real QTLs. At this stage, the 50K chip is most commonly used. Using higher density-panels, such as the 50K and 770K, may account for a high to very high proportion of genetic variation. In addition, as shown also in this study, use of genotypes imputed from low-density to high-density can take account of genetic variance largely. However, while increasing density of SNPs used could increase the extent of the LD, it could also increase the number of uninformative and collinear SNPs [28]. Thus, for robust prediction it is important to exclude collinear nuisance SNPs, since their inclusion in the analyses may increase error and sampling variances in estimation of SNP effects on the training population or allow a single QTL to be attributed to a number of highly correlated SNPs, which would be likely to reduce the predictability of GEBVs and its persistence across generations. Schulz-Streeck et al.[29] confirmed this by simulation, finding that excluding the markers with negligible or inconsistent effects by pre-selection increases the accuracy of GE.

From this perspective, even the 3K chip has been suggested to be a useful tool in dairy GE [30]. Also, evaluating the predictive ability of subsets of SNPs, Moser et al.[31] concluded that accurate GE of Holstein bulls and cows can be accomplished with 3,000–5,000 equally spaced SNPs. From the viewpoint of the relationship of r2 to the accuracy of GEBV, a simulation study showed that while the accuracy of GEBVs for unphenotyped animals ranged from about 0.65, for the mean r2 of 0.1 between adjacent markers, to more than 0.80, for the mean r2of 0.2, the accuracy for phenotyped animals exceeded 0.8, with a mean r2 of 0.1, with heritability of 0.5 [32]. The mean of r2 was almost 0.1 when 10,000 SNPs were used in the current study (Table 1), and the level of heritability estimated using all available SNPs was more than 0.5 for both the traits (Tables 2 and 3). Therefore, using 10,000 equally-spaced SNPs, which is relatively few compared with all available SNPs in the 50K chip, might be sufficient to cover both of the carcass traits, even in validation and application populations. Moreover, as far as genetic evaluation for ranking animals is concerned, the current results might suggest a possibility of using 4,000–6,000 equally-spaced SNPs for these carcass traits in the Japanese Black population in Japan, since the downward bias in GEBV values observed in this study with lower densities of SNPs would not substantially influence the ranking of animals. Such lower density panels could be used practically in pre-selection, especially of young breeding females whose number in the population is definitely high. This could be beneficial, even with the current degree of accuracy, in dramatically reducing the total cost of the genetic evaluation, since carcass traits are usually measured only on their relatives. If necessary, the imputation of SNP genotypes from the lower density panels to higher density panels, as indicated in [33], could help to achieve an additional increase in the accuracy of GE. On the other hand, young breeding bulls to be selected as future elite AI sires should be genotyped with a high-density panel for more reliable GE and GS, since the contribution of elite AI sires to genetic improvement is significant.

There are several reports on ways of choosing unequally-spaced SNPs, as well as equally-spaced SNPs as a subset, particularly in a relatively low-density panel, and on the utility of low-density marker panels (e.g., [3436]). Of these ways, choosing SNPs ranked highly in the magnitude of the absolute value of estimated effect is typical. In most cases, prediction of GEBVs with high-ranking SNPs is somewhat more accurate and reliable than with equally-spaced SNPs, when the same number of SNPs is used in the prediction (e.g., [35, 36]). For Japanese Black cattle, only one previous study, conducted from the viewpoint of GE, performed the estimation of variance for carcass traits [37]. This study used 50K SNP genotype data from 673 steers to simply perform linear regression analysis of each SNP for each trait, and subsets of SNPs with various significance levels for the association with each trait were used to account for variances. Including this study, however, use of SNPs ranked highly based on certain criteria would generally be applicable only to a particular trait and population. One approach is to integrate the optimal subset of the SNPs for each of several important traits into one set, which is as cheap as possible to use in the target population, as ordinary selection is often implemented for certain multiple traits, although this strategy still requires the re-selection of SNPs with process of generation. While use of an equally-spaced SNP panel deals with all the genome regions, according to density, a trait-specific panel would frequently deal with only parts of the genome. Thus, a compromise plan, as suggested by [34], might be practical, in which a large part of the whole SNP set is composed of equally-spaced SNPs, and SNPs that are in high LD with the causative variants are also included. An example of the latter SNPs for carcass weight in Japanese Black cattle is those linked tightly with CW-1, -2 and -3, found by [2225]. In addition, since pedigree data are important information irrespective of traits [35], if deep and wide pedigree data can be combined with a SNP set, as mentioned above, more effective GE and GS might be possible. More studies of sophisticated approaches to construct an optimal SNP set for valid and cost-effective GE of carcass traits in beef cattle are required.

In this study, we employed a scheme of equally-spaced selection of SNPs to investigate carcass weight and marbling score, which are representative traits for carcass quantity and meat quality, respectively. We have provided important basic information on the relationships between SNP marker density and genetic variance explained and accuracy and bias of GEBVs obtained. However, as the size of the dataset available in this study was limited, all of the available animals were used in the estimation analyses. In the analyses, the number of animals available (about 900) was well exceeded by the number of SNPs in most settings of SNP selection and use. Thus, we note that the genetic variance explained and the accuracy of GEBVs obtained in the current study may be somewhat inflated, relative to those values obtained using many independent validation animals. Therefore, further research is needed to confirm the current findings, especially from the perspective of prediction, and accumulating a much larger volume of relevant data.


To our knowledge, this is the first study reporting the level of whole genome LD in Japanese Black cattle, using about 40,000 genome-wide SNPs, as well as the effects of their equally-spaced subsets on the elements of the G matrix, the degree of genetic variance explained and accuracy of GEBVs for carcass weight and marbling score, which are representative traits for carcass quantity and meat quality, respectively. Our study revealed that the Japanese Black is likely to be in a breed group with a relatively high extent of whole genome LD, and that the degree of marbling is controlled by only QTLs with relatively small effects, compared with carcass weight. The possibility of effective GE with at least 4,000 equally-spaced SNPs was suggested for these traits.


Ethics statement

Animal care and use was according to the protocol approved by the Shirakawa Institute of Animal Genetics Animal Care and Use Committee, Nishigo, Japan (ACUCH21-1).

Phenotype data

Cold carcass weights and marbling scores of 872 Japanese Black fattened steers, whose ages ranged between 15.3–43.0 months, were used for the current analyses. These records were collected from 2000–2009 at two large meat markets in Japan, namely Tokyo Metropolitan Central Wholesale Market and Osaka Municipal South Port Wholesale Market. Marbling score is the degree of marbling, ranging from null (1) to very abundant (12), assessed on the ribeye of the carcass dissected at the sixth and the seventh rib section, according to the Japan carcass grading standards [38]. The distributions of carcass weight and marbling score are shown in Figure 6. The mean (± standard deviation) was 496.6 (± 48.0) kg for carcass weight and 6.8 (± 3.5) for marbling score.

Figure 6
figure 6

Distribution of carcass weights (a) and marbling scores (b).

Genotype data

DNA samples were extracted from perirenal adipose tissues. Sample DNA was quantified and genotyped using the BovineSNP50 BeadChip (hereafter referred to as the 50K chip). The 50K assay contains 54,001 SNPs with an average probe spacing of 51.5 kb and a median spacing of 37.3 kb. A total of 38,502 SNPs were included in the statistical analyses, based on the following criteria: MAF and genotype call rates were larger than 0.01 and 0.95, respectively, were in Hardy-Weinberg equilibrium (p > 0.001) and had position information. As a few percent of genotype data was missing, missing genotype filling was conducted using “Beagle 3.3.2” package [39].

Statistical analyses

Data were analysed using a following linear model (denoted as model 1):

y = Xb + g + e

where y is the vector of records, b is the vector of fixed discrete effects of market and year at slaughter and the continuous effects of the linear and quadratic covariates of month of age at slaughter, g is the vector of additive genetic effects being assumed to follow N 0 , G σ g 2 with the genetic (polygenic) variance and the genomic relationship matrix represented by σ g 2 and G, respectively, e is the vector of residuals assumed to follow N 0 , I σ e 2 with the residual variance and the identity matrix denoted by σ g 2 and I, respectively, and X is incidence matrix.

Using the SNP genotype data, the G matrix was constructed according to [3] by:

G = M 2 P M 2 P / 2 i n p i 1 p i

where M is the matrix whose row elements include the number of minor alleles in each animal at each SNP locus, P is the matrix whose row elements contain the MAF at each SNP locus, p i is the MAF at the ith SNP locus, and n is the number of SNPs used.

In this study, 12 different G matrices were constructed and employed by selecting from 100 to 30,000 equally-spaced SNPs in number or using all available SNPs. To make the G matrices always positive definite, 10-4 I was added to G in construction. We note that pedigree information for the animals, consequently the A matrix, was not available in this study.

For each of the 12 sets of SNPs, including the set of all available SNPs, the extent of LD was measured by the squared correlation (r2) of the alleles at two loci for all pairs of two adjacent SNPs on all chromosomes [40]. The mean and standard deviation of the distance (d) between two adjacent SNPs were also calculated. In addition, correlations between the diagonal, upper triangular, and all the elements of a given G matrix, and the corresponding elements of the matrix constructed using all available SNPs, were examined (denoted as r D , r N and r A , respectively).

To assess the relationships between GEBVs g ^ in each model, correlations were computed between GEBVs incorporating the G matrix constructed using all available SNPs, and those incorporating the G matrix using a given number of SNPs. Also, linear regressions were fit, where the dependent variables were GEBVs incorporating the G matrix constructed using a given number of SNPs, and the independent variable was GEBVs obtained using all available SNPs.

Additionally, choosing the two lower-density subsets, or those of 4,000 and 10,000 SNPs, we attempted to carry out the genotype imputation with “Beagle 3.3.2” from those to all the 38,502 SNPs, in which as a reference, phased haplotype data of 494 animals not having records of both the traits whose data were collected at the same two markets as the 872 animals. Then, using the imputed data, the analyses with model 1 were also conducted.

Furthermore, the distribution of marbling score was obviously far from a normal distribution, as shown in Figure 6. Then, for this trait, a threshold model (model 2) was also fit, as follows:

η = Xb + g + e

where η is the vector of unobserved variables in the underlying scale, assuming that σ e 2 = 1 . Two sorts of analysis were conducted regarding the outward phenotype as either a binary trait in which the observed scores 2–6 and 7–12 were each classified into one class, or an ordered categorical trait using actually observed scores.

All the parameters in model 1 were estimated via the Bayesian framework using Gibbs sampling in “BLR” package under R environment [41, 42]. A flat prior distribution was used for the nuisance parameters (b), and multivariate normal distributions were employed as priors for the additive genetic effects. As prior distributions for σ g 2 and σ e 2 , independent scaled inverted chi-square distributions were used with degree of belief and scale parameters of −2 and 0, respectively, assuming that there was no prior information. The “BGLR” package, or an improved version of the BLR software [43], was used to estimate the parameters in model 2. A single chain of 110,000 samples was run, and the first 10,000 samples were discarded as burn-in. Posterior summaries, or mean and standard deviation here, were computed using a thinning rate of 10.



Quantitative trait loci


Genomic evaluation


Genomic selection


Single nucleotide polymorphisms


Linkage disequilibrium


Best linear unbiased prediction

G matrix:

Genomic relationship matrix

A matrix:

Additive relationship matrix


Genomic estimated breeding value


Minor allele frequencies


Artificial insemination


Genome-wide association study.


  1. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157: 1819-1829.

    PubMed  CAS  PubMed Central  Google Scholar 

  2. Meuwissen THE, Hayes BJ, Goddard ME: Accelerating improvement of livestock with genomic selection. Annu Rev Anim Biosci. 2013, 1: 221-237. 10.1146/annurev-animal-031412-103705.

    Article  PubMed  Google Scholar 

  3. VanRaden PM: Efficient methods to compute genomic prediction. J Dairy Sci. 2008, 91: 4414-4423. 10.3168/jds.2007-0980.

    Article  PubMed  CAS  Google Scholar 

  4. Henderson CR: Application of Linear Models in Animal Breeding. 1984, Guelph, Canada: University of Guelph

    Google Scholar 

  5. Habier D, Fernando RL, Dekkers JCM: Genomic selection using low-density marker panels. Genetics. 2009, 182: 343-353. 10.1534/genetics.108.100289.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  6. Namikawa K: Wagyu: Japanese Beef Cattle - Historical Breeding Processes of Japanese Beef Cattle and Preservation of Genetic Resources as Economic Farm Animal. 1992, Kyoto, Japan: Wagyu Registry Association

    Google Scholar 

  7. Wagyu Registry Association: Breeding and Improvement of Wagyu. 2007, Kyoto, Japan: Wagyu Registry Association, in Japanese, 2

    Google Scholar 

  8. Nomura T, Honda T, Mukai F: Inbreeding and effective population size of Japanese Black cattle. J Anim Sci. 2001, 79: 366-370.

    PubMed  CAS  Google Scholar 

  9. Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME: Invited review: genomic selection in dairy cattle: progress and challenge. J Dairy Sci. 2009, 92: 433-443. 10.3168/jds.2008-1646.

    Article  PubMed  CAS  Google Scholar 

  10. McKay SD, Schnabel RD, Murdoch BM, Matukumalli LK, Aerts J, Coppieters W, Crews D, Neto ED, Gill CA, Gao C, Mannen H, Stothard P, Wang Z, Van Tassel CP, Williams JL, Taylor JF, Moore SS: Whole genome linkage disequilibrium maps in cattle. BMC Genet. 2007, 8: 74-

    Article  PubMed  PubMed Central  Google Scholar 

  11. De Roos APW, Hayes BJ, Spelman RJ, Goddard ME: Linkage disequilibrium and persistence of phase in Holstein-Friesian, Jersey and Angus cattle. Genetics. 2008, 179: 1503-1512. 10.1534/genetics.107.084301.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  12. Silva CR, Neves HHR, Queiroz SA, Sena JAD, Pimentel ECG: Extent of linkage disequilibrium in Brazilian Gyr dairy cattle based on genotypes of AI sires for dense SNP markers. Proceedings of the 9th World Congress on Genetics Applied to Livestock Production, August 2010. Edited by: German Society for Animal Science. 2010, Leipzig, 1-29.

    Google Scholar 

  13. Villa-Angulo R, Matukumalli LK, Gill CA, Choi J, Van Tassell CP, Grefenstette JJ: High-resolution haplotype block structure in the cattle genome. BMC Genet. 2009, 10: 19-

    Article  PubMed  PubMed Central  Google Scholar 

  14. Lu D, Sargolzaei M, Kelly M, Li C, Voort GV, Wang Z, Plastow G, Moore S, Miller SP: Linkage disequilibrium in Angus, Charolais, and Crossbred beef cattle. Front Genet. 2012, 3: 152-

    Article  PubMed  PubMed Central  Google Scholar 

  15. Espigolan R, Baldi F, Boligon AA, Souza FRP, Gordo DGM, Tonussi RL, Cardoso DF, Oliveira HN, Tonhati H, Sargolzaei M, Schenkel FS, Carvalheiro R, Ferro JA, Albuquerque LG: Study of whole genome linkage disequilibrium in Nellore cattle. BMC Genomics. 2013, 14: 305-10.1186/1471-2164-14-305.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  16. Yan J, Shah T, Warburton ML, Buckler ES, McMullen MD, Crouch J: Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS One. 2009, 4: e8451-10.1371/journal.pone.0008451.

    Article  PubMed  PubMed Central  Google Scholar 

  17. Rolf MM, Taylor JF, Schnabel RD, McKay SD, McClure MC, Northcutt SL, Kerley MS, Weaber RL: Impact of reduced marker set estimation of genomic relationship matrices on genomic selection for feed efficiency in Angus cattle. BMC Genet. 2010, 11: 24-

    Article  PubMed  PubMed Central  Google Scholar 

  18. Oyama K: Genetic variability of Wagyu cattle estimated by statistical approaches. Anim Sci J. 2011, 82: 367-373. 10.1111/j.1740-0929.2011.00895.x.

    Article  PubMed  Google Scholar 

  19. Yang J, Manolio TA, Pasquale LR, Boerwinkle E, Caporaso N, Cunningham JM, de Andrade M, Feenstra B, Feingold E, Hayes MG, Hill WG, Landi MT, Alonso A, Lettre G, Lin P, Ling H, Lowe W, Mathias RA, Melbye M, Pugh E, Cornelis MC, Weir BS, Goddard ME, Visscher PM: Genome partitioning of genetic variation for complex traits using common SNPs. Nat Genet. 2011, 45: 519-525.

    Article  Google Scholar 

  20. Visscher PM, Hill WG, Wray NR: Heritability in the genomics era-concepts and misconceptions. Nat Rev Genet. 2008, 9: 255-266.

    Article  PubMed  CAS  Google Scholar 

  21. Takahata N: Allelic genealogy and human evolution. Mol Biol Evol. 1993, 10: 2-22.

    PubMed  CAS  Google Scholar 

  22. Mizoshita K, Watanabe T, Hayashi H, Kubota C, Yamakuchi H, Todoroki J, Sugimoto Y: Quantitative trait loci analysis for growth and carcass traits in a half-sib family of purebred Japanese Black (Wagyu) cattle. J Anim Sci. 2004, 82: 3415-3420.

    PubMed  CAS  Google Scholar 

  23. Takasuga A, Watanabe T, Mizoguchi Y, Hirano T, Ihara N, Takano A, Yokouchi K, Fujikawa A, Chiba K, Kobayashi N, Tatsuda K, Oe T, Furukawa-Kuroiwa M, Nishimura-Abe A, Fujita T, Inoue K, Mizoshita K, Ogino A, Sugimoto Y: Identification of bovine QTL for growth and carcass traits in Japanese Black cattle by replication and identical-by-descent mapping. Mamm Genome. 2007, 18: 125-136. 10.1007/s00335-006-0096-5.

    Article  PubMed  Google Scholar 

  24. Setoguchi K, Furuta M, Hirano T, Nagao T, Watanabe T, Sugimoto Y, Takasuga A: Cross-breed comparisons identified a critical 591-kb region for bovine carcass weight QTL (CW-2) on chromosome 6 and the Ile-442-Met substitution in NCAPG as a positional candidate. BMC Genet. 2009, 10: 43-

    Article  PubMed  PubMed Central  Google Scholar 

  25. Nishimura S, Watanabe T, Mizoshita K, Tatsuda K, Fujita T, Watanabe N, Sugimoto Y, Takasuga A: Genome-wide association study identified three major QTL for carcass weight including the PLAG1-CHCHD7 QTN for stature in Japanese Black cattle. BMC Genet. 2012, 13: 40-

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  26. Khatkar MS, Moser G, Hayes BJ, Raadsma HW: Strategies and utility of imputed SNP genotypes for genomic analysis in dairy cattle. BMC Genomics. 2012, 13: 538-10.1186/1471-2164-13-538.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  27. Hoeschele I, Tier B: Estimation of variance components of threshold characters by marginal posterior modes and means via Gibbs sampling. Genet Sel Evol. 1995, 27: 519-540. 10.1186/1297-9686-27-6-519.

    Article  PubMed Central  Google Scholar 

  28. Harris BL, Johnson DL: The impact of high density SNP chips on genomic evaluation in dairy cattle. Proceedings of the 2010 Interbull Meeting, June 2010. 2010, Riga Latvia: Interbull Bulletin 42, 40-43.

    Google Scholar 

  29. Schulz-Streeck T, Ogutu JO, Piepho H-P: Pre-selection of markers for genomic selection. BMC Proc. 2011, 5 (Suppl 3): S12-10.1186/1753-6561-5-S3-S12.

    Article  PubMed  PubMed Central  Google Scholar 

  30. Wiggans GR, Cooper TA, VanRaden PM, Olson KM, Tooker ME: Use of the Illumina Bovine3K BeadChip in dairy genomic evaluation. J Daily Sci. 2011, 95: 1552-1558.

    Article  Google Scholar 

  31. Moser G, Khatkar MS, Hayes BJ, Raadsma HW: Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers. Genet Sel Evol. 2010, 42: 37-10.1186/1297-9686-42-37.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Calus MPL, Meuwissen THE, de Roos APW, Veerkamp RF: Accuracy of genomic selection using different methods to define haplotypes. Genetics. 2008, 178: 553-561. 10.1534/genetics.107.080838.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  33. Boichard D, Chung H, Dassonneville R, David X, Eggen A, Fritz S, Gietzen KJ, Hayes BJ, Lawley CT, Sonstegard TS, Van Tassell CP, VanRaden PM, Viaud-Martinez KA, Wiggans GR: Design of a bovine low-density SNP array optimized for imputation. PLoS One. 2012, 7: e34130-10.1371/journal.pone.0034130.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  34. Weigel KA, de los Campos G, González-Recio O, Naya H, Wu XL, Long N, Rosa GJM, Gianola D: Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers. J Daily Sci. 2009, 92: 5248-5257. 10.3168/jds.2009-2092.

    Article  CAS  Google Scholar 

  35. Vazquez AI, Rosa GJM, Weigel KA, de los Campos G, Gianola D, Allison DB: Predictive ability of subsets of single nucleotide polymorphisms with and without parent average in US Holsteins. J Dairy Sci. 2010, 93: 5942-5949. 10.3168/jds.2010-3335.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  36. Zhang Z, Ding X, Liu J, Zhang Q, de Koning DJ: Accuracy of genomic prediction using low-density marker panels. J Daily Sci. 2011, 94: 3642-3650. 10.3168/jds.2010-3917.

    Article  CAS  Google Scholar 

  37. Watanabe T, Matsuda H, Arakawa A, Yamada T, Iwaisaki H, Nishimura S, Sugimoto Y: Estimation of variance components for carcass traits in Japanese Black cattle using 50K SNP genotype data. Anim Sci J. 2014, 85: 1-7. 10.1111/asj.12074.

    Article  PubMed  CAS  Google Scholar 

  38. Japan Meat Grading Association (JMCA): New Beef Carcass Grading Standards. 1988, Tokyo, Japan: JMCS

    Google Scholar 

  39. Browning BL, Browning SR: A fast, powerful method for detecting identity by descent. Am J Hum Genet. 2011, 88: 173-182. 10.1016/j.ajhg.2011.01.010.

    Article  PubMed  CAS  PubMed Central  Google Scholar 

  40. Hill WG, Robertson A: Linkage disequilibrium in finite populations. Theor Appl Genet. 1968, 38: 226-231. 10.1007/BF01245622.

    Article  PubMed  CAS  Google Scholar 

  41. Naya H, Gianola D, Crossa J, Legarra A, Manfredi E, Weigel K, Cotes JM: Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics. 2009, 182: 375-385. 10.1534/genetics.109.101501.

    Article  PubMed  PubMed Central  Google Scholar 

  42. R Development Core Team: R: A Language and Environment for Statistical Computing. 2011, Vienna, Austria, [ISBN 3-900051-07-0] []

    Google Scholar 

  43. De los Campos G, Pataki A, Pérez P: The BGLR (Bayesian Generalized Linear Regression) R-Package. []

Download references


The authors thank the staff of the Shirakawa Institute of Animal Genetics for technical assistance. The authors would like to thank the two anonymous reviewers for their helpful comments. The work was partly supported by the Ministry of Agriculture, Forestry, and Fishery, Japan, and by the Japan Racing and Livestock Promotion Foundation (H20-5).

Author information

Authors and Affiliations


Corresponding author

Correspondence to Shinichiro Ogawa.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

HI, HM and SO conceived of the study. SO carried out the analyses and drafted the manuscript. HI and HM contributed in writing and improving the manuscript. YT, TW, SN and YS participated in the design of the study. All authors read and approved the final manuscript.

Authors’ original submitted files for images

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

Reprints and permissions

About this article

Cite this article

Ogawa, S., Matsuda, H., Taniguchi, Y. et al. Effects of single nucleotide polymorphism marker density on degree of genetic variance explained and genomic evaluation for carcass traits in Japanese Black beef cattle. BMC Genet 15, 15 (2014).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: