Determination of genetic associations between indels in 11 candidate genes and milk composition traits in Chinese Holstein population



We previously identified 11 promising candidate genes for milk composition traits by resequencing the whole genomes of 8 Holstein bulls with extremely high and low estimated breeding values for milk protein and fat percentages (high and low groups). These genes, FCGR2B, CENPE, RETSAT, ACSBG2, NFKB2, TBC1D1, NLK, MAP3K1, SLC30A2, ANGPT1 and UGDH, contained 25 indels that differed between the high and low groups. The purpose of this study was to examine whether these candidates have significant genetic effects on milk protein and fat traits.


Thirteen of the indels identified by whole genome resequencing were verified by PCR product sequencing and successfully genotyped. In an association analysis of 769 Chinese Holstein cows, the indel in FCGR2B was significantly associated with milk yield, protein yield and protein percentage (P = 0.0041 to 0.0297); five indels in CENPE and one indel in MAP3K1 were strongly associated with milk yield, fat yield and protein yield (P < 0.0001 to P = 0.0073); the polymorphism in RETSAT was associated with milk yield, fat yield, protein yield and protein percentage (P = 0.0001 to 0.0237); the variant in ACSBG2 was associated with fat yield and protein percentage (P = 0.0088 and 0.0052); and one indel in TBC1D1 was associated with fat percentage and protein percentage (P = 0.0224 and 0.0209). Significant associations were also found between indels in NLK and protein yield and protein percentage (P = 0.0012 to 0.0257), and the variant in UGDH was associated with milk yield (P = 0.0312). The two exonic indels in FCGR2B and CENPE were predicted to change the mRNA and protein secondary structures, which may impair the function of the corresponding proteins.


Our findings presented here provide the first evidence for the associations of eight functional genes with milk yield and composition traits in dairy cattle.


In dairy cattle, milk yield and milk composition traits are the most important economic traits and are controlled by numerous environmental factors and genes [1,2,3,4]. Over the past decades, unraveling the major genes and causal mutations with large effects on milk yield and composition traits has been an important research field. Quantitative trait locus (QTL) mapping and genome-wide association studies (GWAS) have been widely applied to identify QTLs, candidate genes and mutations affecting milk production traits in dairy cattle [5,6,7,8], and a large number of QTLs and genetic associations have been detected using these two approaches. In recent years, short insertions and deletions (indels), the second main form of genomic variation, have received increasing attention and have contributed greatly to investigations of genetic and phenotypic diversity in humans, chickens, pigs and dairy cattle [9,10,11,12,13]. In humans, a previous study found that 2–18 base pair (bp) indels located upstream of TAL bHLH transcription factor 1 (TAL1) were responsible for T-cell acute lymphoblastic leukemia (T-ALL) [14]. In chickens, 9–15 bp indels in the premelanosome protein (PMEL17) gene were confirmed to be the causative mutations for plumage color (Dominant white, Dun and Smoky) [10]. In pigs, an intronic retrotransposon insertion in sperm flagellar 2 (SPEF2) led to the immotile short-tail sperm defect [15]. In Belgian Blue cattle, an 11-bp indel in the myostatin (MSTN) gene resulted in the double-muscled phenotype [9], and an exonic 15-bp insertion in the coagulation factor XI (F11) gene caused factor XI deficiency in Japanese Black cattle [11]. However, few studies of indel polymorphisms associated with milk production traits in dairy cattle have been reported to date [16].

With the rapid emergence of next-generation sequencing (NGS), whole genome resequencing has become an important tool for detecting polymorphisms that contribute to complex or economic traits in humans and domestic animals [17,18,19,20]. In our previous whole genome resequencing study, we identified over 0.9 million short indels and 3625 common differential indels with the same allelic distribution directions in 8 Holstein bulls with extremely high or low estimated breeding values (EBVs) for milk protein and fat percentages (high and low groups) [21]. On this basis, 11 genes containing 25 differential indels were identified as promising candidates affecting milk composition traits in dairy cattle: FCGR2B, CENPE, RETSAT, ACSBG2, NFKB2, TBC1D1, NLK, MAP3K1, SLC30A2, ANGPT1 and UGDH [21]. Thus, the aim of this study was to validate whether the identified indels in these 11 genes significantly affect milk yield and composition traits in the Chinese Holstein population.


Indel verification and genotyping

Using PCR product sequencing of two DNA pools from 40 Holstein sires, 22 of the 25 indels identified by whole genome resequencing [21] were confirmed as true variants (Additional file 2); four of them were identified for the first time (Table 1). Subsequently, 13 indels in 8 genes were successfully genotyped and used for association analysis. Of these 13 indels, two (rs381714237 in FCGR2B and ss2137349053 in CENPE) were located in exons, whereas the remaining 11 were located in introns. A chi-squared test showed that all 13 indels were in Hardy-Weinberg equilibrium (P > 0.05). The genotype and allele frequencies of the 13 indels are summarized in Table 2.
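
The Hardy-Weinberg check reported here can be illustrated with a minimal Python sketch. It assumes a one-degree-of-freedom chi-squared goodness-of-fit test on the genotype counts of a biallelic indel; the function name `hwe_chi_square` and the example counts are hypothetical and are not taken from Table 2.

```python
import math


def hwe_chi_square(n_ins_ins, n_ins_del, n_del_del):
    """Chi-squared test of Hardy-Weinberg equilibrium for one biallelic indel.

    Returns (chi2, p_value). The p-value uses the closed form of the
    chi-squared survival function with 1 degree of freedom.
    """
    n = n_ins_ins + n_ins_del + n_del_del
    if n == 0:
        raise ValueError("at least one genotyped animal is required")
    # Allele frequency of the ins allele, counted from the genotypes.
    p = (2 * n_ins_ins + n_ins_del) / (2 * n)
    q = 1.0 - p
    observed = (n_ins_ins, n_ins_del, n_del_del)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (monomorphic loci).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only.
    chi2, p_value = hwe_chi_square(210, 380, 179)
    print(f"chi2 = {chi2:.4f}, P = {p_value:.4f}")
```

A locus would be treated as consistent with Hardy-Weinberg equilibrium when the returned P value exceeds 0.05, which is the threshold used in the text.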

Table 1 Detailed information of 24 indels of 11 genes identified in Chinese Holstein cattle
Table 2 The genotypic and allelic frequencies of 13 indels of 8 genes

Associations between indels and five milk production traits

The associations between the 13 indels and the five milk production traits are shown in Table 3. All of these indels were significantly associated with at least one of the milk traits (P < 0.0001 to P = 0.0312), as described below.

Table 3 Association results of the thirteen indels in eight genes on the five milk production traits (least squares mean ± SE)

Exonic indels

The exonic indel rs381714237 in FCGR2B was associated with milk yield (P = 0.0297), protein yield (P = 0.0041) and protein percentage (P = 0.0198). The other exonic indel, ss2137349053 in CENPE, was strongly associated with milk yield (P < 0.0001), fat yield (P < 0.0001) and protein yield (P < 0.0001).

Intronic indels

The four intronic indels (rs385060942, ss2137349051, rs453960300 and rs378415122) in CENPE were significantly associated with milk yield (P < 0.0001), fat yield (P = 0.0004 to 0.0073) and protein yield (P < 0.0001 to P = 0.0002). Additionally, the five indels of CENPE (these four intronic indels and the exonic indel above) were in strong linkage disequilibrium (r² > 0.98), and one haplotype block was inferred, as presented in Fig. 1. Haplotype-based association analysis showed that the haplotype combinations were also significantly associated with milk yield, fat yield and protein yield (P < 0.0001 to P = 0.0076) (Table 4).

Fig. 1 Linkage disequilibrium estimated for the five indels in the CENPE gene. The values in boxes are pairwise indel correlations (r²)

Table 4 Haplotype analysis of CENPE gene (least squares mean ± SE)

Indel rs134985825 in intron 6 of RETSAT showed significant effects on milk yield, protein yield, fat yield and protein percentage (P = 0.0001 to 0.0237). For ACSBG2, indel rs377943075 in intron 7 was significantly associated with fat yield (P = 0.0088) and protein percentage (P = 0.0052). Variant rs136639319 in intron 3 of TBC1D1 was significantly associated with fat percentage (P = 0.0224) and protein percentage (P = 0.0209).

In NLK, indel rs379188781 in intron 1 was associated with protein percentage (P = 0.0047), and indel rs134444531 in intron 3 was associated with protein yield (P = 0.0012) and protein percentage (P = 0.0257). Only weak LD was observed between these two indels (r² = 0.14).

For MAP3K1, indel ss2137349058 in intron 16 was strongly associated with milk yield, fat yield and protein yield (P < 0.0001).

For UGDH, indel ss2019489562 in intron 2 was significantly associated with milk yield (P = 0.0312).

Significant additive, dominance and allele substitution effects of the 13 indels on the five milk traits were also observed (Table 5).

Table 5 Genetic effects of thirteen indels in eight genes on five milk production traits

Prediction of mRNA and protein structures

Using a statistical folding algorithm, we observed that the two exonic indels in FCGR2B and CENPE altered the most stable predicted mRNA secondary structures between the ins/ins and del/del genotypes. As illustrated in Fig. 2, obvious structural differences spanning positions 971–980 were observed between the ins/ins and del/del genotypes of indel rs381714237 in FCGR2B. The free energy (∆G) of the ins allele was predicted to be higher (∆G = −468.70 kcal/mol) than that of the del allele (∆G = −470.30 kcal/mol). Correspondingly, the ins allele was predicted to form one larger single loop structure, potentially decreasing the stability of the mRNA (∆∆G = +1.6 kcal/mol). Previous studies have shown that ∆∆G values ranging from −3.9 kcal/mol to +4.0 kcal/mol can affect mRNA stability [22,23,24,25,26,27,28]. In addition, indel rs381714237 in FCGR2B was predicted to decrease the number of amino acids by 38, which might change the protein structure and function. Accordingly, using the SOPMA program, differences in protein secondary structure were predicted between the FCGR2B proteins corresponding to the del and ins alleles with regard to alpha helix (21.64% vs. 16.45%), extended strand (23.10% vs. 24.67%), beta turn (7.02% vs. 7.89%) and random coil (48.25% vs. 50.99%).

Fig. 2 The predicted mRNA secondary structures corresponding to the exonic indel of the FCGR2B gene

For the non-frameshifting indel ss2137349053 in CENPE, a subtle change in mRNA secondary structure was predicted between the two homozygous genotypes (data not shown). The free energy changed from −1816.10 kcal/mol for the del allele to −1818.90 kcal/mol for the ins allele. Only slight differences were predicted for the CENPE protein between the del/del and ins/ins genotypes: alpha helix (72.57% vs. 72.69%) and random coil (13.94% vs. 13.82%). No change was predicted in extended strand or beta turn for the CENPE protein.
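
As a worked check of the arithmetic behind these free-energy values, assume that ∆∆G is defined as the free energy of the ins allele minus that of the del allele. This sign convention is inferred from the +1.6 kcal/mol reported for FCGR2B and is not stated explicitly in the text. The reported ∆G values then give:

$$ \Delta\Delta G_{\mathrm{FCGR2B}} = -468.70 - (-470.30) = +1.60\ \mathrm{kcal/mol} $$

$$ \Delta\Delta G_{\mathrm{CENPE}} = -1818.90 - (-1816.10) = -2.80\ \mathrm{kcal/mol} $$

Under this assumed convention, both values fall within the −3.9 to +4.0 kcal/mol range cited above for effects on mRNA stability.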


In the present work, we confirmed that 13 indels in 8 candidate genes for milk composition (FCGR2B, CENPE, RETSAT, ACSBG2, TBC1D1, NLK, MAP3K1 and UGDH), identified by our previous whole genome resequencing study [21], had significant genetic effects on at least one milk trait in dairy cattle. To our knowledge, this is the first report to connect these genes to milk production traits in dairy cattle.

Among the 25 differential indels with the same allelic distribution directions between the bulls in the high and low groups identified by our previous whole genome resequencing study [21], indel rs383700527 (3 N ins), located upstream of ACSBG2, was found to contribute to milk fat in a cis-regulatory manner (unpublished data). We therefore investigated the other 24 indels in the present study. Among them, one intronic indel (4 N ins in CENPE) could not be verified by Sanger sequencing because its flanking sequence has a low GC content and contains repetitive DNA. Two indels (3 N del in CENPE and 2 N ins in NFKB2) were not polymorphic in this study. Eight indels (1 N del and 21 N ins in CENPE, 2 N del in RETSAT, 1 N ins and 1 N del in NLK, 6 N del in SLC30A2, 2 N ins in ANGPT1 and 1 N ins in UGDH) could not be genotyped by MALDI-TOF MS. A possible reason is that multiplex genotyping by MALDI-TOF MS relies on multiplex PCR primers and extension primers to genotype multiple loci simultaneously [29], and primer design depends on the sequence composition, molecular weight, annealing temperature and reaction efficiency of each locus [29]. Hence, a total of 13 polymorphic indels were successfully genotyped and used for association analysis.

Significant associations between candidate genes and milk production traits

Six indels in FCGR2B and CENPE

For indel rs381714237 in FCGR2B, we showed that the ins/ins genotype had a higher protein percentage. FCGR2B acts as a regulator that contributes to the immune response [30]. Additionally, the bovine mammary gland is a product of the innate immune system and is active during lactation. Together, this evidence suggests that FCGR2B might affect milk protein percentage by influencing the immune response of cows during lactation.

For the five indels in CENPE, the association results revealed that the ins/ins genotypes were superior to the del/del genotypes for milk yield, fat yield and protein yield. A previous report found that CENPE acts as a monitor protein and is necessary for the cell cycle [31]. Thus, CENPE might affect these traits by modulating bovine mammary gland development.

Seven indels located in six genes

Our association analysis showed that the ins/ins genotype of indel rs134985825 in RETSAT was associated with higher milk yield, fat yield and protein yield. RETSAT is considered a regulator of liver metabolism and is critical for lipid accumulation and the promotion of adipogenesis [32]. Previous research showed that polymorphisms of RETSAT were associated with premium cut yields and backfat thickness in pigs [33]. Taken together, we speculate that RETSAT might affect milk traits by influencing lipid metabolism.

We found that individuals with the ins/ins genotype of indel rs377943075 in ACSBG2 showed higher fat yield than those with the del/del genotype. ACSBG2 encodes a member of the acyl-CoA synthetase family that participates in the PPAR signaling pathway and is involved in lipid metabolism and lipid droplet formation [34, 35]. Previous researchers found that polymorphisms of ACSBG2 had positive effects on yolk development and abdominal fat weight in chickens [36].

In the current study, our results also showed a significant relationship between indel rs136639319 in TBC1D1 and both fat percentage and protein percentage. TBC1D1, a member of the Rab GTPase-activating proteins (GAPs), is involved in the translocation of GLUT4 to the plasma membrane. Polymorphisms in TBC1D1 have been reported to have significant effects on severe obesity in humans [37] and on carcass traits in chickens [38], suggesting functions related to lipid and energy homeostasis, as reported by Hargett et al. [39, 40].

Two intronic indels (rs379188781 and rs134444531) in NLK showed strong associations with protein yield and protein percentage. Interestingly, Cole et al. reported that a single nucleotide polymorphism (SNP) (ARS-BFGL-NGS-106227) significantly associated with protein percentage (P = 5.59 × 10⁻⁸) was only 90 kb away from the NLK gene [41]. Furthermore, NLK, a member of the MAPK subfamily, has an essential role in mediating the mTORC1 signaling pathway, which is involved in milk protein synthesis [42, 43]. Together, these data suggest that variation in protein yield and protein percentage might be regulated by NLK.

As for MAP3K1, individuals with the del/del genotype of indel ss2137349058 had higher milk yield, fat yield and protein yield. MAP3K1 is known to be involved in the MAPK signaling pathway and is considered to be a metabolic stimulus inducing cell proliferation [44, 45]. It has also been proposed as a candidate gene for type 2 diabetes (T2D) through its interaction with the insulin signaling pathway [46]. Thus, we inferred that MAP3K1 might regulate milk composition traits by modulating bovine mammary gland development.

The intronic indel ss2019489562 in UGDH showed a significant effect on milk yield. UGDH encodes a protein implicated in the biosynthesis of glycosaminoglycans, hyaluronan, chondroitin sulfate and heparan sulfate. Previously, Xu et al. demonstrated that two exonic SNPs in UGDH showed significant associations with milk production traits in the Chinese Holstein population [47]. In particular, UGDH is close to the peak locations of two reported QTLs for fat yield, fat percentage and protein yield [48,49,50,51]. Further, two previously reported significant SNPs for fat yield, protein yield, fat percentage and protein percentage [41] are located near the UGDH gene. Moreover, expression data in InnateDB showed that UGDH has its highest expression in the liver, which plays an indispensable role in the metabolism of carbohydrates, fats and proteins in dairy cattle. Hence, these data suggest that UGDH might be a vital regulator of milk traits by affecting liver metabolism.


In the present study, we performed association analysis for 13 short indels within 8 candidate genes for milk composition identified by our previous whole genome resequencing study: FCGR2B, CENPE, RETSAT, ACSBG2, TBC1D1, NLK, MAP3K1 and UGDH. All 13 indels were shown to have significant genetic effects on at least one milk yield or composition trait. These results not only validate the candidate genes and indels from the previous whole genome resequencing work, but also provide novel molecular information for genetic improvement programs in dairy cattle.


Ethics statement

All the procedures for sample collections and phenotypic observations of experimental individuals were carried out along with regular quarantine inspection of the farms and in strict accordance with the protocol reviewed and approved by the Institutional Animal Care and Use Committee (IACUC) at China Agricultural University, and the permit number is DK996.


The animals used for association analysis comprised 769 Chinese Holstein cows that were daughters of 40 sire families. The cows were sampled from 22 herds of the Beijing Sanyuanlvhe Dairy Farming Center, a leading dairy company in China. Phenotypic data for the five milk production traits, 305-day milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP) and protein percentage (PP), were calculated from at least 6 test-day records in each lactation using a multiple-trait random regression test-day model by the Dairy Data Center of China, Dairy Association of China.

Genomic DNA was isolated from whole blood of cows and frozen semen of sires as previously described by Yang et al. [16].

Indel selection, PCR amplification, sequencing and genotyping

Of the 25 short indels identified by our previous whole-genome resequencing study, 24 were investigated for associations with the five milk production traits; the exception was a three-nucleotide insertion (3 N ins) in the ACSBG2 gene.

A total of 23 pairs of PCR primers were designed with the Primer Premier 5.0 and Oligo 7.0 software packages based on the genomic sequences of the 11 candidate genes in the Bos_taurus_UMD_3.1 assembly (Additional file 1). To identify the 24 potential indel polymorphisms, two DNA pools were constructed from the above 40 sires, with an equal concentration of 50 ng/μl for each bull (20 individuals/pool). PCR products based on the pooled DNA were purified with an EasyPure PCR Purification Kit (TransGen Biotech, Beijing, China) and then bi-directionally sequenced using an ABI3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA).

To further confirm the position and sequence of the insertions and deletions, the purified PCR products were cloned into the pClone007 vector with a pClone007 Vector Kit (TsingKe Biological Technology, Beijing, China). Positive clones containing the target indels were sequenced to confirm them. BLAST and Chromas 2.0 (Technelysium, Australia) were used to align the sequences to the reference sequence of the corresponding gene in the Bos_taurus_UMD_3.1 assembly. Finally, the identified indels were genotyped in the 769 Chinese Holstein cows using Sequenom MassArray matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS).

Bioinformatics analysis

To further explore the potential impact of the exonic indels in FCGR2B and CENPE on the mRNA secondary structures and on the secondary structures of the corresponding proteins, the online RNA Folding Form (version 2.3) software [52] and the SOPMA program [53] were used, respectively.

Association analysis

Allele and genotype frequencies were calculated for each indel, and deviation from Hardy-Weinberg equilibrium was tested with a chi-square test. Associations between the 13 investigated indels and the five milk production traits were analyzed using the MIXED procedure in SAS 9.2 [54] with the following linear mixed model:

$$ \mathrm{Y} = \mu + \mathrm{hys} + \mathrm{b} + \mathrm{M} + \mathrm{G} + \mathrm{a} + \mathrm{e} $$

where Y is the phenotypic record of the analyzed trait for each cow, μ is the overall mean of the trait, hys is the fixed effect of herd, year and season, b is the linear regression coefficient on calving month (M), M is the effect of calving month, G is the fixed effect of indel genotype or haplotype, a is the random polygenic effect accounting for all known pedigree relationships, and e is the random residual.

We also estimated the additive (a), dominance (d) and allele substitution (α) effects using the equations of Falconer & Mackay [55]: \( \mathrm{a} = \frac{\mathrm{AA} - \mathrm{BB}}{2},\quad \mathrm{d} = \mathrm{AB} - \frac{\mathrm{AA} + \mathrm{BB}}{2} \quad \text{and} \quad \alpha = \mathrm{a} + \mathrm{d}(\mathrm{q} - \mathrm{p}) \), where AA, BB and AB are the least squares means of the phenotypic values for the corresponding genotypes, and p and q are the frequencies of the corresponding alleles. Multiple t-tests with Bonferroni correction were used to compare the effects of the genotypes of each indel.
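
The three effect formulas above can be sketched in a few lines of Python. This is an illustration only: the function name `genetic_effects` and the example values are hypothetical, and the sketch assumes that p is the frequency of the allele carried by genotype AA and q that of the allele carried by BB, which the text does not state explicitly.

```python
def genetic_effects(mean_aa, mean_ab, mean_bb, p, q):
    """Additive (a), dominance (d) and allele substitution (alpha) effects.

    mean_aa, mean_ab, mean_bb: least squares means of the three genotypes.
    p, q: allele frequencies (assumed here: p for allele A, q for allele B).
    """
    if abs(p + q - 1.0) > 1e-9:
        raise ValueError("allele frequencies p and q must sum to 1")
    a = (mean_aa - mean_bb) / 2.0
    d = mean_ab - (mean_aa + mean_bb) / 2.0
    alpha = a + d * (q - p)
    return a, d, alpha


if __name__ == "__main__":
    # Hypothetical least squares means and allele frequencies.
    a, d, alpha = genetic_effects(10.0, 9.0, 6.0, p=0.75, q=0.25)
    print(f"a = {a}, d = {d}, alpha = {alpha}")
```

With equal allele frequencies the dominance term drops out of α, so the allele substitution effect equals the additive effect.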

The extent of linkage disequilibrium (LD) among the genotyped indels (five indels in the CENPE gene and two indels in the NLK gene) and the haplotype blocks were estimated using Haploview 4.2 (Broad Institute of MIT and Harvard, Cambridge, MA, USA).
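
To make the r² statistic concrete, the following Python sketch computes it for a pair of biallelic indels. It is a simplified illustration, not the Haploview procedure: it assumes the haplotypes are already phased, whereas Haploview works from unphased genotype data. The function name `ld_r_squared` and the toy haplotypes are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Pairwise r^2 between two biallelic loci from phased haplotypes.

    haplotypes: list of (allele_at_locus1, allele_at_locus2) tuples,
    with alleles coded 1 (ins) or 0 (del).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(h[0] for h in haplotypes) / n   # ins frequency at locus 1
    p_b = sum(h[1] for h in haplotypes) / n   # ins frequency at locus 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Toy data: the two loci always carry the same allele (complete LD).
    toy = [(1, 1)] * 5 + [(0, 0)] * 5
    print(ld_r_squared(toy))
```

Values close to 1, such as the r² > 0.98 reported for the CENPE indels, mean the loci are almost always inherited together, while a value such as 0.14 for the NLK indels indicates weak LD.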

Availability of data and materials

All relevant data are available within the article and its additional files.



ACSBG2: Acyl-CoA synthetase bubblegum family member 2

bp: Base pair

CENPE: Centromere protein E

EBV: Estimated breeding value

F11: Coagulation factor XI

FCGR2B: Fc fragment of IgG receptor IIb

FP: Fat percentage

FY: Fat yield

GWAS: Genome-wide association study

Indel: Insertion and deletion

MALDI-TOF MS: Matrix-assisted laser desorption/ionization time of flight mass spectrometry

MAP3K1: Mitogen-activated protein kinase kinase kinase 1

MY: Milk yield

NGS: Next-generation sequencing

NLK: Nemo like kinase

PMEL17: Premelanosome protein

PP: Protein percentage

PY: Protein yield

QTL: Quantitative trait locus

RETSAT: Retinol saturase

SNP: Single nucleotide polymorphism

SPEF2: Sperm flagellar 2

T2D: Type 2 diabetes

TAL1: TAL bHLH transcription factor 1

T-ALL: T-cell acute lymphoblastic leukemia

TBC1D1: TBC1 domain family member 1

UGDH: UDP-glucose 6-dehydrogenase


  1. Blott S, Kim JJ, Moisio S, Schmidt-Kuntzel A, Cornet A, Berzi P, Cambisano N, Ford C, Grisart B, Johnson D, et al. Molecular dissection of a quantitative trait locus: a phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics. 2003;163(1):253–66.

  2. Grisart B, Farnir F, Karim L, Cambisano N, Kim JJ, Kvasz A, Mni M, Simon P, Frere JM, Coppieters W, et al. Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proc Natl Acad Sci U S A. 2004;101(8):2398–403.

  3. Cohen-Zinder M, Seroussi E, Larkin DM, Loor JJ, Everts-van der Wind A, Lee JH, Drackley JK, Band MR, Hernandez AG, Shani M, et al. Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Genome Res. 2005;15(7):936–44.

  4. Yang SH, Bi XJ, Xie Y, Li C, Zhang SL, Zhang Q, Sun DX. Validation of PDE9A gene identified in GWAS showing strong association with Milk production traits in Chinese Holstein. Int J Mol Sci. 2015;16(11):26530–42.

  5. Georges M, Nielsen D, Mackinnon M, Mishra A, Okimoto R, Pasquino AT, Sargeant LS, Sorensen A, Steele MR, Zhao X, et al. Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics. 1995;139(2):907–20.

  6. Andersson L. Genome-wide association analysis in domestic animals: a powerful approach for genetic dissection of trait loci. Genetica. 2009;136(2):341–9.

  7. Schennink A, Bovenhuis H, Leon-Kloosterziel KM, van Arendonk JA, Visker MH. Effect of polymorphisms in the FASN, OLR1, PPARGC1A, PRL and STAT5A genes on bovine milk-fat composition. Anim Genet. 2009;40(6):909–16.

  8. Li C, Sun D, Zhang S, Wang S, Wu X, Zhang Q, Liu L, Li Y, Qiao L. Genome wide association study identifies 20 novel promising genes associated with milk fatty acid traits in Chinese Holstein. PLoS One. 2014;9(5):e96186.

  9. Grobet L, Martin LJ, Poncelet D, Pirottin D, Brouwers B, Riquet J, Schoeberlein A, Dunner S, Menissier F, Massabanda J, et al. A deletion in the bovine myostatin gene causes the double-muscled phenotype in cattle. Nat Genet. 1997;17(1):71–4.

  10. Kerje S, Sharma P, Gunnarsson U, Kim H, Bagchi S, Fredriksson R, Schutz K, Jensen P, von Heijne G, Okimoto R, et al. The dominant white, dun and smoky color variants in chicken are associated with insertion/deletion polymorphisms in the PMEL17 gene. Genetics. 2004;168(3):1507–18.

  11. Kunieda M, Tsuji T, Abbasi AR, Khalaj M, Ikeda M, Miyadera K, Ogawa H, Kunieda T. An insertion mutation of the bovine F11 gene is responsible for factor XI deficiency in Japanese black cattle. Mamm Genome. 2005;16(5):383–9.

  12. Montgomery SB, Goode DL, Kvikstad E, Albers CA, Zhang ZD, Mu XJ, Ananda G, Howie B, Karczewski KJ, Smith KS, et al. The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome Res. 2013;23(5):749–61.

  13. Catalan A, Glaser-Schmitt A, Argyridou E, Duchen P, Parsch J. An Indel polymorphism in the MtnA 3' untranslated region is associated with gene expression variation and local adaptation in Drosophila melanogaster. PLoS Genet. 2016;12(4):e1005987.

  14. Mansour MR, Abraham BJ, Anders L, Berezovskaya A, Gutierrez A, Durbin AD, Etchin J, Lawton L, Sallan SE, Silverman LB, et al. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science. 2014;346(6215):1373–7.

  15. Sironen A, Thomsen B, Andersson M, Ahola V, Vilkki J. An intronic insertion in KPL2 results in aberrant splicing and causes the immotile short-tail sperm defect in the pig. Proc Natl Acad Sci U S A. 2006;103(13):5006–11.

  16. Yang S, Li C, Xie Y, Cui X, Li X, Wei J, Zhang Y, Yu Y, Wang Y, Zhang S, et al. Detection of functional polymorphisms influencing the promoter activity of the SAA2 gene and their association with milk production traits in Chinese Holstein cows. Anim Genet. 2015;46(6):591–8.

  17. Xia Q, Guo Y, Zhang Z, Li D, Xuan Z, Li Z, Dai F, Li Y, Cheng D, Li R, et al. Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx). Science. 2009;326(5951):433–6.

  18. Koboldt DC, Steinberg KM, Larson DE, Wilson RK, Mardis ER. The next-generation sequencing revolution and its impact on genomics. Cell. 2013;155(1):27–38.

  19. Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 2014;46(8):858–65.

  20. Li MZ, Tian SL, Yeung CKL, Meng XH, Tang QZ, Niu LL, Wang X, Jin L, Ma JD, Long KR, et al. Whole-genome sequencing of Berkshire (European native pig) provides insights into its origin and domestication. Sci Rep. 2014;4.

  21. Jiang JP, Gao YH, Hou YL, Li WH, Zhang SL, Zhang Q, Sun DX. Whole-genome resequencing of Holstein bulls for Indel discovery and identification of genes associated with Milk composition traits in dairy cattle. PLoS One. 2016;11(12):e0168946.

  22. Shin JG, Yuan ZQ, Fordyce K, Sreeramoju P, Kent TS, Kim J, Wang V, Schneyer D, Weber TK. A del T poly T (8) mutation in the 3 ' untranslated region (UTR) of the CDK2-AP1 gene is functionally significant causing decreased mRNA stability resulting in decreased CDK2-AP1 expression in human microsatellite unstable (MSI) colorectal cancer (CRC). Surgery. 2007;142(2):222–7.

  23. Chatterjee R, Batra J, Ghosh B. A common Exonic variant of Interleukin21 confers susceptibility to atopic asthma. Int Arch Allergy Imm. 2009;148(2):137–46.

  24. Lechner J, Bae HA, Guduric-Fuchs J, Rice A, Govindarajan G, Siddiqui S, Farraj LA, Yip SP, Yap M, Das M, et al. Mutational analysis of MIR184 in sporadic keratoconus and myopia. Invest Ophth Vis Sci. 2013;54(8):5266–72.

  25. Pinsonneault JK, Frater JT, Kompa B, Mascarenhas R, Wang DX, Sadee W. Intronic SNP in ESR1 encoding human estrogen receptor alpha is associated with brain ESR1 mRNA isoform expression and behavioral traits. PLoS One. 2017;12(6):e0179020.

  26. Sun WY, Lan J, Chen L, Qiu JJ, Luo ZG, Li MZ, Wang JY, Zhao JG, Zhang TH, Long X, et al. A mutation in porcine pre-miR-15b alters the biogenesis of MiR-15b\16-1 cluster and strand selection of MiR-15b. PLoS One. 2017;12(5):e0178045.

  27. Li C, Grove ML, Yu B, Jones BC, Morrison A, Boerwinkle E, Liu XM. Genetic variants in microRNA genes and targets associated with cardiovascular disease risk factors in the African-American population. Hum Genet. 2018;137(1):85–94.

  28. Moya L, Lai J, Hoffman A, Srinivasan S, Panchadsaram J, Chambers S, Clements JA, Batra J, BioResour APC. Association analysis of a microsatellite repeat in the TRIB1 gene with prostate Cancer risk, aggressiveness and survival. Front Genet. 2018;9.

  29. Ross P, Hall L, Smirnov I, Haff L. High level multiplex genotyping by MALDI-TOF mass spectrometry. Nat Biotechnol. 1998;16(13):1347–51.

  30. Jonsson S, Sveinbjornsson G, de Lapuente Portilla AL, Swaminathan B, Plomp R, Dekkers G, Ajore R, Ali M, Bentlage AEH, Elmer E, et al. Identification of sequence variants influencing immunoglobulin levels. Nat Genet. 2017;49(8):1182–91.

  31. Orozco-Lucero E, Dufort I, Robert C, Sirard MA. Rapidly cleaving bovine two-cell embryos have better developmental potential and a distinctive mRNA pattern. Mol Reprod Dev. 2014;81(1):31–41.

  32. Schupp M, Lefterova MI, Janke J, Leitner K, Cristancho AG, Mullican SE, Qatanani M, Szwergold N, Steger DJ, Curtin JC, et al. Retinol saturase promotes adipogenesis and is downregulated in obesity. Proc Natl Acad Sci U S A. 2009;106(4):1105–10.

  33. Martinez-Montes AM, Fernandez A, Perez-Montarelo D, Alves E, Benitez RM, Nunez Y, Ovilo C, Ibanez-Escriche N, Folch JM, Fernandez AI. Using RNA-Seq SNP data to reveal potential causal mutations related to pig production traits and RNA editing. Anim Genet. 2017;48(2):151–65.

  34. Marszalek JR, Kitidis C, Dararutana A, Lodish HF. Acyl-CoA synthetase 2 overexpression enhances fatty acid internalization and neurite outgrowth. J Biol Chem. 2004;279(23):23882–91.

  35. Fujimoto Y, Onoduka J, Homma KJ, Yamaguchi S, Mori M, Higashi Y, Makita M, Kinoshita T, Noda J, Itabe H, et al. Long-chain fatty acids induce lipid droplet formation in a cultured human hepatocyte in a manner dependent of acyl-CoA synthetase. Biol Pharm Bull. 2006;29(11):2174–80.

  36. Sun CJ, Lu J, Yi GQ, Yuan JW, Duan ZY, Qu LJ, Xu GY, Wang KH, Yang N. Promising loci and genes for yolk and ovary weight in chickens revealed by a genome-wide association study. PLoS One. 2015;10(9):e0137145.

  37. Chen L, Chen QL, Xie BX, Quan C, Sheng Y, Zhu SS, Rong P, Zhou SL, Sakamoto K, MacKintosh C, et al. Disruption of the AMPK-TBC1D1 nexus increases lipogenic gene expression and causes obesity in mice via promoting IGF1 secretion. Proc Natl Acad Sci U S A. 2016;113(26):7219–24.

  38. Wang Y, Xu HY, Gilbert ER, Peng X, Zhao XL, Liu YP, Zhu Q. Detection of SNPs in the TBC1D1 gene and their association with carcass traits in chicken. Gene. 2014;547(2):288–94.

  39. Hargett SR, Walker NN, Hussain SS, Hoehn KL, Keller SR. Deletion of the Rab GAP Tbc1d1 modifies glucose, lipid, and energy homeostasis in mice. Am J Physiol-Endocrinol Metab. 2015;309(3):E233–45.

  40. Hargett SR, Walker NN, Keller SR. Rab GAPs AS160 and Tbc1d1 play nonredundant roles in the regulation of glucose and energy homeostasis in mice. Am J Physiol-Endocrinol Metab. 2016;310(4):E276–88.

  41. Cole JB, Wiggans GR, Ma L, Sonstegard TS, Lawlor TJ, Crooker BA, Van Tassell CP, Yang J, Wang SW, Matukumalli LK, et al. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary US Holstein cows. BMC Genomics. 2011;12.

  42. Bionaz M, Loor JJ. Gene networks driving bovine mammary protein synthesis during the lactation cycle. Bioinform Biol Insights. 2011;5:83–98.

  43. Yuan HX, Wang Z, Yu FX, Li F, Russell RC, Jewell JL, Guan KL. NLK phosphorylates raptor to mediate stress-induced mTORC1 inhibition. Genes Dev. 2015;29(22):2362–76.

  44. Glubb DM, Maranian MJ, Michailidou K, Pooley KA, Meyer KB, Kar S, Carlebur S, O'Reilly M, Betts JA, Hillman KM, et al. Fine-scale mapping of the 5q11.2 breast cancer locus reveals at least three independent risk variants regulating MAP3K1. Am J Hum Genet. 2015;96(1):5–20.

  45. Kuo SH, Yang SY, You SL, Lien HC, Lin CH, Lin PH, Huang CS. Polymorphisms of ESR1, UGT1A1, HCN1, MAP3K1 and CYP2B6 are associated with the prognosis of hormone receptor-positive early breast cancer. Oncotarget. 2017;8(13):20925–38.

  46. Tabassum R, Chauhan G, Dwivedi OP, Mahajan A, Jaiswal A, Kaur I, Bandesh K, Singh T, Mathai BJ, Pandey Y, et al. Genome-wide association study for type 2 diabetes in Indians identifies a new susceptibility locus at 2q21. Diabetes. 2013;62(3):977–86.

  47. Xu Q, Mei G, Sun DX, Zhang Q, Zhang Y, Yin CC, Chen HY, Ding XD, Liu JF. Detection of genetic association and functional polymorphisms of UGDH affecting milk production trait in Chinese Holstein cattle. BMC Genomics. 2012;13:590.

  48. Freyer G, Kuhn C, Weikard R, Zhang Q, Mayer M, Hoeschele I. Multiple QTL on chromosome six in dairy cattle affecting yield and content traits (vol 119, pg 60, 2002). J Anim Breed Genet. 2002;119(3):200.

  49. Szyda J, Liu Z, Reinhardt F, Reents R. Estimation of quantitative trait loci parameters for milk production traits in German Holstein dairy cattle population. J Dairy Sci. 2005;88(1):356–67.

  50. Chen HY, Zhang Q, Yin CC, Wang CK, Gong WJ, Mei G. Detection of quantitative trait loci affecting milk production traits on bovine chromosome 6 in a Chinese Holstein population by the daughter design. J Dairy Sci. 2006;89(2):782–90.

  51. Kucerova J, Lund MS, Sorensen P, Sahana G, Guldbrandtsen B, Nielsen VH, Thomsen B, Bendixen C. Multitrait quantitative trait loci mapping for milk production traits in Danish Holstein cattle. J Dairy Sci. 2006;89(6):2245–56.

  52. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31(13):3406–15.

  53. Geourjon C, Deleage G. SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Comput Appl Biosci. 1995;11(6):681–4.

  54. Kolbehdari D, Wang Z, Grant JR, Murdoch B, Prasad A, Xiu Z, Marques E, Stothard P, Moore SS. A whole genome scan to map QTL for milk production traits and somatic cell score in Canadian Holstein bulls. J Anim Breed Genet. 2009;126(3):216–27.

  55. Falconer DS, Mackay TFC. Introduction to quantitative genetics. 4th ed. New York: Longman Scientific and Technical; 1996.

We thank the Beijing Dairy Cattle Center for providing the phenotypic data on milk production traits.


This work was financially supported by the National Natural Science Foundation of China (31872330, 31802041), Beijing Dairy Industry Innovation Team (BAIC06–2018/2019), Beijing Science and Technology Program (D171100002417001), National Science and Technology Programs of China (2013AA102504), earmarked fund for Modern Agro-industry Technology Research System (CARS-36), and the Program for Changjiang Scholar and Innovation Research Team in University (IRT_15R62). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

DS conceived and designed the experiments. JJ and GY analyzed the data. JJ, SL and LW prepared the DNA samples for SNP identification and genotyping. JJ and DS prepared the manuscript. LY and LL provided the samples and participated in interpreting the results. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Dongxiao Sun.

Ethics declarations

Ethics approval and consent to participate

All protocols for sample collection from the experimental animals and for phenotypic observations were reviewed and approved by the Institutional Animal Care and Use Committee (IACUC) at China Agricultural University. Samples were collected specifically for this study following standard procedures, with the full agreement of the Beijing Sanyuanlvhe Dairy Farming Center, which owned the Holstein cows and bulls.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Table S1. Primers used for pooled DNA sequencing for the 24 indels. (XLSX 42 kb)

Additional file 2:

Results of Sanger and clone sequencing of the thirteen indels. (DOCX 332 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Jiang, J., Liu, L., Gao, Y. et al. Determination of genetic associations between indels in 11 candidate genes and milk composition traits in Chinese Holstein population. BMC Genet 20, 48 (2019).
