RIL populations and phenotyping
The spring barley cultivars Baudin and Fleet (H. vulgare ssp. vulgare) along with their wild relative Awcs276 (H. vulgare ssp. spontaneum) were obtained from a collection assembled at the University of Tasmania and used to generate two RIL populations (Fig. 1) as described by Chen [20]. Awcs276, a long-kernel wild barley genotype from the Middle East, was used as the common parent in the two RIL populations (Baudin/Awcs276 and Fleet/Awcs276). Baudin/Awcs276 (mapping population, 128 lines of F8, F9, and F10 generations) was evaluated in one location over three years to detect QTLs for LEN, whereas Fleet/Awcs276 (validation population, 94 lines of F10 generation) was evaluated for one year to validate putative QTLs identified in the mapping population. Baudin/Awcs276 was planted in October 2012 (F8), 2013 (F9), and 2014 (F10) in duplicate rows of ten plants each in a completely randomized design in Wenjiang, Chengdu, China (30°36′N, 103°41′E). The length of each row was 1.5 m with a row-to-row distance of 15 cm. Field management was carried out according to common practices in barley production. Mixed seeds were collected from mature plants in May 2013, 2014, and 2015, dried, and stored at 25 °C until analysis. Fleet/Awcs276 was planted in October 2014 and harvested in May 2015. Fully filled grains were used for measuring LEN in June 2015. LEN was measured in millimeters using a ruler and estimated by one measurement of 10 randomly selected kernels in 2013 or the average of three measurements in 2014 and 2015. The average LEN of each year was used for QTL analysis.
Phenotypic data analysis
LEN in a given environment was determined as the arithmetic average of three biological replicates. Student’s t-test (P < 0.05) was used to identify the differences in LEN between the parental lines. Summary statistics were calculated using Excel 2010 (Microsoft Corp., Redmond, WA, USA), whereas analysis of variance (ANOVA) in conjunction with Student’s t-test (P < 0.001) was performed using the general linear model (GLM) in SPSS 17.0 (IBM SPSS, Chicago, IL, USA). Broad-sense heritability ($H^2$) for each trait was estimated as $H^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_{ge}^2/n + \sigma_e^2/(nr))$, where $\sigma_g^2$ is the genetic variance, $\sigma_{ge}^2$ is the genotype by environment (G × E) variance, $\sigma_e^2$ is the error variance, n is the number of environments, and r is the number of replicates [21]. The $\sigma_g^2$, $\sigma_{ge}^2$, and $\sigma_e^2$ values were calculated using ANOVA (P < 0.001) in SAS 9.2 (SAS Institute Inc., Cary, NC, USA). The best linear unbiased prediction (BLUP) method was used to estimate the random effects of mixed models. Phenotypic BLUP was calculated using the BLUP procedure in SAS 9.2.
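As an arithmetic illustration of the heritability formula above, the following minimal Python sketch computes broad-sense heritability from three variance components. The function name and the numeric variance components are hypothetical and chosen only to show the calculation; they are not estimates from this study.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Return H^2 = var_g / (var_g + var_ge / n + var_e / (n * r))."""
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic_var = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / phenotypic_var


# Hypothetical variance components (illustration only, not study results).
h2 = broad_sense_heritability(var_g=0.50, var_ge=0.12, var_e=0.18,
                              n_env=3, n_rep=3)
print(round(h2, 3))  # 0.893
```

With three environments and three replicates, the G × E and error terms are divided by 3 and 9, respectively, so the estimate rises as more environments and replicates are averaged.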
Genotyping and construction of genetic linkage map
Total genomic DNA (gDNA) was isolated and purified from fresh leaf tissue of one randomly selected plant in each F8 line of Baudin/Awcs276 and F10 line of Fleet/Awcs276 using the modified cetyltrimethylammonium bromide (CTAB) method [22]. DArT sequencing was conducted by Triticarte Pty Ltd. (Canberra, Australia). This method uses a combination of restriction enzymes to select a genome fraction corresponding predominantly to active genes by separating low-copy sequences from the repetitive fraction of the genome (http://www.diversityarrays.com/dart-application-dartseq). DArT sequencing generates two data types: 1) scores for “presence/absence” (dominant) markers, known as SilicoDArT markers because they are analogous to microarray DArT markers but are extracted in silico from sequences obtained from genomic representations; and 2) SNPs within the available genomic fragments. DArT loci were named according to their clone identification numbers as provided by Triticarte (http://www.diversityarrays.com/dart-application-dartseq-data-types). Polymorphic loci were selected from a total of 62,216 DArT markers after discarding those with a minor allele frequency of 0.4, a missing-value rate of more than 20 %, or a common position.
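The marker-filtering step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: genotype calls are coded as 0 or 1 for the two parental alleles with `None` for missing data, markers are assumed to be discarded when their minor allele frequency falls below the threshold, and the removal of markers sharing a common position is omitted. All function and marker names are hypothetical.

```python
def missing_rate(calls):
    """Fraction of lines with no genotype call for this marker."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """Frequency of the rarer parental allele among the observed calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / len(observed)
    return min(freq, 1.0 - freq)


def filter_markers(markers, min_maf=0.4, max_missing=0.20):
    """Keep markers that pass both the missing-data and MAF filters.

    Assumption: markers with MAF below min_maf are discarded.
    """
    kept = {}
    for name, calls in markers.items():
        if missing_rate(calls) > max_missing:
            continue
        if minor_allele_frequency(calls) < min_maf:
            continue
        kept[name] = calls
    return kept


# Hypothetical calls for ten lines (illustration only).
markers = {
    "m1": [0, 1, 0, 1, 1, 0, 1, 0, 1, 0],           # balanced, kept
    "m2": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],           # MAF 0.2, discarded
    "m3": [0, 1, None, None, None, 1, 0, 1, 0, 1],  # 30 % missing, discarded
}
print(list(filter_markers(markers)))  # ['m1']
```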
The linkage map was constructed using IciMapping 3.2/4.0 [23] and JoinMap 4 [24]. All unanchored markers were grouped using IciMapping 3.2/4.0 with an LOD threshold of 3. The linkage analysis was conducted using JoinMap 4 (Kyazma, Wageningen, Netherlands) with a recombination frequency of 0.25, and all markers were assigned to the seven chromosomes.
QTL mapping
Phenotypic data of each trait were the means of three biological replications in a single environment. The phenotypic BLUP was used to detect QTLs from the combined three-year data. QTL analysis for selected environments was performed by interval mapping (IM) using MapQTL 6.0 (Kyazma, Wageningen, Netherlands) [25]. A permutation test with 1,000 permutations was used to identify the LOD threshold corresponding to a genome-wide significance level of 5 % (P < 0.05). QTLs for a target trait that were detected across environments at clearly overlapping positions on the same chromosome were assumed to be the same QTL. Stable QTLs that explained more than 10 % of the phenotypic variance for the specific trait were considered major QTLs [26].
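The logic of the permutation threshold can be illustrated with a short Python sketch. It is a simplified stand-in, not the MapQTL procedure: it scores each marker by single-marker regression rather than interval mapping, and the data are simulated. Phenotypes are shuffled relative to genotypes, the maximum LOD across all markers is recorded for each shuffle, and the 95th percentile of those maxima is taken as the genome-wide 5 % threshold. All names and values are hypothetical.

```python
import math
import random


def marker_lod(pheno, geno):
    """LOD for one marker: (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(pheno)
    grand_mean = sum(pheno) / n
    rss_null = sum((y - grand_mean) ** 2 for y in pheno)
    rss_marker = 0.0
    for allele in (0, 1):
        group = [y for y, g in zip(pheno, geno) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null <= 0.0:
        return 0.0            # no phenotypic variation at all
    if rss_marker <= 0.0:
        return float("inf")   # marker explains all variation
    return max(0.0, (n / 2) * math.log10(rss_null / rss_marker))


def permutation_threshold(pheno, genotypes, n_perm=1000, alpha=0.05, seed=0):
    """Return the (1 - alpha) quantile of the per-permutation maximum LOD."""
    rng = random.Random(seed)
    shuffled = list(pheno)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype association
        max_lods.append(max(marker_lod(shuffled, g) for g in genotypes))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated data (illustration only): 128 lines, 50 unlinked markers.
rng = random.Random(1)
genotypes = [[rng.randint(0, 1) for _ in range(128)] for _ in range(50)]
pheno = [rng.gauss(8.0, 0.5) for _ in range(128)]
threshold = permutation_threshold(pheno, genotypes, n_perm=200)
print(f"LOD threshold: {threshold:.2f}")
```

Taking the maximum over the whole genome in each permutation is what makes the threshold genome-wide: it controls the chance of at least one false-positive peak anywhere on the map.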
QTLNetwork 2 [27] was used to determine QTLs with additive effects at individual loci, epistatic interactions between two different loci, and interactions between QTLs and the environment (QTL × E). The analysis was based on a mixed linear model (MLM) with 2 cM walking speed and 2D genome scan, which maps epistatic QTLs with or without single-locus effects using 1,000 permutations in order to generate a threshold for the presence of QTLs and QTL × E interactions.
Marker development and QTL validation
Sequence information was obtained from the IPK Barley Blast Server (http://webblast.ipk-gatersleben.de/barley/index.php), and single-base differences were identified by high-resolution melt (HRM) analysis [28]. Markers were designed using Beacon Designer 7.9 and evaluated by Oligo 6.0 [29]. The parameters for Primer Premier (Premier Biosoft International, Palo Alto, CA, USA) were as follows: inner product size of 60–100 bp, melting temperature of 55 ± 5 °C, primer length of 20 ± 3 bp, and 3ʹ-end stability to avoid self-complementarity and primer dimer formation.
To detect markers, amplification reactions were performed in a total volume of 10 μl, containing 100 ng of template DNA, 5 μl of SsoFast EvaGreen mixture, 5 pmol of each forward and reverse primer, and DNase/RNase-free water up to the final volume. PCR conditions were adjusted according to primer sets as follows: 4 min at 94 °C, followed by 50 cycles of 1 s at 94 °C and 30 s at 55 °C. Amplification was followed by HRM analysis, in which the amplicon DNA is warmed precisely from approximately 65 °C to 95 °C. At some point during this process, the melting temperature of the amplicon is reached, and the two strands of DNA separate or “melt” apart [28].
The homozygous lines of Fleet/Awcs276 were used to validate major QTLs using the developed markers. Based on marker profiles, individuals were grouped into two classes: genotypes with homozygous alleles from Awcs276 and genotypes with homozygous alleles from Fleet. Student’s t-test (P < 0.05) was used to test the differences in LEN between these two genotype classes and to measure QTL effects within the validation population.
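The comparison between the two genotype classes can be sketched as a pooled-variance Student's t statistic. The Python example below is a minimal illustration using only the standard library; the kernel lengths are hypothetical, and the P value would still have to be obtained from a t distribution with the returned degrees of freedom, for example in a statistics package.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each class needs at least two observations")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical kernel lengths in mm (illustration only, not study data).
len_awcs276_allele = [9.1, 9.4, 9.0, 9.3, 9.2]
len_fleet_allele = [8.1, 8.3, 8.0, 8.4, 8.2]

t_stat, df = pooled_t_statistic(len_awcs276_allele, len_fleet_allele)
print(round(t_stat, 2), df)  # 10.0 8
```

The difference between the two class means (here 1.0 mm) gives a direct estimate of the allelic effect of the QTL in the validation population.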
Putative candidate gene identification
To identify putative coding gene regions, flanking candidate loci, or trait-related gene products, we used the corresponding QTL marker contigs as queries in a BLAST search against the WGSMorex database at the IPK Barley Blast Server (http://webblast.ipk-gatersleben.de/barley/index.php). We obtained QTL positions within the Morex reference map and putative trait-related proteins. According to the putative protein categories, most genes controlling kernel traits were identified in rice. The sequences of identified genes in rice were used to perform a BLASTN search against the barley database of the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/) and the Phytozome website (https://phytozome.jgi.doe.gov/pz/portal.html) in order to identify homologous candidate genes in barley and other cereal crops.