Identification of QTL TGW12 responsible for grain weight in rice based on recombinant inbred line population crossed by wild rice (Oryza minuta) introgression line K1561 and indica rice G1025

Background
Limited genetic resources in cultivated rice may hinder further yield improvement. Some valuable genes that contribute to rice yield may have been lost or may be lacking in cultivated rice. Identification of quantitative trait loci (QTLs) for yield-related traits such as thousand-grain weight (TGW) from wild rice species is therefore desirable for rice yield improvement.
Results
In this study, sixteen TGW QTLs were identified from a recombinant inbred line (RIL) population derived from the cross between K1561, an introgression line of Oryza minuta, and the rice cultivar G1025. TGW12, one of the most effective QTLs, was mapped to a 204.12 kb region between Marker 2,768,345 and Marker 2,853,491 of the specific locus amplified fragment (SLAF) markers. The origin of TGW12 was tested using three markers near or within the TGW12 region but has not yet been clarified. Thirty-two open reading frames (ORFs) were present in the region. RT-PCR analysis and sequence alignment showed that the coding sequences of ORF12, a MADS-box gene, differed between G1025 and K1561 because of alternative splicing, which caused premature translation termination. The MADS-box gene was considered a candidate for TGW12.
Conclusion
The effective QTL TGW12 was mapped to a 204.12 kb segment using the RIL population, and a MADS-box gene was identified among several candidate genes in the segment. The TGW12 region should be narrowed further, and the creation of transgenic lines will reveal the gene function. TGW12 could be applied to improve TGW in breeding programs.

To date, QTLs/genes associated with TGW have mostly been cloned from cultivated rice. However, the genetic resources of cultivated rice became quite limited during the domestication of wild rice, which may hinder yield improvement. Wild rice species are expected to contain many valuable genes that can be used for the genetic improvement of cultivated rice [26]. Thus, the genetic resources of wild rice species should be explored and used in high-yield rice breeding. Introducing and applying favorable wild rice genes would be an effective way to widen the genetic basis of cultivated rice.
Oryza minuta (O. minuta) is a tetraploid wild rice that possesses a number of favorable yield-related genes [27]. Rahman et al. mapped 22 novel yield-related QTLs for 16 agronomic traits using a set of introgression lines (ILs) of O. minuta, and demonstrated that 57% of the QTLs were derived from O. minuta [27]. In a previous study, we also detected 28 QTLs for yield-related traits using ILs derived from the backcross of IR24 (O. sativa L.) and O. minuta, and found that 46.4% of the notable QTLs were from O. minuta [28].
To further identify favorable yield-related genes from O. minuta, a recombinant inbred line (RIL) population was developed by crossing K1561 with the indica rice G1025 [28]. K1561 is one of 192 ILs derived from backcross progenies (BC4F2) of IR24 and O. minuta. It was produced by a single cross of IR24 and O. minuta, followed by four backcrosses with IR24 as the recurrent parent and four generations of selfing. K1561 shows excellent agronomic traits such as long panicles and high TGW. G1025 is an excellent restorer line widely used in Guangxi Province, China, with dense grains but low TGW. In this study, QTL mapping was conducted on the advanced RIL population using SSR and SLAF markers. Thirteen QTLs for TGW were detected under five environments, and the most effective QTL, TGW12, was mapped to a 204.12 kb segment on the high-density map constructed with SLAF markers. Candidate genes for TGW12 were preliminarily identified, and one gene encoding a MADS-box protein was considered the putative candidate based on sequence alignment. The TGW12 allele for increasing TGW might have originated from O. minuta and could be used for rice yield improvement.

Phenotypic analysis
The two parents, G1025 and K1561, showed highly significant differences in TGW across five environments spanning two locations in China, Nan-Ning (NN) and Wu-Han (WH), with averages of 16.01 g and 32.07 g, respectively (Table 1; Table S1).

QTL mapping of TGW by simple sequence repeats (SSR)
TGW QTLs were preliminarily detected using 300 SSR markers evenly distributed across the 12 chromosomes. The mapping populations were the F6 and F7 RILs derived from the cross of G1025 and K1561, planted in NN in 2013 and 2014, respectively. Four QTLs, TGW3, TGW7, TGW9.2, and TGW12, were stably detected on chromosomes 3, 7, 9, and 12 in the two environments (Table 2). TGW12 had the greatest effect and was located in the region between RM247 and RM7003 (Table 2), so it was selected for further analysis. A further 166 SSR markers (Additional file 2: Table S2) lie in this region according to the Nipponbare genome sequence [29]. These 166 SSRs were first tested for polymorphism between the parental lines G1025 and K1561. Nine SSRs were polymorphic, but only five gave clear bands. These five SSRs were then used to genotype the F6 and F7 RIL populations in 2013 and 2014. Finally, TGW12 was mapped to the 5.1 cM region between RM27638 and RM27748 (Fig. 2).

QTL mapping of TGW by SLAF markers
We have developed 5521 SLAF markers by SLAF sequencing [30]. To further map TGW QTLs, those SLAF markers were used to screen F 8 RILs in NN in 2015 and   2).
Evaluation of TGW12 phenotype and genetic origin identification of TGW12 segment
To evaluate whether the phenotypes were determined by TGW12, 16 of the 201 RILs containing the TGW12 region were identified using markers near the region (Fig. 4). The phenotypes and genotypes of these 16 RILs were then compared. All 16 RILs carrying one or two K1561 segments showed higher TGW than the recurrent parent G1025, suggesting that TGW12 controls TGW (Fig. 4). To clarify whether the TGW-increasing effect of TGW12 originated from O. minuta, the genotypes of G1025, K1561, IR24, and O. minuta were examined using markers near or within TGW12. At RM27638 and RM27748, which are near TGW12, the genotype of K1561 was the same as that of IR24 but different from those of G1025 and O. minuta (Figs. 2 and 5). At Marker 2,768,345, which is linked with TGW12, however, the genotype of K1561 was the same as those of IR24 and O. minuta but different from that of G1025. From these results it was not possible to conclude whether TGW12 originated from IR24 or O. minuta. It has been suggested that, in introgression lines from interspecific crosses, translocation through centric break-fusion occurs more frequently than recombination, and that it does not always place an O. minuta chromosome arm onto a complete or incomplete O. sativa chromosome [30,31]. The origin of TGW12 therefore remains to be determined. Once TGW12 has been fine-mapped, the sequences of the TGW12 candidate in O. minuta, IR24, and K1561 can be compared.
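The marker comparison described above can be tallied in a few lines of Python. This is only an illustrative sketch: the allele labels are arbitrary placeholders (the text reports which lines share a genotype, not band sizes), and giving G1025 and O. minuta distinct labels at the SSR sites is an assumption, since the text does not state whether those two differ from each other.

```python
# Allele labels ("a", "b", "c") are arbitrary placeholders, not band sizes.
# Assumption: G1025 and O. minuta get distinct labels at the two SSR sites;
# the text only says that both differ from K1561 there.
GENOTYPES = {
    "RM27638": {"K1561": "a", "IR24": "a", "G1025": "b", "O. minuta": "c"},
    "RM27748": {"K1561": "a", "IR24": "a", "G1025": "b", "O. minuta": "c"},
    "Marker2768345": {"K1561": "a", "IR24": "a", "G1025": "b", "O. minuta": "a"},
}


def shared_with_k1561(genotypes, donor):
    """Return the markers at which `donor` carries the same allele as K1561."""
    return [
        marker
        for marker, calls in genotypes.items()
        if calls[donor] == calls["K1561"]
    ]


for donor in ("IR24", "O. minuta"):
    markers = shared_with_k1561(GENOTYPES, donor)
    print(f"{donor}: same as K1561 at {len(markers)}/{len(GENOTYPES)} markers {markers}")
```

Because IR24 and O. minuta share the K1561 allele at the one marker linked with TGW12, a tally of this kind cannot separate the two possible donors, which is why sequence comparison of the fine-mapped candidate is proposed instead.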

Preliminary prediction of candidate genes for TGW12
Analysis of annotated genes indicated that 32 ORFs are located in the 204.12 kb region according to the Nipponbare genome annotation (http://rice.plantbiology.msu.edu) (Table 3). Among them, 13 ORFs encoded functional proteins and 19 ORFs were annotated as transposon/retrotransposon proteins, […] K1561. Sequence comparison indicated that the amplified sequence of ORF12 in K1561 was 56 bp shorter than that of G1025, which resulted in premature translation termination and consequently a peptide of only 45 amino acid residues in K1561, whereas ORF12 in G1025 encoded a protein of 202 amino acid residues (Additional file 3: Figure S1). Further analysis revealed that the missing 56 bp of ORF12 in K1561 was due to alternative splicing (AS) in the first exon (Fig. 6b, c), which causes premature termination of ORF12 translation (Additional file 3: Figure S1). There was no difference in the CDS of ORF14, ORF24, or ORF27 between K1561 and G1025 (data not shown). Thus, the MADS-box gene (ORF12) was a likely candidate for TGW12. However, the possibility that the other nine functional proteins, the hypothetical proteins, or the expressed proteins affect TGW, independently or collectively, cannot be excluded. Further investigation is required.
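Why a 56 bp loss truncates the protein can be illustrated with a short Python sketch. Since 56 is not a multiple of three, the reading frame shifts downstream of the missing segment and an out-of-frame stop codon can be reached early. The sequence below is a made-up toy coding sequence, not the ORF12 sequence, and the deletion position is chosen arbitrarily for illustration.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence from its first base up to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


DELETION_BP = 56  # length of the segment missing from the K1561 cDNA

# Hypothetical toy CDS (NOT ORF12): start codon, 20 repeats of GAT-AAC, stop.
toy_cds = "ATG" + "GATAAC" * 20 + "TAA"
# Remove 56 bp starting after the third codon (arbitrary illustrative position).
toy_mutant = toy_cds[:9] + toy_cds[9 + DELETION_BP:]

print("frame offset after deletion:", DELETION_BP % 3)
print("full-length toy protein:", len(translate(toy_cds)), "residues")
print("toy protein after deletion:", len(translate(toy_mutant)), "residues")
```

In the toy case the shifted frame meets a stop codon immediately after the deletion point, which mirrors in miniature the contrast between the 202-residue and 45-residue ORF12 products reported above.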

Discussion
Rice is one of the most important staple foods, consumed by about one-half of the world's population, and more production will be needed as the population increases. However, further yield improvement of rice is constrained by the narrow genetic basis of cultivated rice varieties. Wild rice species are good candidates in which to explore genetic resources for valuable genes to further enhance rice productivity [26].
O. minuta possesses a number of outstanding genes associated with resistance and yield [27]. Many QTLs for yield-related traits have been identified using introgression lines (ILs) carrying O. minuta segments [27,28]. In this study, sixteen TGW QTLs were detected using the advanced RIL population under five environments (Table 2). Among them, the most effective QTL, TGW12, was mapped on chromosome 12. Eight QTLs for TGW (AQAG040, AQCF014, AQAG053, AQGP079, AQDR045, AQDR047, AQCY020, CQAS153) had previously been mapped to chromosome 12 (http://archive.gramene.org/). Location comparison indicated that TGW12 partially overlapped with AQDR045. TGW12 and AQDR045 were located at 4,037,811-6,150,143 and 1,589,200-5,829,185, respectively, on chromosome 12 in the physical map of the Nipponbare genome [29], suggesting that there might be a major QTL controlling grain weight in this region. However, AQDR045 was mapped using a population derived from two cultivated rice varieties, Lemont and Teqing [33], whereas TGW12 was mapped using a population derived from one O. minuta introgression line and one cultivated line. Although it is uncertain whether the TGW12 allele originated from O. minuta or IR24, it showed a large increasing effect on TGW and could be applied directly in breeding programs.
Fig. 6 […] (15-20 cm). b Genomic structure of ORF12. Solid bar, exon; hollow bar, 3' untranslated region; line, intron; fold lines, alternative splicing. c cDNA sequence alignment of ORF12 between G1025 and K1561. Red and green letters indicate the sequence of exon 1; the green letters are retained in the cDNA of G1025 but spliced out in K1561. Blue letters indicate the sequence of exon 2, and black letters indicate intron sequence. The symbol "\\" indicates omitted sequences.
There were 32 annotated ORFs in the TGW12 region, of which four were identified as transcription factors (TFs) based on sequence alignment and considered TGW12 candidates because of their regulatory roles in plant growth and development. Sequence analysis indicated that ORF12 in K1561 was truncated to only 45 amino acid residues by an AS event, while its counterpart in G1025 encoded the full-length protein. No sequence difference was found between K1561 and G1025 in the other three TFs (ORF14, ORF24, and ORF27). ORF12 encodes a MADS-box protein, which possesses a highly conserved DNA-binding MADS domain and is involved predominantly in developmental processes [34]. In Arabidopsis, there are 107 genes encoding MADS-box proteins [34], and almost all of them are involved in flower and seed development [35]. In rice, 75 MADS-box genes have been identified, and more than 20 are transcribed during panicle and seed development [36]. In addition, alternative splicing of the MADS-box transcription factor OsMADS1, encoded by OsLG3b (Os03g0215400), controls grain length and yield in japonica rice [37]. Our results suggest that ORF12 could be critical for the function of TGW12, or could even be TGW12 itself. However, further studies are required to examine the function of the other ORFs located in the region.
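The partial overlap between TGW12 and AQDR045 noted in the Discussion can be checked with simple interval arithmetic. The sketch below uses the chromosome 12 coordinates quoted in the text; the function name and the convention of reporting length as end minus start are choices made here for illustration.

```python
def interval_overlap(a, b):
    """Return the (start, end) overlap of two closed intervals, or None if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# Physical positions on chromosome 12 as quoted in the text (Nipponbare).
TGW12 = (4_037_811, 6_150_143)
AQDR045 = (1_589_200, 5_829_185)

overlap = interval_overlap(TGW12, AQDR045)
if overlap is not None:
    length_bp = overlap[1] - overlap[0]
    share = length_bp / (TGW12[1] - TGW12[0])
    print(f"overlap {overlap[0]:,}-{overlap[1]:,} ({length_bp / 1e6:.2f} Mb)")
    print(f"fraction of the TGW12 interval covered: {share:.0%}")
```

With these coordinates the shared segment spans roughly 1.79 Mb, so most, but not all, of the TGW12 interval lies inside AQDR045.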

Conclusions
In this study, an effective QTL, TGW12, related to thousand-grain weight in rice was mapped to a 204.12 kb segment using a RIL population derived from the cross of one O. minuta introgression line and one cultivated rice line. Of the 32 ORFs located in the TGW12 region, ORF12, which encodes a MADS-box protein, could be crucial for TGW12 function. Further investigation is required to validate this speculation.

Plant materials and field trials
O. minuta (Acc. No. 101133) and IR24 were kindly provided by the International Rice Germplasm Centre of the International Rice Research Institute. The parental line G1025 was kindly provided by the Rice Research Institute of Guangxi Academy of Agricultural Science. The parental line K1561 was developed by our lab in the Rice Research Institute of Guangxi Academy of Agricultural Science. It was produced by a single cross of IR24 and O. minuta, followed by four backcrosses with IR24 as the recurrent parent and four generations of selfing. The parental lines G1025 and K1561, along with 201 F6 and F7 RILs, were planted in Nanning (NN) from February to July in 2013 and 2014, respectively. The parental lines along with 201 F8 RILs were planted in NN (February to July) in 2015, and the parents and F9 RILs were planted in NN (February to July) and Wuhan (WH) (May to October) in 2016. The phenotypes of the parents and RILs were collected to map TGW using SSR or SLAF markers. After harvesting and sun-drying, the weight of 200 grains was measured and converted to TGW. The mean values of ten plants were used as input data to identify QTLs (Additional file 1: Table S1).
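The conversion from a 200-grain sample to TGW, and the averaging over ten plants, amounts to the following arithmetic. This is a minimal sketch: the function names and the example plant weights are hypothetical and are not taken from Table S1.

```python
from statistics import mean


def tgw_from_sample(sample_weight_g, n_grains=200):
    """Scale the weight of an n-grain sample to a thousand-grain weight (g)."""
    return sample_weight_g * 1000 / n_grains


def line_tgw(sample_weights_g, n_grains=200):
    """Mean TGW of one line, computed from per-plant 200-grain sample weights."""
    return mean(tgw_from_sample(w, n_grains) for w in sample_weights_g)


# Hypothetical 200-grain weights (g) for ten plants of one line.
plants = [6.40, 6.45, 6.38, 6.42, 6.44, 6.39, 6.41, 6.43, 6.40, 6.42]
print(f"line TGW: {line_tgw(plants):.2f} g")
```

For scale, a 200-grain sample weighing 6.414 g corresponds to a TGW of 32.07 g, the average reported for K1561.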

SSR, linkage, and QTL analysis
DNA was extracted from fresh leaves following the CTAB procedure [38]. SSR markers were used to analyze polymorphism between the parents (Additional file 2: Table S2). SSR primers were synthesized according to published sequences [29]. Polymerase chain reaction (PCR) was conducted in a 15 μL volume as follows: 50 ng of template DNA, 0.3 μL of 10 mM each dNTP, 0.5 units of Taq DNA polymerase, 1.5 μL of 10 × PCR buffer with Mg2+, and 0.5 μL of 10 μM forward and reverse primers. The reaction was carried out with an initial denaturation at 94°C for 5 min, followed by 35 cycles of 30 s at 94°C, 30 s at 56°C, and 30 s at 72°C, with a final extension at 72°C for 10 min. PCR products were separated on 6% denaturing polyacrylamide gels, and the bands were revealed by the silver-staining protocol [39].
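The final concentrations implied by this recipe follow from the usual dilution relation C1·V1 = C2·V2. The sketch below works them out for the 15 μL SSR reaction; it assumes that 0.5 μL of each primer is added, which the text does not state explicitly, and the function name is illustrative.

```python
def final_concentration(stock_conc, volume_added_ul, total_volume_ul):
    """Dilution relation C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * volume_added_ul / total_volume_ul


TOTAL_UL = 15.0  # SSR reaction volume given in the text

dntp_mm = final_concentration(10.0, 0.3, TOTAL_UL)    # 10 mM stock of each dNTP
# Assumption: 0.5 uL of each primer (forward and reverse) at 10 uM.
primer_um = final_concentration(10.0, 0.5, TOTAL_UL)
buffer_x = final_concentration(10.0, 1.5, TOTAL_UL)   # 10x buffer

print(f"each dNTP:   {dntp_mm:.2f} mM")
print(f"each primer: {primer_um:.2f} uM")
print(f"PCR buffer:  {buffer_x:.1f}x")
```

The same function can be applied to any other reaction volume by changing the total volume and the volumes added.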
Linkage maps were constructed with Mapmaker/Exp 3.0 [40]. Genetic distance was calculated with the Kosambi function. QTLNetwork 2.2 was used to analyze QTLs at a LOD threshold of 3.0 [41].
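For reference, the Kosambi function converts a recombination fraction r into a map distance d = (1/4) ln[(1 + 2r)/(1 − 2r)] Morgans. The short Python sketch below implements that formula and its inverse; it is a standalone illustration, not the routine used inside Mapmaker/Exp, and the function names are chosen here.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r, with 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(distance_cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(2 * distance_cm / 100.0)


# Recombination fraction implied by the 5.1 cM SSR interval reported for TGW12.
r = kosambi_r(5.1)
print(f"5.1 cM corresponds to r = {r:.4f}")
print(f"round trip: {kosambi_cm(r):.2f} cM")
```

At short distances such as the 5.1 cM interval between RM27638 and RM27748, the recombination fraction is close to the map distance divided by 100 (about 0.051), because interference corrections only become substantial over longer intervals.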
Single nucleotide polymorphism (SNP) genotyping, linkage map construction and QTL analysis
Genomic DNA was extracted from fresh leaves of the parents and RILs by the CTAB method [38]. Quantified DNA was used for SLAF sequencing on an Illumina HiSeq 2500 [42]. SLAF markers, developed in previous work, were used for genotyping, linkage map construction, and QTL analysis for TGW in this study as described by Zhu et al. [30].
Derived cleaved amplified polymorphic sequences (dCAPS) marker development
A dCAPS marker was developed for SLAF Marker2768345. Primers were designed with dCAPS Finder 2.0 (http://helix.wustl.edu/dcaps/dcaps.html). PCR was conducted in a 20 μL volume as follows: 100 ng of template DNA, 0.5 μL of 10 mM each dNTP, 1 unit of Taq DNA polymerase, 2.0 μL of 10 × PCR buffer with Mg2+, and 0.5 μL of 10 μM forward and reverse primers. The reaction was carried out with an initial denaturation at 94°C for 3 min, followed by 35 cycles of 30 s at 94°C, 30 s at 60°C, and 30 s at 72°C, with a final extension at 72°C for 5 min. PCR products were digested with EcoRV (Takara, China) for 4 h at 37°C and then resolved on a 2% agarose gel for genotyping.