Comparative transcriptome analysis of Alpinia oxyphylla Miq. reveals tissue-specific expression of flavonoid biosynthesis genes

Background Alpinia oxyphylla Miq. is an important edible and medicinal herb, and its dried fruits are widely used in traditional herbal medicine. Flavonoids are one of the main chemical compounds in A. oxyphylla; however, the genetic and molecular mechanisms of flavonoid biosynthesis are not well understood. We performed transcriptome analysis in the fruit, root, and leaf tissues of A. oxyphylla to delineate tissue-specific gene expression and metabolic pathways in this medicinal plant. Results In all, 8.85, 10.10, 8.68, 6.89, and 8.51 Gb clean data were obtained for early-, middle-, and late-stage fruits, leaves, and roots, respectively. Furthermore, 50,401 unigenes were grouped into functional categories based on four databases, namely Nr (47,745 unigenes), Uniprot (49,685 unigenes), KOG (20,153 unigenes), and KEGG (27,285 unigenes). A total of 3110 differentially expressed genes (DEGs) and five distinct clusters with similar expression patterns were obtained, in which 27 unigenes encoded 13 key enzymes associated with flavonoid biosynthesis. In particular, 9 DEGs were significantly up-regulated in fruits, whereas expression of 11 DEGs were highly up-regulated in roots, compared with those in leaves. Conclusion The DEGs and metabolic pathway related to flavonoids biosynthesis were identified in root, leaf, and different stages of fruits from A. oxyphylla. These results provide insights into the molecular mechanism of flavonoid biosynthesis in A. oxyphylla and application of genetically engineered varieties of A. oxyphylla. Supplementary Information The online version contains supplementary material available at 10.1186/s12863-021-00973-4.

comparable in roots and leaves of A. oxyphylla, but was significantly higher in fruits [17]. In addition, the content of chemical compounds in the fruits of A. oxyphylla harvested at different times indicates that the 45-day harvested fruit had the highest content of chemicals [17,18] The metabolic processes and regulatory mechanisms of these chemical compounds in different tissues and fruits at different stages have not yet been elucidated.
The transcriptome is a complete set of RNA transcripts in a cell at a specific developmental stage, and provides information on gene expression and regulation related to a variety of cellular processes including secondary metabolite biosynthesis [19,20]. With the development of next-generation sequencing, RNA sequencing is an effective method for investigating the metabolic pathways influenced by active ingredients and associated gene expression in different tissues or samples, such as flavonoid biosynthesis in Ampelopsis megalophylla [21], terpenoids metabolism in ginseng roots [22] and polysaccharide and alkaloid content in Dendrobium [23]. To date, there are no studies on the genetic modification of A. oxyphylla either toward increased production of secondary metabolites or biomass accumulation. Therefore, it is important to explore the whole genome transcriptome of A. oxyphylla to identify candidate genes contributing to metabolic processes and regulatory mechanisms.
In this study, the differentially expressed genes (DEGs) and metabolic pathway related to flavonoids biosynthesis were identified in root, leaf, and different stages of fruits from A. oxyphylla. Therefore, the results of this study may serve as a significant resource for developing genetically engineered varieties of A. oxyphylla with improved quality and yield.

De novo assembly
The three tissue samples (fruits of different developmental stages, leaves, and roots) of A. oxyphylla were sequenced using Illumina HiSeq 4000 which generated approximately 29 (Table 1).

Functional annotation and classification
To investigate the function of unigenes, annotation was performed based on four databases. A total of 50,401 unigenes were grouped into the databases, nonredundant protein (Nr) (47,745 unigenes), Universal Protein (Uniport) (49,685 unigenes), EuKaryotic Orthologous Groups (KOG) (20,153 unigenes), and Kyoto Encyclopedia of Genes and Genomes (KEGG) (27,285 unigenes), respectively, while an additional 89,725 unigenes were not found in these databases. A detailed comparison of the unigenes annotated by four different databases are illustrated in Fig. 1.
GO analysis illustrated that 37,555 unigenes of A. oxyphylla were annotated into three categories: molecular function (30,356), cellular component (20,203), and biological process (26,368), respectively (Supplementary Fig. 1 in Additional file 1). The binding (19,730) and catalytic activity (17,452) functional groups were the most prominent molecular functions. A total of 20,153 unigenes of A. oxyphylla were further annotated and grouped into 25 molecular families in KOG database (Supplementary Fig. 2 in Additional file 1). These molecular families were grouped into four categories: information storage and processing (5575), cellular processes and signaling (7377), metabolism (6180), and poorly characterized (5803). For KEGG analysis, 29,211 unigenes of A. oxyphylla had significant matches in the database and were assigned to five primary categories: cellular processes (3324), environmental information processing (2571), genetic information processing (5073), metabolism (13,599), and organismal systems (4644) (Supplementary Fig. 3 in Additional file 1). A majority of unigenes were assigned to metabolism, and global and overview maps had the highest number of annotated unigenes (5005).

Differential gene expression analysis
There were 35,278 DEGs identified between the leaf vs fruit sample, including 15,063 up-regulated and 20,215 down-regulated DEGs in fruit (Fig. 2a). A total of 34,846 DEGs were identified between root vs. fruit sample, including 14,807 up-regulated and 20,039 down-regulated DEGs in fruit (Fig. 2b). There were 19,776 DEGs between root vs. leaf sample, out of which 8797 were upregulated and 10,979 were down-regulated in leaf (Fig.  2c). Using a Venn diagram, we compared the data sets from the three comparison groups (leaf vs. fruit, root vs. fruit, and root vs. leaf). In this comparison, 19,266 DEGs were identified as common (Fig. 2d) to all three groups. A total of 16,213 DEGs were identified in both "leaf vs. fruit" and "root vs. fruit" comparisons; 19,266 DEGs were identified in both "leaf vs. fruit" and "root vs. leaf" comparisons; while 19,266 DEGs were identified in both "root vs. fruit" and "root vs. leaf" comparisons.

Cluster and KEGG enrichment analysis of DEGs
To investigate the expression trends of DEGs in different tissues, we performed a cluster analysis using normalized expression values from each individual replicate of five different samples of A. oxyphylla. As a result, a total of 3110 DEGs and five distinct clusters with similar expression patterns were obtained, containing 606, 807, 954, 725, and 18 genes, respectively (Fig. 3a). As shown in Fig. 3b, the expression level of cluster I (606) and cluster IV (725) genes in fruits of A. oxyphylla were higher than in roots and leaves, and the expression levels of cluster II (807), cluster III (954), and cluster V (18) in fruits were lower than in roots and leaves. The secondary metabolites in fruits are higher than roots and leaves, for instance, flavonoids in fruits are 1000 times higher than roots and leaves [17]. Therefore, the DEGs related to secondary metabolite biosynthesis should be in cluster I and cluster IV. Signal pathway analysis of DEGs in the five clusters showed that cluster I contains DEGs involved in flavonoid biosynthesis, isoquinoline alkaloid biosynthesis, and biosynthesis of secondary metabolites (Fig. 4). Through further comparative analysis, there were 35 and 44 DEGs related to secondary metabolites in root vs fruit and leaf vs fruit, repetively ( Table 2). These DEGs were mainly distributed in phenylpropanoid, flavonoid and isoquinoline alkaloid biosynthesis pathways. For phenylpropanoid biosynthesis pathways, 14 DEGs were up-regulated and 3 DEGs were down-regulated in root vs fruit, and 19 DEGs were up-regulated, 5 DEGs were down regulated in leaf vs fruit. It is noteworthy that all the 8 DEGs mapped to flavonoids biosynthesis, and they were both up-regulated in leaf vs fruit (Table 2). In addition, 2 DEGs were up-regulated in anthocyanin biosynthesis, 3 DEGs were down-regulated in diarylheptanoid and gingerol biosynthesis, 1 DEGs were up-regulated and 2 DEGs were down-regulated in sesquiterpenoid and triterpenoid biosynthesis. In conclusion, phenylpropanoid, flavonoids and isoquinoline alkaloid biosynthesis related DEGs were significantly upregulated, while diarylheptanoid, gingerol, sesquiterpenoid, triterpenoid and carotenoid biosynthesis related DEGs were down-regulated in fruits compared with roots and leaves.

Candidate genes associated with flavonoid biosynthesis
Flavonoids are one of the main chemical compounds found in A. oxyphylla and are important for evaluating its quality [18]. To understand the regulation of flavonoid biosynthesis in A. oxyphylla, key regulatory genes involved in the pathways for phenylpropanoid and flavonoid biosynthesis were identified in this study. Twenty-seven unigenes encoding 13 key enzymes observed in this study were mostly associated with biosynthesis of flavonoids. Furthermore, results of the microarray analysis of tissue-specific transcriptomes demonstrated that the majority    Table 1 in Additional file 2). In previous studies, flavonoids are found in high concentrations in fruits, followed by roots, and are found in the lowest concentrations in leaves [17]. Expression analysis of flavonoid biosynthesis genes in the present study also showed a similar trend. The putative flavonoid synthesis pathway is shown in Fig. 5b. Flavonoids are synthesized via the phenylpropanoid pathway and are converted from phenylalanine to chalcone by the enzymes phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), 4CL, and CHS. CHI catalyzes the isomerization of chalcones into flavanone. Flavanone can be converted either to flavonols through the subsequent action of F3H and FLS, or to flavone through the action of DFR and LAR. However, no unigene coding for flavone synthase (FNS) was detected in the transcriptome analysis. A similar situation has been reported in the transcriptome sequencing of other plants such as Sophora japonica, which may be attributed to the fact that FNS genes are short fragments without sequence similarity [24].

Discussion
There are about 250 species of Alpinia plants distributed in tropical Asia [25]. The roots and fruits of Alpinia plants are often used for medicinal applications [2,26]. The capsular fruit of A. oxyphylla has been used as a medicinal constituent or health supplement for centuries as one of the four famous southern Chinese medicines [2,3]. Studies in natural product chemistry reveal that the capsular fruit, root, and leaf contain flavonoids, sesquiterpenes, diarylheptanoids, essential oils, glycosides, and steroids [14,17]. The main chemical components of A. oxyphylla flavonoids comprise of tectochrysin, izalpinin, chrysin, and kaempferide, of which tectochrysin is the second most abundant flavonoid concentrated in fruits [11]. Therefore, flavonoids are one of the most important active chemical components in A. oxyphylla and are important for evaluating its quality. However, the molecular mechanism of tissue-specific flavonoid biosynthesis and accumulation in A. oxyphylla remains largely unexplored.
In this study, we collected three tissue samples (fruits of different developmental stages, leaves, and roots) of A. oxyphylla and performed a comparative transcriptome analysis, with a particular focus on flavonoid biosynthesis genes. To analyze if the gene expression of biosynthetic genes also follow this pattern, highthroughput transcriptome sequencing technology was employed. Indeed, transcriptional analysis showed that a large number of transcripts exhibited a tissue-specific expression. The number of DEGs in the 'leaf vs. fruit' and 'root vs. fruit' comparison groups was higher than that in the 'root vs. leaf' comparison group. These results suggest that the medicinal properties and associated biological processes are concentrated in the fruits of A. oxyphylla. To investigate the trends of DEGs in gene expression, we performed a cluster analysis using normalized expression values from each individual replicate of five different samples of A. oxyphylla. A total of 3110 DEGs were divided into five distinct clusters according to their expression patterns. Further analysis showed that only the cluster I of DEGs were related to flavonoid biosynthesis, isoquinoline alkaloid biosynthesis and biosynthesis of secondary metabolites, and the expression level in fruits was significantly higher than that in leaves and roots. The enriched KEGG pathways results showed that all the DEGs related to flavonoid biosynthesis were up-regulated, and most of the DEGs involved in phenylpropanoid biosynthesis were also up-regulated, but the DEGs related to stilbenoid, diarylheptanoid and gingerol biosynthesis were down-regulated in fruits, indicating that flavonoids were the main secondary metabolites. The characterized flavonoids, including tectochrysin, izalpinin, chrysin, and kaempferide, are found in greatest concentrations in fruits, followed by roots, and are found in the lowest concentrations in leaves [17]. Therefore, the expression level of flavonoid related genes was consistent with that of chemical components in different tissues of A. oxyphylla.
The biosynthesis of flavonoids has been reported in many other medicinal plants such as Astragalus membranaceus var. mongholicus, Apocynum venetum, and Eucommia ulmoides, and phenylpropanoid biosynthesis is the common core pathway for the synthesis of flavonoids [27][28][29]. The first step in flavonoid biosynthesis is regulated by enzymes (PAL, C4H, and 4CL) in the phenylpropanoid pathway. The substrate 4-coumaroyl-CoA is converted into chalcone by CHS in the first ratelimiting step of flavonoid biosynthesis [30]. Next, different flavonoid subgroups are synthesized through modification of the molecular backbone, which is controlled by flavonoid, flavone and flavonol biosynthesis enzymes such as HCT, CCoAMT, CHS, CHI, F3H, F3′,5′H, DFR, ANR, and ANS [29][30][31][32]. In this study, homologous unigenes and the expression levels of these genes were investigated in samples of different tissues from A. oxyphylla.
Interestingly, DEGs encoding CHS, CHI, F3H, FLS, ANS, DFR and ANR were highly expressed in the samples from fruits than the other two tissues, and DEGs encoding PAL, 4CL, HCT, CCoAMT, and F3'5'H were highly expressed in the samples from roots than the other two tissues. It is noteworthy that PAL and 4CL display high expression in roots, but the flavonoids are not concentrated in the root [17]. It is speculated that in the initial stages of flavonoid synthesis, phenylpropanoid biosynthesis pathway initiates synthesis of substrates in the root, part of which is converted into eriodictyol by HCT, CCoAMT, and F3'5'H, and the rest is transported to the fruit, where it is modified and processed by CHS, CHI, F3H, FLS, ANS, DFR, and ANR to form flavonoids, flavones, and flavonols (Fig. 5). Therefore, it reasonable to primarily utilize fruits of A. oxyphylla as components of traditional medicine, rather than the root as done in species such as A. officinarum. These results provide insights into the molecular processes of flavonoid biosynthesis in A. oxyphylla and offer a significant resource for the application of genetic engineering to develop varieties of A. oxyphylla with improved quality.

Conclusions
In this study, a total of 3110 DEGs and five distinct clusters with similar expression patterns were obtained, in which 27 unigenes encoded 13 key enzymes associated with flavonoid biosynthesis. In particular, 9 DEGs were significantly up-regulated in fruits, whereas expression of 11 DEGs were highly up-regulated in roots, compared with those in leaves. In summary, The DEGs and metabolic pathway related to flavonoids biosynthesis were identified in root, leaf, and different stages of fruits from A. oxyphylla. These results provide insights into the molecular mechanism of flavonoid biosynthesis in A. oxyphylla and application of genetically engineered varieties of A. oxyphylla.

Plant material
A. oxyphylla were collected from cultivated fields in Baisha County, Hainan Province, China (N.109.437569, E. 19.19680). The sample was identified by Kun Pan and deposited at the Key Laboratory of Tropical Translational Medicine of the Ministry of Education, Hainan Medical University, Haikou, Hainan, China. The specimen accession number was CHMU0123. The fruits were sampled at the following three developmental stages: early-fruit (15 days), middle-fruit (30 days) and late-fruit (45 days). Fresh A. oxyphylla fruits were obtained from the three plants simultaneously during each phase. Then, the materials of same phase were mixed for further experiments. After harvesting the fruit, the leaves and roots were obtained from the same plant. All the samples of A. oxyphylla were immediately frozen in liquid nitrogen and stored at − 80°C prior to processing.

RNA sequencing and De novo assembly
The total RNA was extracted from different plant tissues using the RNAprep Pure Plant Kit (Tiangen, Beijing, China) as per the standard protocol [33]. The RNA concentration and quantity were assessed using the Nanodrop 2000 spectrometer (Thermo Fisher Scientific, Wilmington, DE, USA) and Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA). A Stranded Total RNA Library Prep Kit (Illumina, Inc., San Diego, AR, USA) was used for cDNA library construction and normalization. The cDNA library was sequenced using Illumina HiSeq 4000 as per standard protocol. Raw reads were filtered by removing the adapter and low-quality sequences to produce highquality clean reads and the reads were assembled to generate unigene libraries. Trinity software (v.2.8.5, the Broad Institute, Cambridge, MA, USA) was used to assemble the clean data into unigenes according to a basic group quality score of more than Q30 [34].

Analysis of DEGs
Unigene expression level was calculated using the fragments per kilobase of transcript per million mapped (FPKM) method. The DEGs were screened using the edgeR package with the threshold set as described previously [35]. GO and KEGG enrichment analysis of the identified DEGs was performed using the GOAtools version 0.5.9 (https://github.com/tanghaibao/Goatools) and KOBAS version 2.0.12 with default settings, respectively. The corrected p-value for identifying significant differences in expression was calculated and adjusted by the hypergeometric Fisher exact test. GO terms with a corrected p-value≤0.05 were considered to be significantly enriched. Next, we employed the same method for KEGG pathway functional enrichment analysis of DEGs.
Additional file 2: Supplementary Table 1. Expression level of candidate A. oxyphylla unigenes coding for key enzymes involved in flavonoid biosynthesis pathways.