
Transcriptome-based analysis of key functional genes in the triterpenoid saponin synthesis pathway of Platycodon grandiflorum

Abstract

Background

Platycodon grandiflorum (P. grandiflorum) is a commonly used medicinal plant in China. Transcriptome sequencing of different tissues of P. grandiflorum has been widely conducted. However, few studies have examined the transcriptome and the expression patterns of key genes in the saponin synthesis pathway in Tongcheng P. grandiflorum, a high-quality medicinal resource, at different growth years.

Results

This study involved transcriptome sequencing and bioinformatics analysis of the roots of annual, biennial, and triennial P. grandiflorum from the Tongcheng area. After data filtering and assembly, we obtained 111.44 Gb of clean base data, comprising 742,880,616 clean reads. We identified 5,156 differentially expressed unigenes between at least two sample groups, with differences noted among annual, biennial, and triennial P. grandiflorum plants. GO enrichment analysis annotated 3,509, 1,819, and 1,393 DEGs in the comparisons TC1vsTC2, TC1vsTC3, and TC2vsTC3, respectively. Furthermore, KEGG enrichment analysis identified 16 genes encoding key enzymes in the terpenoid backbone biosynthesis pathway and the sesquiterpenoid and triterpenoid biosynthesis pathway, including SE, AACT, FPPS, DXR, HMGR, HMGS, and DXS. qRT-PCR showed that most of these genes were most highly expressed in annual P. grandiflorum.

Conclusions

The present study provides transcriptomic data from the roots of Tongcheng P. grandiflorum of different growth years. These data offer critical bioinformatic information on the growth and development of P. grandiflorum and lay a foundation for further research on saponins and for identifying the key enzymes involved in their synthesis across growth stages.


Background

The dried root of Platycodon grandiflorum is one of China’s bulk Chinese medicinal herbs [1]. The plant has an extended flowering period with blue, purple, and white flowers, giving it high ornamental value [2]. Medicinally, the root is neutral in nature, bitter and pungent in taste, and is associated with the lung meridian. It is traditionally used to promote lung health, relieve sore throats, expel phlegm, and drain pus, making it effective for treating cough, chest tightness, sore throat, hoarseness, and lung abscesses. Modern pharmacological studies have shown that P. grandiflorum has various effects, including anti-inflammatory, hepatoprotective, anti-tumor, lipid-lowering, and antioxidant activities [3,4,5]. It is used to treat respiratory diseases such as asthma, bronchitis, and tuberculosis. Additionally, extracts of P. grandiflorum are used in cosmetics for their anti-aging and skin-whitening effects [6]. P. grandiflorum is both a medicinal and a food plant, often pickled and consumed as kimchi in Northeast China and North Korea. Its applications span medicine, food, cosmetics, and ornamental use, highlighting its high research value and development prospects.

Currently, over 100 secondary metabolites have been identified in P. grandiflorum, including saponins, flavonoids, polysaccharides, phenolic acids, and fatty acids [7, 8]. The primary pharmacologically active components are oleanane-type triterpenoid saponins, such as platycoside D, platycoside E, deapioplatycoside D, and polygalacin D. Among these, platycoside D is abundant and exhibits high pharmacological activity, serving as a standard substance for evaluating the quality of P. grandiflorum [9]. Studies have shown that platycoside D has a variety of pharmacological activities, including anticancer, anti-inflammatory, anti-obesity, anti-atherosclerosis, and anti-thrombosis effects [10,11,12].

Triterpenoid saponins, consisting of an aglycone and sugars, are widely distributed in nature. The aglycones are triterpenoids, with a basic skeleton comprising six isoprene units. These saponins can be divided into tetracyclic and pentacyclic triterpenoids, with P. grandiflorum containing mainly pentacyclic triterpenoid-type saponins [13, 14]. Triterpenoids have important physiological and ecological functions, such as enhancing plant resistance to stress, and exhibit immunity-boosting, anti-inflammatory, and anticancer effects [15]. The synthesis pathway of triterpenoids in plants can be divided into three parts. First, the mevalonate (MVA) pathway in the cytoplasm and the methylerythritol-4-phosphate (MEP) pathway in the plastid independently synthesize isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) [16]. Second, in the IPP polymerisation pathway, IPP and DMAPP are sequentially condensed by geranyl pyrophosphate synthase (GPPS) and farnesyl pyrophosphate synthase (FPPS) to produce farnesyl pyrophosphate, which is then converted to the key precursor 2,3-oxidosqualene by squalene synthase (SS) and squalene epoxidase (SE). Finally, various triterpene saponins with different backbones are synthesized from 2,3-oxidosqualene through the action of enzymes such as oxidosqualene cyclases (OSCs), cytochrome P450 monooxygenases (CYP450s), and uridine diphosphate glycosyltransferases (UGTs) [17]. This final stage is a significant source of the diversity of triterpene saponins found in nature.

The first step in the formation of the triterpenoid saponin carbon skeleton is the cyclization of 2,3-oxidosqualene, catalyzed by various OSCs. Key members of the OSC family identified so far include α-amyrin synthase (α-AS), β-amyrin synthase (β-AS), lupeol synthase (LS), and dammarenediol synthase. The resulting skeletons are then modified by CYP450s and UGTs to form triterpene saponins of the ursane, oleanane, lupane, and dammarane types [18]. The CYP450 and UGT family proteins are crucial for the formation of diverse triterpenoid saponins. Several CYP450 enzymes involved in modifying the triterpenoid carbon skeleton have been identified. Notably, the CYP716 family of proteins in P. grandiflorum catalyzes the oxidation of β-amyrin to synthesise oleanane-type saponins [19].

P. grandiflorum is a perennial plant found across northern and southern China. The varieties produced in Taihe (Anhui), Zibo (Shandong), and Chifeng (Inner Mongolia) are often processed and consumed as pickles. The P. grandiflorum from Tongcheng in Anhui Province is known for its bitterness and high quality and is commonly referred to as ‘Tong P. grandiflorum’. The saponin content of P. grandiflorum varies significantly with the plant’s age. Annual roots are smaller, and their medicinal yield and quality are noticeably lower than those of biennial and triennial roots [20]. Therefore, studying P. grandiflorum at different growth stages is crucial for improving medicinal yield and optimizing the use of high-quality resources for medicinal purposes.

In recent years, transcriptome sequencing technology has advanced rapidly, offering high throughput, high resolution, high sensitivity, and the ability to perform genome-wide analysis on any species. It is widely used to study gene expression levels and key functional genes in medicinal plants. Transcriptome sequencing has been applied to various medicinal plants, such as Dendrobium nobile [21], Salvia miltiorrhiza [22], Codonopsis pilosula [23], and Zanthoxylum bungeanum [24], providing new scientific insights into the synthesis and accumulation of medicinal components and the cultivation of high-quality medicinal herb varieties.

In this study, we used P. grandiflorum from Tongcheng, Anhui Province, China, as the experimental material. High-throughput sequencing technology was employed to sequence the root transcriptomes of annual, biennial, and triennial P. grandiflorum. We analyzed the differentially expressed genes and related pathways among plants of different ages and identified the genes encoding key enzymes in the triterpenoid saponin synthesis pathway. This research provides a bioinformatics foundation for the development of high-quality medicinal P. grandiflorum varieties and the study of triterpenoid saponin biosynthesis in P. grandiflorum.

Results

Saponin content analysis in P. grandiflorum

The total saponin content in the roots, stems, and leaves of 1–3-year-old Tongcheng P. grandiflorum was measured by UV spectrophotometry. The roots of biennial P. grandiflorum had the highest total saponin content. Saponin content varied significantly across tissues: in all three years, roots had the highest total saponin content, followed by stems, with the lowest content found in leaves. Analysis of 8 saponin monomers showed that deapioplatycoside E, platycoside E, deapioplatycoside D3, and deapioplatycoside D had the highest content in annual P. grandiflorum. Conversely, platycoside D3, deapioplatycoside D, platycoside D2, platycoside D, and polygalacin D had the highest content in biennial P. grandiflorum (Fig. 1).

Fig. 1

Saponin content of P. grandiflorum from different years. (a total saponin content, horizontal coordinates indicate different tissues of P. grandiflorum from different years, vertical coordinates indicate saponin content; b content of eight saponin monomers. Horizontal coordinates indicate different years of P. grandiflorum, vertical coordinates indicate saponin content)

Transcriptome sequencing and sequence splicing

By sequencing the transcriptomes of P. grandiflorum samples from different years, we obtained a total of 742,880,616 clean reads, containing 111.44 Gb of clean bases. The Q20 base percentage ranged from 98.69 to 99.03%, the Q30 base percentage ranged from 94.69 to 95.61%, and the GC content ranged from 44.5 to 45% (Table S1, Fig. 2). These metrics indicate high-quality sequencing data. The high-quality clean reads of each sample were aligned to the reference genome, resulting in a mapping rate of 84.76% for each sample, demonstrating that the sequencing data were well-assembled and spliced.
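For readers less familiar with the Q20 and Q30 metrics: a Phred score Q corresponds to a base-calling error probability of 10^(-Q/10), so Q20 means an error rate of at most 1% and Q30 of at most 0.1%. The sketch below shows how such percentages could be computed from FASTQ quality strings. It assumes Phred+33 encoding; the function names and the toy quality string are illustrative and are not taken from the authors' pipeline.

```python
def phred_to_error_prob(q: int) -> float:
    """Convert a Phred quality score to a base-calling error probability."""
    return 10 ** (-q / 10)


def fraction_at_least(quality_string: str, threshold: int, offset: int = 33) -> float:
    """Fraction of bases whose Phred score is >= threshold (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


# Toy quality string (hypothetical): 'I' = Q40, '5' = Q20, '+' = Q10
toy_quality = "IIII5+"
q20_fraction = fraction_at_least(toy_quality, 20)  # 5 of 6 bases
q30_fraction = fraction_at_least(toy_quality, 30)  # 4 of 6 bases
```

Applied to every read in a sample, the two fractions correspond to the Q20 and Q30 percentages reported in Table S1.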

Fig. 2

Transcriptome sequencing data for each sample (a raw data; b valid data)

Differentially expressed genes (DEGs) analysis

The transcripts of Tongcheng P. grandiflorum from three different years were analyzed using DESeq2. Differentially expressed genes were identified using q < 0.05 and |log2FC| ≥ 1 as the screening criteria. A total of 5156 unigenes showed differential expression between at least two sample groups. Specifically, 3891 DEGs were identified between TC1 and TC2, of which 2626 were up-regulated and 1265 were down-regulated; 1980 DEGs were identified between TC1 and TC3, of which 1502 were up-regulated and 478 were down-regulated; and 1542 DEGs were identified between TC2 and TC3, of which 713 were up-regulated and 829 were down-regulated (Fig. 3).
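The screening criteria above can be expressed as a simple filter over a differential expression results table. The following is a minimal sketch, not the authors' code: the row layout (gene ID, log2 fold change, q-value) and the toy values are hypothetical, and "up" simply means a positive log2 fold change, since the section does not state which sample is the numerator in each comparison.

```python
import math

Q_CUTOFF = 0.05       # from the text: q < 0.05
LOG2FC_CUTOFF = 1.0   # from the text: |log2FC| >= 1


def classify_gene(log2fc: float, qvalue: float) -> str:
    """Return 'up', 'down', or 'ns' (not significant) for one gene."""
    if math.isnan(qvalue) or qvalue >= Q_CUTOFF or abs(log2fc) < LOG2FC_CUTOFF:
        return "ns"
    return "up" if log2fc > 0 else "down"


def count_degs(results):
    """Tally up/down calls over (gene_id, log2fc, qvalue) rows."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for _gene_id, log2fc, qvalue in results:
        counts[classify_gene(log2fc, qvalue)] += 1
    return counts


# Hypothetical rows for illustration only (not data from the study)
toy_results = [
    ("gene_a", 2.3, 0.001),
    ("gene_b", -1.5, 0.020),
    ("gene_c", 0.4, 0.001),   # fold change too small
    ("gene_d", 3.0, 0.200),   # q-value too large
]
toy_counts = count_degs(toy_results)
```

Genes with a missing q-value (which DESeq2 can report as NA) are treated here as not significant; that handling is a design choice of this sketch.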

Fig. 3

Statistics on the number of differentially expressed genes in Tongcheng P. grandiflorum in different years. (a Histogram of DEGs in different control groups; b Venn diagram analysis of DEGs)

In the clustering heat map of differentially expressed genes, genes that were upregulated in annual P. grandiflorum (TC1) were predominantly downregulated in biennial and triennial P. grandiflorum (TC2 and TC3), whereas genes that were downregulated in TC1 were more frequently upregulated in TC2 and TC3. The TC1vsTC2 group had the highest number of DEGs, with significantly more upregulated than downregulated genes, while the TC2vsTC3 group had the lowest number of DEGs, with slightly more downregulated than upregulated genes (Fig. 4). These findings suggest that annual and biennial P. grandiflorum have relatively vigorous biological activity, whereas that of triennial P. grandiflorum is weaker. Among all the DEGs, 69 were shared by all three comparison groups, suggesting that these genes may be critical to growth and development.
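The Venn analysis behind these counts reduces to set operations on the DEG lists of the three comparisons: the union gives the genes that are differentially expressed in at least one comparison, and the intersection gives the genes shared by all three. A minimal sketch with hypothetical gene IDs (the real lists are not reproduced in this section):

```python
def venn_summary(deg_sets):
    """Given {comparison_name: set_of_gene_ids}, return (union, common)."""
    all_sets = list(deg_sets.values())
    union = set().union(*all_sets)           # DEG in at least one comparison
    common = set.intersection(*all_sets)     # DEG in every comparison
    return union, common


# Hypothetical gene IDs for illustration only
toy_degs = {
    "TC1vsTC2": {"g1", "g2", "g3", "g4"},
    "TC1vsTC3": {"g2", "g3", "g5"},
    "TC2vsTC3": {"g3", "g6"},
}
toy_union, toy_common = venn_summary(toy_degs)
```

With the study's DEG lists as input, the sizes of the two returned sets would correspond to the 5156 unigenes and the 69 shared DEGs reported above.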

Fig. 4

Heat map of clustering of differentially expressed genes in P. grandiflorum of different ages

GO functional annotation and enrichment analysis

Gene ontology (GO) is an internationally standardized gene function classification system that provides a dynamically updated standard vocabulary to comprehensively describe the attributes of genes and gene products in organisms. GO comprises three ontologies, describing molecular function (MF), cellular component (CC), and biological process (BP), respectively. To explore the differentially expressed genes in P. grandiflorum of different growth years in more depth, GO enrichment analysis was performed on the three groups of DEGs (TC1vsTC2, TC1vsTC3, and TC2vsTC3), and the top 25, 15, and 10 GO entries for biological process, cellular component, and molecular function, respectively, were selected according to the number of DEGs annotated. A total of 3509 DEGs were annotated in TC1vsTC2, 1819 in TC1vsTC3, and 1393 in TC2vsTC3. The GO enrichment results for the three comparison groups were similar: the most annotated genes were related to “biological process” in the biological process group, “nucleus” in the cellular component group, and “molecular function” in the molecular function group. The most abundant DEGs in each ontology were all genes that play important roles in the growth and development of P. grandiflorum (Fig. 5, Tables S2, S3, S4).
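The selection of top GO entries per ontology can be sketched as a group-and-sort step. The cutoffs (25, 15, 10) come from the text; the input layout and the toy DEG counts are hypothetical, and the term names are used only as labels.

```python
TOP_N = {"BP": 25, "CC": 15, "MF": 10}  # cutoffs stated in the text


def top_go_terms(term_counts, top_n=TOP_N):
    """term_counts: iterable of (ontology, term_name, n_degs) tuples.

    Returns {ontology: [(term_name, n_degs), ...]} sorted by n_degs
    in descending order and truncated to the per-ontology cutoff.
    """
    by_ontology = {}
    for ontology, term, n_degs in term_counts:
        by_ontology.setdefault(ontology, []).append((term, n_degs))
    return {
        ont: sorted(terms, key=lambda t: t[1], reverse=True)[: top_n.get(ont, 0)]
        for ont, terms in by_ontology.items()
    }


# Hypothetical counts for illustration only
toy_terms = [
    ("BP", "biological process", 120),
    ("BP", "regulation of transcription", 80),
    ("CC", "nucleus", 95),
    ("CC", "cytoplasm", 60),
    ("MF", "molecular function", 110),
    ("MF", "protein binding", 70),
]
toy_top = top_go_terms(toy_terms, top_n={"BP": 1, "CC": 1, "MF": 1})
```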

Fig. 5

GO analysis of differentially expressed genes in P. grandiflorum (a TC1vsTC2 differentially expressed gene analysis; b TC1vsTC3 differentially expressed gene analysis; c TC2vsTC3 differentially expressed gene analysis)

KEGG annotation and enrichment of differentially expressed genes

To further investigate the biological functions of the differentially expressed genes (DEGs), KEGG enrichment analysis was conducted on each comparison group. The results showed that 1302, 693, and 470 DEGs were annotated to 130, 122, and 115 pathways in TC1vsTC2, TC1vsTC3, and TC2vsTC3, respectively. These pathways fell into six KEGG primary categories: Metabolism, Human Diseases, Cellular Processes, Environmental Information Processing, Genetic Information Processing, and Organismal Systems.
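This section does not state which statistical test underlies the KEGG enrichment. A common choice is the one-sided hypergeometric test, which asks how likely it is to see at least the observed number of DEGs in a pathway by chance. The sketch below implements that test under this assumption; the function name and the toy numbers are illustrative only.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 background genes, 5 in the pathway, 4 DEGs, 2 of them in the pathway
toy_p = hypergeom_enrichment_p(20, 5, 4, 2)
```

In practice the resulting P-values are usually corrected for multiple testing across pathways before ranking.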

Within each of the 5 categories of the KEGG pathway level classification, the top 6 pathways with the highest enrichment of DEGs were selected for analysis; if fewer than 5 pathways were significantly enriched, all of them were included. In the TC1vsTC2 comparison group, the plant hormone signal transduction pathway in the environmental information processing category had the most annotated DEGs (100, or 7.7%), followed by the plant-pathogen interaction pathway in the organismal systems category and the starch and sucrose metabolism and phenylpropanoid biosynthesis pathways in the metabolism category. In TC1vsTC3, the plant hormone signal transduction pathway again had the most annotated DEGs (57, or 8.2%), followed by the starch and sucrose metabolism and phenylpropanoid biosynthesis pathways in the metabolism category and the protein processing in endoplasmic reticulum pathway in the genetic information processing category. In TC2vsTC3, 40 DEGs (8.5%, the highest proportion) were annotated to the plant hormone signal transduction pathway, followed by the ribosome pathway in the genetic information processing category and the MAPK signaling pathway-plant in the environmental information processing category. Significant enrichment of DEGs in the plant hormone signal transduction pathway was thus observed across all three comparison groups. This suggests substantial differences in genes related to plant hormone signal transduction in P. grandiflorum across different years, underscoring their potential importance in the plant’s growth and development (Fig. 6, Tables S5, S6, S7).
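The percentages quoted for the plant hormone signal transduction pathway follow from dividing the pathway's DEG count by the number of KEGG-annotated DEGs in the same comparison group. A short check using only figures given in the text (the variable names are ours):

```python
# KEGG-annotated DEGs per comparison group (from the text)
annotated = {"TC1vsTC2": 1302, "TC1vsTC3": 693, "TC2vsTC3": 470}
# DEGs annotated to plant hormone signal transduction (from the text)
hormone_pathway = {"TC1vsTC2": 100, "TC1vsTC3": 57, "TC2vsTC3": 40}

share_percent = {
    group: round(100 * hormone_pathway[group] / annotated[group], 1)
    for group in annotated
}
# share_percent == {"TC1vsTC2": 7.7, "TC1vsTC3": 8.2, "TC2vsTC3": 8.5}
```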

Fig. 6

KEGG analysis of differentially expressed genes in P. grandiflorum (a TC1vsTC2 DEGs analysis; b TC1vsTC3 DEGs analysis; c TC2vsTC3 DEGs analysis)

The KEGG enrichment results were then screened, focusing on the 20 pathways with the smallest P-values. In all three comparison groups, the category with the most annotated pathways was metabolism, with 14 metabolism pathways in TC1vsTC2, 15 in TC1vsTC3, and 11 in TC2vsTC3. In the TC1vsTC2 comparison group, up-regulated DEGs were mainly enriched in the plant hormone signal transduction and phenylpropanoid biosynthesis pathways, while down-regulated DEGs were mainly enriched in the protein processing in endoplasmic reticulum and starch and sucrose metabolism pathways. In the TC1vsTC3 comparison group, up-regulated DEGs were mainly enriched in the phenylpropanoid biosynthesis and starch and sucrose metabolism pathways, while down-regulated DEGs were mainly enriched in the protein processing in endoplasmic reticulum pathway. In TC2vsTC3, up-regulated DEGs were mainly enriched in the plant hormone signal transduction and starch and sucrose metabolism pathways, while down-regulated DEGs were mainly enriched in the ribosome biogenesis in eukaryotes, MAPK signaling pathway-plant, and plant hormone signal transduction pathways (Fig. 7). In addition, DEGs in the three comparison groups were enriched in four pathways related to terpenoid synthesis: terpenoid backbone biosynthesis, sesquiterpenoid and triterpenoid biosynthesis, diterpenoid biosynthesis, and monoterpenoid biosynthesis, suggesting that terpenoid synthesis-related genes play an important regulatory role in P. grandiflorum of different years. Since triterpenoids are the main active components in P. grandiflorum, the following analysis focuses on these metabolic pathways and the related genes associated with terpenoid synthesis.

Fig. 7

Distribution of genes up-and down-regulated in the top 20 pathways with the smallest P-value for KEGG enrichment analysis (a TC1vsTC2 DEGs analysis; b TC1vsTC3 DEGs analysis; c TC2vsTC3 DEGs analysis)

DEGs analysis related to the biosynthesis of triterpenoids

Terpenoids are the main active ingredients in P. grandiflorum, and its saponins are mainly oleanane-type pentacyclic triterpenoids. The two main pathways for terpenoid synthesis are the mevalonic acid (MVA) pathway and the methylerythritol phosphate (MEP) pathway, and a variety of key enzymes are involved in this process (Table S8). Below, we analyze and summarise the genes encoding key enzymes involved in triterpenoid synthesis in P. grandiflorum based on the KEGG enrichment results.

Among the KEGG metabolic pathways, the two main pathways associated with terpenoid synthesis are terpenoid backbone biosynthesis and sesquiterpenoid and triterpenoid biosynthesis (Fig. 8). In TC1vsTC2, 9 DEGs were annotated to the sesquiterpenoid and triterpenoid biosynthesis pathway, among which 3 SQE genes and 4 β-AS genes encode key enzymes in triterpenoid synthesis; 12 DEGs were annotated to the terpenoid backbone biosynthesis pathway, among which 2 AACT genes, 1 FPPS gene, 1 DXR gene, 1 GPPS gene, and 1 HMGR gene encode key enzymes in terpenoid synthesis. In TC1vsTC3, seven DEGs were annotated to the terpenoid backbone biosynthesis pathway, including 1 HMGS gene, 1 IPI gene, and 2 HMGR genes encoding key enzymes in triterpenoid synthesis, while 5 DEGs were annotated to the sesquiterpenoid and triterpenoid biosynthesis pathway, including 4 β-AS genes encoding key enzymes in triterpenoid synthesis. In TC2vsTC3, 2 DEGs were annotated to the sesquiterpenoid and triterpenoid biosynthesis pathway, of which 1 SQE gene encodes a key enzyme in triterpenoid biosynthesis; 6 DEGs were annotated to the terpenoid backbone biosynthesis pathway, of which 1 DXS gene and 1 IPI gene encode key enzymes in terpenoid synthesis.

Fig. 8

Triterpenoid saponin biosynthetic pathways

Among the three comparison groups, DEGs related to terpenoid biosynthesis were mainly concentrated in the TC1vsTC2 group, followed by the TC1vsTC3 group, with the fewest in the TC2vsTC3 group, which shows that annual P. grandiflorum differed from biennial and triennial P. grandiflorum at the gene expression level. Of the two pathways associated with terpenoid synthesis, the DEGs were mainly concentrated in the terpenoid backbone biosynthesis pathway. Across the three comparison groups, the DEGs related to terpenoid synthesis comprised four β-AS genes, two SQE genes, two HMGR genes, two AACT genes, one FPPS gene, one DXR gene, one HMGS gene, one IPI gene, one GPPS gene, and one DXS gene, for a total of 16 genes encoding key enzymes in the terpenoid synthesis pathway (AACT1, AACT2, DXR, DXS, FPPS, GPPS, HMGR1, HMGR2, HMGS, IPI, SQE1, SQE2, β-AS1, β-AS2, β-AS3, β-AS4). Based on these findings, we hypothesize that these genes play a crucial role in regulating terpenoid synthesis in P. grandiflorum at different growth stages.
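The per-family tally can be reproduced directly from the 16 gene names listed above. This small sketch groups the names by family after stripping the trailing copy number; the helper name `family_of` is ours.

```python
import re
from collections import Counter

# The 16 key enzyme genes named in the text
key_genes = [
    "AACT1", "AACT2", "DXR", "DXS", "FPPS", "GPPS", "HMGR1", "HMGR2",
    "HMGS", "IPI", "SQE1", "SQE2", "β-AS1", "β-AS2", "β-AS3", "β-AS4",
]


def family_of(gene_name: str) -> str:
    """Strip a trailing copy number, e.g. 'HMGR2' -> 'HMGR'."""
    return re.sub(r"\d+$", "", gene_name)


family_counts = Counter(family_of(g) for g in key_genes)
total_genes = sum(family_counts.values())
```

The resulting counts (four β-AS; two each of SQE, HMGR, and AACT; one each of the remaining six families) sum to 16, matching the total stated in the text.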

Identification of transcription factors

Transcription factors (TFs) are proteins with characteristic structures that regulate gene expression and play crucial roles in secondary metabolic processes in plants. In this study, TFs were annotated against the PlantTFDB database (https://planttfdb.gao-lab.org/index.php), and their distribution was analyzed. There were 3891 significantly different transcription factors identified in the TC1vsTC2 group, including 2626 up-regulated and 1265 down-regulated. A total of 1980 significantly different transcription factors were identified in TC1vsTC3, including 1502 up-regulated and 478 down-regulated. A total of 1542 significantly different transcription factors were identified in TC2vsTC3, including 713 up-regulated and 829 down-regulated. Multiple transcription factor families were represented across the three groups. Among them, bHLH, MYB, NAC, and WRKY accounted for a relatively high percentage and may have important regulatory roles in the growth and development of P. grandiflorum roots and in the synthesis of secondary metabolites at different growth years (Table 1, S12, S13, S14).

Table 1 Classification and number of differential transcription factors in roots of P. grandiflorum from different years

qRT-PCR validation

qRT-PCR was used to validate 16 differentially expressed genes involved in the triterpenoid synthesis pathway, with β-actin as the internal reference gene [25]. AACT2, HMGR1, HMGR2, DXR, FPPS, SQE2, β-AS2, β-AS3, β-AS4, GPPS, HMGS, and IPI were most highly expressed in annual P. grandiflorum, AACT1 and SQE1 were most highly expressed in biennial P. grandiflorum, and β-AS1 and DXS were most highly expressed in triennial P. grandiflorum. The qRT-PCR results were highly consistent with the RNA-seq data, affirming the reliability of the transcriptome sequencing results (Fig. 9).
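This section does not say how relative expression was calculated from the qRT-PCR data. A widely used approach is the 2^-ΔΔCt method, in which the target gene's Ct is first normalized to the reference gene (β-actin here) and then compared with a calibrator sample. The sketch below illustrates that calculation under this assumption; the Ct values are hypothetical and the choice of calibrator is ours.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    ct_target, ct_reference: Ct values in the sample of interest
    ct_target_cal, ct_reference_cal: Ct values in the calibrator sample
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    return 2 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values: target gene vs. reference gene in a sample
# of interest, with another sample used as the calibrator
toy_fold_change = relative_expression(
    ct_target=22.0, ct_reference=18.0,
    ct_target_cal=24.0, ct_reference_cal=18.0,
)
```

A sample compared with itself gives a fold change of 1, and each cycle by which the normalized Ct drops doubles the estimated expression, assuming roughly 100% amplification efficiency.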

Fig. 9

qRT-PCR validation of 16 DEGs. The horizontal coordinates indicate different years of P. grandiflorum samples, bar graphs indicate relative expression of genes determined by qRT-PCR, and line graphs indicate FPKM values of genes in RNA-seq

Discussion

P. grandiflorum, a perennial medicinal plant renowned for its diverse pharmacological activities, exhibits significant variation in saponin content across different growth years [26], leading to varying clinical efficacy. In this study, we analyzed the total saponin content and the levels of eight saponin monomers in Tongcheng P. grandiflorum over three growth years. Our findings revealed that the total saponin content in the roots was significantly higher than that in the stems and leaves. Specifically, biennial plants exhibited the highest total saponin content, while the total saponin content in the roots of annual and triennial plants was lower. Analysis of the 8 saponin monomers showed that the content of various saponin monomers was lowest in triennial P. grandiflorum and that the content of platycoside D was highest in biennial P. grandiflorum. We speculate that saponin accumulation is insufficient in annual plants because of their short growth period, and that the total saponin content is low in triennial plants because of a low rate of saponin synthesis. On this basis, biennial P. grandiflorum is recommended for use because of its superior saponin content and potential clinical efficacy.

In recent years, transcriptome sequencing technology has advanced rapidly, leading to an increasing number of studies on P. grandiflorum. Kim et al. conducted transcriptome sequencing across eight different tissues of P. grandiflorum, including root, stem, leaf, petal, sepal, stamen, pistil, and seed [27]. However, fewer studies have addressed the transcriptome of P. grandiflorum at different growth years or the relationship between saponin content and the expression of genes encoding key enzymes in the terpenoid synthesis pathway. Tongcheng, located in Anhui Province, is known for producing high-quality P. grandiflorum, particularly the medicinal bitter variety. In this study, we performed transcriptome sequencing on the roots of Tongcheng P. grandiflorum of three different growth years to analyze the relationship between the expression of genes encoding key enzymes in the triterpenoid synthesis pathway and saponin content, laying the foundation for further research on the genes encoding key enzymes for triterpenoid synthesis in P. grandiflorum.

Analysis of the transcriptome sequencing data revealed differences in gene expression levels between annual P. grandiflorum and biennial and triennial P. grandiflorum. KEGG enrichment analysis showed that DEGs among P. grandiflorum of different growth years were enriched in the terpenoid backbone biosynthesis pathway and in the sesquiterpenoid and triterpenoid biosynthesis pathway, both of which are associated with triterpenoid synthesis. Key functional genes involved in terpenoid biosynthesis were identified in these pathways, including SQE, IPI, β-AS, GPPS, HMGR, AACT, FPPS, DXR, HMGS, and DXS. AACT is the first key enzyme in the MVA pathway and catalyzes the synthesis of acetoacetyl-coenzyme A from acetyl-coenzyme A. Subsequently, HMGS catalyzes the condensation of acetyl-coenzyme A and acetoacetyl-coenzyme A to form 3-hydroxy-3-methylglutaryl-coenzyme A, which is then converted into MVA by HMGR. Among these enzymes, HMGR is the key rate-limiting enzyme in the mevalonate pathway [28]. In addition, DXS and DXR are pivotal enzymes catalyzing the initial reactions in the MEP pathway, and SQE, IPI, β-AS, GPPS, and FPPS are key enzymes in the downstream steps of terpenoid biosynthesis.

Numerous studies on genes encoding key enzymes in the terpenoid synthesis pathway have highlighted their regulatory roles in terpenoid production in plants [29]. The CiDXR gene in Chrysanthemum indicum regulates terpenoid synthesis [30]. Overexpression of the PtHMGR gene in Populus trichocarpa significantly increases terpenoid content [31]. These studies collectively demonstrate the pivotal regulatory function of genes encoding key enzymes in the terpenoid synthesis pathway across different plant species.

We conducted qRT-PCR to validate the expression of the 16 genes encoding key enzymes identified in this study. Most of these genes had their highest expression levels in annual plants, in agreement with the transcriptome sequencing findings. This suggests that annual P. grandiflorum has the fastest rate of saponin synthesis and that the rate may decrease as the growth years increase. Furthermore, our measurements showed that the total saponin content of triennial P. grandiflorum was the lowest among the different growth years, which may be related to the low expression of genes encoding key enzymes in triennial plants. We therefore infer that the 16 DEGs detected in the terpenoid synthesis pathway are involved in regulating saponin biosynthesis in P. grandiflorum. Studying differentially expressed genes in P. grandiflorum roots across different growth years provides insights into metabolite accumulation patterns during different periods of growth and development, a theoretical basis for planting and harvesting, and a bioinformatics basis for developing high-quality medicinal P. grandiflorum varieties and for studying triterpenoid saponin biosynthesis in P. grandiflorum.

Transcription factors (TFs) are pivotal proteins that bind to DNA regulatory sequences such as enhancers and silencers, modulating the rate of gene transcription and thereby influencing cellular function [32]. In this study, a total of 7413 significantly different transcription factors were identified in P. grandiflorum of three different growth years.

Notably, the bHLH and MYB families comprised a substantial percentage of these TFs. The bHLH family of transcription factors is renowned for regulating plant growth and development, response to stress, and synthesis of secondary metabolites [33, 34]. Studies on Medicago truncatula have shown that bHLH TFs like TSAR1 and TSAR2 are involved in regulating triterpenoid saponin biosynthesis [35]. PnMYB4 and PnMYB1 in Panax notoginseng are involved in the regulation of saponin biosynthesis together with PnbHLH [36]. In this study, a total of 54 significantly different bHLH transcription factors were identified in roots of P. grandiflorum from three different years, and it is predicted that these differential transcription factors may also be involved in the regulation of triterpenoid saponin synthesis in P. grandiflorum. The MYB transcription factor family is widely distributed in eukaryotes, and the MYB transcription factor family is involved in the regulation of a number of processes in plants, including secondary metabolic pathways, phytohormone signalling pathways and plant growth and developmental processes [37]. It has been shown that the SIMYB75 transcription factor in Tomato is involved in the regulation of terpenoid accumulation processes [38]. The MYB family of transcription factors in tea plants is involved in the regulation of processes that regulate growth and development, biosynthesis of secondary metabolites, and environmental stress responses in tea tree, and the CsMYB68, CsMYB147, CsMYB148 and CsMYB193 transcription factors are involved in the regulation of terpenoid synthesis in tea plants [39]. A total of 51 significantly different MYB transcription factors were identified in this study, and it was inferred that these transcription factors may have important regulatory roles in the synthesis of terpenoids in P. grandiflorum roots. In addition, some NAC transcription factors and WRKY transcription factors were also identified in this study. 
NAC transcription factors in tomato have been reported to play important regulatory roles in growth and development, fruit ripening, and abiotic stress responses [40]. WRKY transcription factors in apple play an important role in regulating drought stress tolerance [41]. We hypothesize that NAC and WRKY TFs may play crucial roles in the growth, development, and terpenoid saponin synthesis of P. grandiflorum. The specific mechanisms by which the differentially expressed transcription factors identified in our study, including bHLH, MYB, NAC, WRKY, and others, regulate the growth, development, and terpenoid saponin synthesis of P. grandiflorum require further investigation.

In this study, transcriptome sequencing was performed on the roots of Tongcheng P. grandiflorum from different growth years, and a variety of genes encoding key enzymes in the triterpene biosynthesis pathways, together with significantly differentially expressed transcription factors, were identified. However, fully revealing the regulatory mechanism of triterpenoid saponin synthesis in P. grandiflorum will require more in-depth functional studies of the genes encoding key enzymes, as well as integrated analyses of multi-omics data.

Conclusions

In this study, transcriptome sequencing of Tongcheng P. grandiflorum from different growth years yielded a total of 742,880,616 clean reads, containing 111.44 Gb of valid data, demonstrating high sequencing data quality. GO and KEGG enrichment analyses revealed substantial differences in gene expression levels between annual P. grandiflorum and biennial and triennial P. grandiflorum. A total of 16 genes encoding key enzymes involved in the terpenoid synthesis pathway (AACT1, AACT2, DXR, DXS, FPPS, GPPS, HMGR1, HMGR2, HMGS, IPI, SQE1, SQE2, β-AS1, β-AS2, β-AS3, and β-AS4) were identified across the three growth years of P. grandiflorum. These findings underscore the significance of these genes in regulating terpenoid biosynthesis in Tongcheng P. grandiflorum and highlight their importance for future studies on terpenoid synthesis in P. grandiflorum.

Materials and methods

Experimental materials

The materials for this experiment were collected from Shucheng Hongsheng Agricultural and Forestry Herb Planting Base, Shucheng County, Lu’an City, Anhui Province, China (Fig. 10). Six samples each of annual P. grandiflorum (TC1_1 to TC1_6), biennial P. grandiflorum (TC2_1 to TC2_6), and triennial P. grandiflorum (TC3_1 to TC3_6) were collected, totaling 18 samples (Table S9). All samples were confirmed to be P. grandiflorum. The roots were rinsed with PBS and dried with absorbent paper, and approximately 3 g of fresh roots were taken from each group. These samples were placed into labeled centrifuge tubes, quickly frozen in liquid nitrogen for 15 min, and then stored at -80 ℃.

Fig. 10
Diagram of roots of P. grandiflorum in different years

Methods

Determination of total saponin content

The total saponin content of P. grandiflorum from different growth years was measured by UV spectrophotometry using polygalacin D as the standard. First, 9.47 mg of polygalacin D standard was weighed, dissolved in methanol in a 10 mL volumetric flask, and shaken well to obtain the reference solution. Next, 0.50 g of P. grandiflorum powder was precisely weighed, and 70% methanol was added at a ratio of 20:1 to dissolve the powder. The solution was then sonicated for 0.5 h, evaporated to dryness, and redissolved in methanol in a 5 mL volumetric flask to prepare the test solution. The vanillin-glacial acetic acid-perchloric acid chromogenic method was used to measure the absorbance of the reference and test solutions at 472 nm after color development. This allowed the total saponin content of each sample to be calculated. Data were processed and plotted using Excel and Origin 2022.
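
The calculation implied by this procedure can be sketched as follows. This is a minimal illustration rather than the authors' worksheet: the dilution series, the absorbance readings, and the function names are hypothetical. Only the 9.47 mg per 10 mL reference solution, the 0.50 g sample mass, and the 5 mL final volume come from the text. The sketch assumes a linear relation between absorbance at 472 nm and concentration, and that the test solution is measured without further dilution.

```python
# Sketch: total saponin content from a linear standard curve.
# All absorbance values and the dilution series are hypothetical placeholders.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def saponin_content_mg_per_g(absorbance, slope, intercept,
                             final_volume_ml=5.0, sample_mass_g=0.50):
    """Convert an absorbance reading into mg of saponin per g of powder."""
    conc_mg_per_ml = (absorbance - intercept) / slope
    return conc_mg_per_ml * final_volume_ml / sample_mass_g


# Reference solution from the text: 9.47 mg polygalacin D in 10 mL methanol.
STOCK_MG_PER_ML = 9.47 / 10.0

# Hypothetical dilution series of the stock and hypothetical readings.
standard_conc = [STOCK_MG_PER_ML * f for f in (0.1, 0.2, 0.4, 0.6, 0.8)]
standard_abs = [0.10, 0.20, 0.40, 0.60, 0.80]

slope, intercept = fit_line(standard_conc, standard_abs)
content = saponin_content_mg_per_g(0.50, slope, intercept)
print(f"stock = {STOCK_MG_PER_ML:.3f} mg/mL, content = {content:.2f} mg/g")
```

In practice the calibration points would be the measured absorbances of the diluted reference solution, and any additional dilution of the test solution before color development would enter as a further multiplicative factor.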

Determination of saponin monomer content

HPLC-ELSD chromatographic conditions were as follows: the column was an Agilent Eclipse Plus C18 (2.1 × 100 mm, 5 μm). The mobile phase consisted of A (water) and B (acetonitrile), with the gradient elution conditions specified in Table S10. The flow rate was set at 0.3 mL/min, the injection volume was 3 µL, and the column temperature was 35 ℃. The drift tube temperature was set to 85 ℃, the nebulization temperature was 50 ℃, and the carrier gas flow rate was 1.6 SLM.

To prepare the reference solutions, accurately weigh 5.00 mg of deapioplatycoside E, 8.92 mg of platycoside E, 4.86 mg of deapioplatycoside D3, 7.75 mg of platycoside D3, 9.57 mg of deapioplatycoside D, 10.63 mg of platycoside D2, 9.98 mg of platycoside D, and 7.08 mg of polygalacin D standards, each in a 10 mL volumetric flask, add 50% methanol solution to dissolve them, and make up the volume to the scale mark. Shake well to obtain the reference solutions. Measure 1 mL of each reference solution, combine them in a volumetric flask, and mix thoroughly to obtain the mixed reference solution.
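
As a worked example of the concentrations this preparation gives, the short sketch below converts the listed masses into stock concentrations. It rests on two assumptions that the text does not state explicitly: each standard is dissolved in its own 10 mL flask, and the mixed reference solution is simply the eight 1 mL aliquots combined with no further dilution. The function names are illustrative.

```python
# Sketch: stock and mixed-reference concentrations of the eight standards.
# Masses (mg) are taken from the text; the mixing model is an assumption.

STANDARD_MASS_MG = {
    "deapioplatycoside E": 5.00,
    "platycoside E": 8.92,
    "deapioplatycoside D3": 4.86,
    "platycoside D3": 7.75,
    "deapioplatycoside D": 9.57,
    "platycoside D2": 10.63,
    "platycoside D": 9.98,
    "polygalacin D": 7.08,
}
FLASK_VOLUME_ML = 10.0  # assumed: one 10 mL flask per standard
ALIQUOT_ML = 1.0        # 1 mL of each reference solution is combined


def stock_concentrations(masses_mg, volume_ml):
    """Concentration (mg/mL) of each individual reference solution."""
    return {name: mass / volume_ml for name, mass in masses_mg.items()}


def mixed_concentrations(stocks, aliquot_ml):
    """Concentration (mg/mL) of each standard in the combined solution.

    Assumes the final volume equals the sum of the aliquots.
    """
    total_volume = aliquot_ml * len(stocks)
    return {name: conc * aliquot_ml / total_volume
            for name, conc in stocks.items()}


stocks = stock_concentrations(STANDARD_MASS_MG, FLASK_VOLUME_ML)
mixed = mixed_concentrations(stocks, ALIQUOT_ML)
for name in STANDARD_MASS_MG:
    print(f"{name}: stock {stocks[name]:.3f} mg/mL, mixed {mixed[name]:.4f} mg/mL")
```

If the combined aliquots were instead made up to a fixed flask volume, the `total_volume` value would be replaced by that volume.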

For the test solution, weigh 2.0 g of powdered P. grandiflorum and place it in a 100 mL conical flask. Add 50 mL of 50% methanol, sonicate at 100 W for 1 h, filter the solution, and evaporate the solvent. Dissolve the residue in 50% methanol, transfer it to a 5 mL volumetric flask, and make up the volume to the scale mark. Shake well and filter through a 0.22 μm microporous membrane to obtain the test solution.

RNA extraction, library construction, and sequencing

Total RNA was extracted using Trizol reagent (Thermo Fisher, 15596018) following the manufacturer’s procedure. The quantity and purity of the RNA were analyzed with a Bioanalyzer 2100 and RNA 6000 Nano LabChip Kit (Agilent, CA, USA, 5067-1511). High-quality RNA samples with RIN > 7.0 were selected for constructing the sequencing library. After total RNA extraction, mRNA was purified from total RNA (5 μg) using Dynabeads Oligo (dT) (Thermo Fisher, CA, USA) with two rounds of purification. The mRNA was then fragmented into short fragments using divalent cations at elevated temperature (Magnesium RNA Fragmentation Module, NEB, cat. e6150, USA; 94 ℃ for 5–7 min). The fragmented RNA was then reverse-transcribed into cDNA by SuperScript™ II Reverse Transcriptase (Invitrogen, cat. 1896649, USA), followed by the synthesis of U-labeled second-stranded DNAs with E. coli DNA polymerase I (NEB, cat. m0209, USA), RNase H (NEB, cat. m0297, USA) and dUTP Solution (Thermo Fisher, cat. R0133, USA). An A-base was then added to the blunt ends of each strand to prepare for ligation to the indexed adapters, which contained a T-base overhang for ligating the adapter to the A-tailed fragmented DNA. Dual-index adapters were ligated to the fragments, and size selection was performed with AMPureXP beads. After treatment of the U-labeled second-stranded DNAs with the heat-labile UDG enzyme (NEB, cat. m0280, USA), the ligated products were amplified by PCR under the following conditions: initial denaturation at 95 ℃ for 3 min; 8 cycles of denaturation at 98 ℃ for 15 s, annealing at 60 ℃ for 15 s, and extension at 72 ℃ for 30 s; and a final extension at 72 ℃ for 5 min. The average insert size of the final cDNA libraries was 300 ± 50 bp. Finally, 2 × 150 bp paired-end sequencing (PE150) was performed on the Illumina NovaSeq™ 6000 platform.

Sequence assembly

Cutadapt was used to remove reads containing adapter sequences, polyA and polyG sequences, more than 5% unknown nucleotides (N), and low-quality reads from the raw data. After the clean reads were obtained, their quality, including Q20, Q30 and GC content, was verified using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/, 0.11.9). HISAT2 (https://daehwankimlab.github.io/hisat2/, version: hisat2-2.2.1) was then used to align the clean data to the reference genome (https://download.cncb.ac.cn/gwh/Plants/Platycodon_grandiflorus_JG_GWHARYT00000000/), providing comprehensive alignment information and statistics based on the gene location information specified in the genome annotation file. The clean reads were assembled using Trinity software to obtain transcripts, which were then processed by homologous splicing to obtain unigene clusters.
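
To make the quality metrics named above concrete, the sketch below computes Q20, Q30 and GC content for a handful of reads and applies the 5% unknown-nucleotide rule. It is an illustration of the definitions only, not how Cutadapt or FastQC are implemented: the example reads are invented, Phred+33 quality encoding is assumed, and Q20/Q30 are taken here as the percentage of bases with a quality score of at least 20 or 30.

```python
# Sketch: read-level N filter and Q20/Q30/GC summary (Phred+33 assumed).
# The example reads are made up for illustration.

def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def n_fraction(seq):
    """Fraction of unknown nucleotides (N) in a read."""
    return seq.upper().count("N") / len(seq) if seq else 1.0


def keep_read(seq, max_n_fraction=0.05):
    """Keep a read unless its N content exceeds 5%, as in the text."""
    return n_fraction(seq) <= max_n_fraction


def qc_summary(reads):
    """Return Q20, Q30 and GC content (percent of all bases).

    reads: iterable of (sequence, quality_string) pairs.
    """
    total = q20 = q30 = gc = 0
    for seq, qual in reads:
        scores = phred_scores(qual)
        total += len(seq)
        q20 += sum(s >= 20 for s in scores)
        q30 += sum(s >= 30 for s in scores)
        gc += sum(base in "GC" for base in seq.upper())
    if total == 0:
        raise ValueError("no bases to summarise")
    return {"Q20": 100 * q20 / total,
            "Q30": 100 * q30 / total,
            "GC": 100 * gc / total}


example_reads = [
    ("ACGTACGTAC", "IIIIIIIIII"),  # high quality, no N
    ("GGCCNNNNAT", "5555555555"),  # 40% N, removed by the filter
    ("ATATATATAT", "++++++++++"),  # low quality, no N
]
clean_reads = [(s, q) for s, q in example_reads if keep_read(s)]
summary = qc_summary(clean_reads)
print(len(clean_reads), summary)
```

A real pipeline would stream records from the FASTQ files and would also apply the adapter, polyA/polyG and low-quality filters, which are omitted here.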

Analysis of differentially expressed genes

Differentially expressed genes were screened and analyzed using DESeq2 and edgeR. P-values were corrected using the Benjamini-Hochberg (BH) method to obtain q-values (FDR, p.adj), and genes with a fold change of FC ≥ 2 or FC ≤ 0.5 (|log2FC| ≥ 1) and q < 0.05 were taken as differentially expressed. Clustering heatmaps were used to illustrate differential gene expression patterns, and GO (https://geneontology.org) and KEGG (https://www.kegg.jp/kegg) enrichment analyses were performed on the differentially expressed genes to further explore their biological significance. Data were processed and plotted using Excel, TBtools and Origin 2022.
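
The screening rule can be sketched in a few lines. The example below is a plain-Python illustration of BH correction followed by the |log2FC| ≥ 1 and q < 0.05 cutoffs; it is not the DESeq2 or edgeR code path, and the gene names, fold changes and p-values are hypothetical.

```python
# Sketch: BH correction and DEG thresholding with made-up input values.
import math


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def is_deg(fold_change, q_value, lfc_cutoff=1.0, q_cutoff=0.05):
    """Apply |log2FC| >= 1 and q < 0.05; fold_change must be positive."""
    return abs(math.log2(fold_change)) >= lfc_cutoff and q_value < q_cutoff


# Hypothetical genes: name -> (fold change, raw p-value).
genes = {
    "gene_a": (4.0, 0.001),
    "gene_b": (0.4, 0.01),
    "gene_c": (1.5, 0.02),
    "gene_d": (2.0, 0.03),
    "gene_e": (0.5, 0.5),
}
names = list(genes)
q_values = benjamini_hochberg([genes[n][1] for n in names])
degs = [n for n, q in zip(names, q_values) if is_deg(genes[n][0], q)]
print(degs)
```

Here `gene_c` fails the fold-change cutoff and `gene_e` fails the q-value cutoff, so only the remaining three genes pass both criteria.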

Quantitative real-time PCR analysis

To verify the reliability of the transcriptome sequencing results, 16 DEGs identified in the triterpene synthesis pathway were validated by qRT-PCR in P. grandiflorum of different growth years, with three replicates for each sample. Primers were designed using Primer software with β-actin as the internal reference gene, and the primer sequences for all genes are shown in Table S11. Total RNA from all samples was extracted using an RNA extraction kit and reverse transcribed using the SynScript® III RT SuperMix kit. The resulting cDNA was used as a template for qRT-PCR analyses with the ArtiCanCEO SYBR qPCR Mix kit. The reaction procedure was as follows: initial denaturation at 95 °C for 5 min; a cycling stage of 40 cycles of 95 °C for 15 s, 60 °C for 20 s, and 72 °C for 20 s; and a melting stage of 95 °C for 15 s, 65 °C for 1 min, and warming from 65 °C to 95 °C at 0.1 °C/s. The fluorescence intensities of the samples were measured continuously to obtain the melting curve. Relative expression levels were determined using the 2^(−ΔΔCT) method. Data were processed and plotted using Excel and Origin 2022.
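
The 2^(−ΔΔCT) calculation can be illustrated with a short sketch. The CT values below are invented, the choice of calibrator group is an assumption made only for the example, and the method itself assumes roughly 100% amplification efficiency for both the target gene and the β-actin reference.

```python
# Sketch: relative expression by the 2^(-ddCT) method.
# All CT values are hypothetical; three technical replicates per sample.

def mean(values):
    return sum(values) / len(values)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Return 2^(-ddCT) for a target gene normalised to a reference gene.

    Each argument is a list of replicate CT values.
    """
    dct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    dct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    ddct = dct_sample - dct_calibrator
    return 2 ** (-ddct)


# Hypothetical example: a target gene in one group relative to a calibrator
# group, with beta-actin as the internal reference.
fold = relative_expression(
    ct_target_sample=[24.0, 24.2, 23.8],
    ct_ref_sample=[18.0, 18.1, 17.9],
    ct_target_calibrator=[26.0, 26.1, 25.9],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
print(f"relative expression = {fold:.2f}")
```

In this example the target gene reaches threshold two cycles earlier in the sample than in the calibrator after normalisation, which corresponds to about a fourfold higher relative expression.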

Data availability

The datasets supporting the conclusions of this article are available in the NCBI-SRA repository. The datasets generated for this study can be accessed through the SRA-BioProjects (accession number: PRJNA1131010) and SRA-BioSamples (accession numbers: SRR29702873, SRR29702872, SRR29702863, SRR29702862, SRR29702861, SRR29702860, SRR29702859, SRR29702858, SRR29702857, SRR29702856, SRR29702871, SRR29702870, SRR29702869, SRR29702868, SRR29702867, SRR29702866, SRR29702865, SRR29702864) databases. The datasets analyzed in this article are available in Tables S1–S10, and the primers used for PCR in this study are listed in Table S11.

Abbreviations

P. grandiflorum: Platycodon grandiflorum

TC: TongCheng

GO: Gene Ontology

KEGG: Kyoto Encyclopedia of Genes and Genomes

qRT-PCR: Quantitative Real-time Polymerase Chain Reaction

DEGs: Differentially expressed genes

MVA: Mevalonate

MEP: Methylerythritol-4-phosphate

IPP: Isopentenyl pyrophosphate

DMAPP: Dimethylallyl pyrophosphate

OSCs: Oxidosqualene cyclases

CYP450s: Cytochrome P450 monooxygenases

UGTs: Uridine diphosphate glycosyltransferases

LS: Lupeol synthase

UV: Ultraviolet

MF: Molecular function

CC: Cellular components

BP: Biological processes

TFs: Transcription factors

References

  1. Ji MY, Bo AG, Yang M, Xu JF, Jiang LL, Zhou BC, Li MH. The pharmacological effects and Health benefits of Platycodon grandiflorus-A Medicine Food Homology species. Foods. 2020;9(2).

  2. Lv YP, Tong XH, Zhang PF, Yu NJ, Gui SY, Han RC, Ge DZ. Comparative transcriptomic analysis on White and Blue flowers of Platycodon grandiflorus to elucidate genes involved in the biosynthesis of anthocyanins. Iran J Biotechnol. 2021;19(3):70–7.

  3. Ke W, Bonilla-Rosso G, Engel P, Wang P, Chen F, Hu X. Suppression of High-Fat Diet–Induced obesity by Platycodon Grandiflorus in mice is linked to changes in the gut microbiota. Nutrition. 2020;150(9):2364–74.

  4. Buchwald W, Szulc M, Baraniak J, Derebecka N, Kania-Dobrowolska M, Piasecka A, Bogacz A, Karasiewicz M, Bartkowiak-Wieczorek J, Kujawski R, et al. The effect of different water extracts from Platycodon grandiflorum on selected factors Associated with Pathogenesis of Chronic bronchitis in rats. Molecules. 2020;25(21):5020.

  5. Leng J, Wang Z, Fu Cl, Zhang J, Ren S, Hu Jn, Jiang S, Wang Yp, Chen C, Li W. NF-κB and AMPK/PI3K/Akt signaling pathways are involved in the protective effects of Platycodon grandiflorum saponins against acetaminophen‐induced acute hepatotoxicity in mice. Phytother Res. 2018;32(11):2235–46.

  6. Ju J, Lee T, Lee J, Kim T, Shin K, Oh D. Improved Bioactivity of 3-O-β-D-Glucopyranosyl platycosides in Biotransformed Platycodon grandiflorum Root Extract by Pectinase from Aspergillus Aculeatus. J Microbiol Biotechn. 2021;31(6):847–54.

  7. Qiu L, Xiao Y, Liu YQ, Peng LX, Liao W, Fu Q. Platycosides P and Q, two new triterpene saponins from Platycodon grandiflorum. J Asian Nat Prod Res. 2019;21(5):419–25.

  8. Zhang YP, Sun MH, He YJ, Gao WY, Wang Y, Yang BY, Sun YP, Kuang HX. Polysaccharides from Platycodon grandiflorum: a review of their extraction, structures, modifications, and bioactivities. Int J Biol Macromol. 2024;271.

  9. Zhang Z, Zhao M, Zheng W, Liu Y. Platycodin D, a triterpenoid saponin from Platycodon grandiflorum, suppresses the growth and invasion of human oral squamous cell carcinoma cells via the NF-κB pathway. J Biochem Mol Toxic. 2017;31(9).

  10. Chen ST, Wang Q, Ming S, Zheng HG, Hua BJ, Yang HS. Platycodin D induces apoptosis through JNK1/AP-1/PUMA pathway in non-small cell lung cancer cells: a new mechanism for an old compound. Front Pharmacol. 2022;13.

  11. Jeon D, Kim S, Kim H. Platycodin D, a bioactive component of Platycodon grandiflorum, induces cancer cell death associated with extreme vacuolation. Anim Cells Syst. 2019;23(2):118–27.

  12. Luo Q, Wei G, Wu X, Tang K, Xu M, Wu Y, Liu Y, Li X, Sun Z, Ju W, et al. Platycodin D inhibits platelet function and thrombus formation through inducing internalization of platelet glycoprotein receptors. J Transl Med. 2018;16(1):311.

  13. Almeida A, Dong L, Appendino G, Bak S. Plant triterpenoids with bond-missing skeletons: biogenesis, distribution and bioactivity. Nat Prod Rep. 2020;37(9):1207–28.

  14. Cárdenas PD, Almeida A, Bak S. Evolution of Structural Diversity of triterpenoids. Front Plant Sci. 2019;10:1523.

  15. Yao L, Lu J, Wang J, Gao W-Y. Advances in biosynthesis of triterpenoid saponins in medicinal plants. Chin J Nat Med. 2020;18(6):417–24.

  16. Dinday S, Ghosh S. Recent advances in triterpenoid pathway elucidation and engineering. Biotechnol Adv. 2023;68.

  17. Wang J, Guo YH, Yin X, Wang XN, Qi XQ, Xue ZY. Diverse triterpene skeletons are derived from the expansion and divergent evolution of 2,3-oxidosqualene cyclases in plants. Crit Rev Biochem Mol. 2022;57(2):113–32.

  18. Tang Q, Chen G, Song W, Fan W, Wei K, He S, Zhang G, Tang J, Li Y, Lin Y, et al. Transcriptome analysis of Panax zingiberensis identifies genes encoding oleanolic acid glucuronosyltransferase involved in the biosynthesis of oleanane-type ginsenosides. Planta. 2018;249(2):393–406.

  19. Miettinen K, Pollier J, Buyst D, Arendt P, Csuk R, Sommerwerk S, Moses T, Mertens J, Sonawane PD, Pauwels L, et al. The ancient CYP716 family is a major contributor to the diversification of eudicot triterpenoid biosynthesis. Nat Commun. 2017;8(1):14153.

  20. Xinyi L, Ling C, Yuying M, Wei D. Comparison of the Contents of Platycodon Grandiflorum Total Saponins and Platycodon Grandiflorum D in different cultivation years. China Pharm. 2018;29(9):1249–52.

  21. Xu M, Liu X, Wang J, Teng S, Shi J, Li Y, Huang M. Transcriptome sequencing and development of novel genic SSR markers for Dendrobium officinale. Mol Breed. 2017;37(2):1–7.

  22. Zhou Y, Bai YH, Han FX, Chen X, Wu FS, Liu Q, Ma WZ, Zhang YQ. Transcriptome sequencing and metabolome analysis reveal the molecular mechanism of Salvia miltiorrhiza in response to drought stress. BMC Plant Biol. 2024;24(1).

  23. Liang LJ, He ZG, Yu HZ, Wang EH, Zhang XJ, Zhang BX, Zhang CL, Liang ZS. Selection and validation of reference genes for gene expression studies in Codonopsis pilosula based on transcriptome sequence data. Sci Rep-UK. 2020;10(1).

  24. Shi J, Fei X, Hu Y, Liu Y, Wei A. Identification of key genes in the synthesis pathway of volatile terpenoids in Fruit of Zanthoxylum Bungeanum Maxim. Forests. 2019;10(4):328.

  25. Yu HW, Liu ML, Yin MN, Shan TY, Peng HS, Wang JT, Chang XW, Peng DY, Zha LP, Gui SY. Transcriptome analysis identifies putative genes involved in triterpenoid biosynthesis in Platycodon grandiflorus. Planta. 2021;254(2):34.

  26. Lu H, Ju M, Chu S, Xu T, Huang Y, Chan Q, Peng H, Gui S. Quantitative and Chemical Fingerprint Analysis for the quality evaluation of Platycodi Radix collected from various regions in China by HPLC coupled with Chemometrics. Molecules. 2018;23(7):1823.

  27. Jungeun K, SangHo K, SinGi P, TaeJin Y, Yi L, Tae KO, Oksung C, Jungho L, JaePil C, SooJin K, et al. Whole-genome, transcriptome, and methylome analyses provide insights into the evolution of platycoside biosynthesis in Platycodon grandiflorus, a medicinal plant. Hortic Res-England. 2020;7(1):112.

  28. Wang X, Wang CY, Yang MK, Jie WC, Fazal A, Fu JY, Yin TM, Cai JF, Liu B, Lu GH, et al. Genome-Wide Comparison and Functional Characterization of HMGR Gene Family Associated with Shikonin Biosynthesis in Lithospermum erythrorhizon. Int J Mol Sci. 2023;24(15).

  29. Zhang C, Liu H, Zong Y, Tu Z, Li H. Isolation, expression, and functional analysis of the geranylgeranyl pyrophosphate synthase (GGPPS) gene from Liriodendron tulipifera. Plant Physiol Bioch. 2021;166:700–11.

  30. Gao WJ, Wang X, Purente N, Muhammad L, Zhou YW, He M. A 1-deoxy-D-xylulose 5-phosphate reductoisomerase gene probably involved in the synthesis of terpenoids in Chrysanthemum indicum var. Aromaticum. Can J Plant Sci. 2018;98(6):1254–64.

  31. Hui W, Chen X, Ali M, Weibo S, Dawei L, Qiang Z. Characterization and function of 3-Hydroxy-3-Methylglutaryl-CoA reductase in Populus trichocarpa: overexpression of PtHMGR enhances terpenoids in Transgenic Poplar. Front Plant Sci. 2019;10:1476.

  32. Jiang J, Ma S, Ye N, Jiang M, Cao J, Zhang J. WRKY transcription factors in plant responses to stresses. J Integr Plant Biol. 2017;59(2):86–101.

  33. Ribeiro B, Erffelinck M, Lacchini E, Ceulemans E, Colinas M, Williams C, Van Hamme E, De Clercq R, Perassolo M, Goossens A. Interference between ER stress-related bZIP-type and jasmonate-inducible bHLH-type transcription factors in the regulation of triterpene saponin biosynthesis in Medicago truncatula. Front Plant Sci. 2022;13.

  34. Sun X, Wang Y, Sui N. Transcriptional regulation of bHLH during plant response to stress. Biochem Bioph Res Co. 2018;503(2):397–401.

  35. Zhang WH, Zhang JZ, Fan YD, Dong J, Gao P, Jiang WZ, Yang T, Che DD. RNA sequencing analysis reveals PgbHLH28 as the key regulator in response to methyl jasmonate-induced saponin accumulation in Platycodon grandiflorus. Hortic Res-England. 2024;11(5).

  36. Man J, Shi Y, Huang Y, Zhang X, Wang X, Liu S, He G, An K, Han D, Wang X, et al. PnMYB4 negatively modulates saponin biosynthesis in Panax notoginseng through interplay with PnMYB1. Hortic Res-England. 2023;10(8).

  37. Thakur S, Vasudev PG. MYB transcription factors and their role in Medicinal plants. Mol Biol Rep. 2022;49(11):10995–1008.

  38. Gong Z, Luo Y, Zhang W, Jian W, Zhang L, Gao X, Hu X, Yuan Y, Wu M, Xu X, et al. A SlMYB75-centred transcriptional cascade regulates trichome formation and sesquiterpene accumulation in tomato. J Exp Bot. 2021;72(10):3806–20.

  39. Li P, Xia E, Fu J, Xu Y, Zhao X, Tong W, Tang Q, Tadege M, Fernie AR, Zhao J. Diverse roles of MYB transcription factors in regulating secondary metabolite biosynthesis, shoot development, and stress responses in tea plants (Camellia sinensis). Plant J. 2022;110(4):1144–65.

  40. Chen N, Shao Q, Lu Q, Li X, Gao Y, Xiao Q. Research progress on function of NAC transcription factors in tomato (Solanum lycopersicum L). Euphytica. 2023;219(1).

  41. Duan DY, Yi R, Ma YL, Dong QL, Mao K, Ma FW. Apple WRKY transcription factor MdWRKY56 positively modulates drought stress tolerance. Environ Exp Bot. 2023;212.

Acknowledgements

We would like to thank the reviewers for their professional review work, constructive comments, and valuable suggestions on our manuscript.

Funding

This study was supported by the Natural Science Foundation of China (32202442), Anhui Provincial University Research Projects (2023AH052637), National Key R&D Program of China (2023YFC3503804), China Agricultural Research System of MOF and MARA (CARS-21).

Author information

Contributions

GYW, GHL and HD conceived the experiments. GYW, XTW, XLL and HD collected plants and processed samples. GYW and GHL processed the data and plotted the pictures. GYW and GHL prepared the manuscript. GYW, GHL, JMO and HD reviewed the manuscript. HD provided financial support for this study.

Corresponding authors

Correspondence to Guohui Li or Hui Deng.

Ethics declarations

Ethics approval and consent to participate

No animal species are involved in this experiment. All Platycodon grandiflorum plants used in this study are cultivated products. Collecting some Platycodon grandiflorum plants for research purposes will not harm the local ecology.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: Transcriptome sequencing data quality analysis

Supplementary Material 2: GO analysis of TC1vsTC2 differentially expressed genes

Supplementary Material 3: GO analysis of TC1vsTC3 differentially expressed genes

Supplementary Material 4: GO analysis of TC2vsTC3 differentially expressed genes

Supplementary Material 5: KEGG analysis of TC1vsTC2 differentially expressed genes

Supplementary Material 6: KEGG analysis of TC1vsTC3 differentially expressed genes

Supplementary Material 7: KEGG analysis of TC2vsTC3 differentially expressed genes

Supplementary Material 8: Names and abbreviations of key enzymes of the terpenoid synthesis pathway

Supplementary Material 9: Grouping of samples of Platycodon grandiflorum

Supplementary Material 10: The mobile phase gradient elution conditions

Supplementary Material 11: Primers used for PCR

Supplementary Material 12: Analysis of TC1vsTC2 differential transcription factors

Supplementary Material 13: Analysis of TC1vsTC3 differential transcription factors

Supplementary Material 14: Analysis of TC2vsTC3 differential transcription factors

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wang, G., Wan, X., Li, X. et al. Transcriptome-based analysis of key functional genes in the triterpenoid saponin synthesis pathway of Platycodon grandiflorum. BMC Genom Data 25, 83 (2024). https://doi.org/10.1186/s12863-024-01266-2

  • DOI: https://doi.org/10.1186/s12863-024-01266-2

Keywords