RNA-Seq analysis in giant pandas reveals the differential expression of multiple genes involved in cataract formation

Background The giant panda (Ailuropoda melanoleuca) is an endangered mammalian species native to China. Fewer than 2500 giant pandas are known to exist, many of which are bred in captivity as a means to preserve and repopulate the species. Like other captive mammals, giant pandas acquire age-related cataracts, reducing their quality of life. Recent comparative genome-wide methylation analysis revealed 110 differentially methylated genes associated with cataract formation including six also associated with the formation of age-related cataracts in humans. Results To investigate the pathological pathway in greater detail, here we used RNA-Seq analysis to investigate the differential expression profiles of genes in three giant pandas with cataracts and three healthy controls. We identified more than 700 differentially expressed genes, 29 of which were selected for further analysis based on their low q-value. We found that many of the genes encoded regulatory and signaling proteins associated with the control of cell growth, migration, differentiation and apoptosis, supporting previous research indicating a key role for apoptosis in cataract formation. Conclusion The identification of genes involved in the formation of age-related cataracts could facilitate the development of predictive markers, preventative measures and even new therapies to improve the life of captive animals. Supplementary Information The online version contains supplementary material available at 10.1186/s12863-021-00996-x.


Background
The housing of mammals in zoos and reservations is an efficient strategy to protect endangered species and encourage repopulation, but captive mammals tend to live longer than their wild counterparts and thus experience diseases of ageing, which are uncommon in the wild. One example is the development of age-related cataracts, which are associated with ageing mammals due to the accumulation of oxidative damage in the lens [1]. Cataracts are the main cause of blindness in ageing humans and other primates, as well as companion animals such as dogs and cats [2][3][4]. They are also prevalent in captive giant pandas (Ailuropoda melanoleuca), which live to 15-20 years in the wild but 25-30 years in captivity [5,6]. Giant pandas 18 or more years old are described as aged because they have reached an equivalent human age of~75. The prevalence of cataracts in the current population of aged giant pandas is~20%.
Age-related cataracts are heritable with significant environmental triggers, including oxidative stress and the resulting accumulation of DNA damage [7][8][9][10]. The genes most strongly associated with cataracts therefore include those related to oxidative stress responses, the production of antioxidant enzymes and metabolites, and various DNA repair pathways [11][12][13]. Previous studies identified changes in DNA methylation associated with cataract formation in several mammals [14][15][16] and we recently reported that 110 genes with functions relevant to cataract pathogenesis are differentially methylated in giant pandas, including six genes known to be associated with age-related cataracts in humans [17].
Epigenetic modifications such as DNA methylation affect gene expression and therefore control the availability of the corresponding gene products. To gain more insight into the role of differential gene expression in the pathogenesis of age-related cataracts in giant pandas, we took blood samples from three aged giant pandas with cataracts and three healthy controls for RNA-Seq analysis. Following the alignment of reads with the reference genome, we identified expression profiles representing genes expressed exclusively or preferentially in the cataractogenic or healthy samples, and determined the corresponding functional annotations. The identification of genes that are overexpressed or suppressed during the formation of cataracts could lead to the development of new diagnostics, preventative treatments and therapeutic approaches to improve the quality of life for captive giant pandas and other mammals.

Samples
Peripheral blood samples were collected from six giant pandas (five females and one male) ranging in age from 19 to 37, with three of the females affected by cataracts and the other specimens defined as healthy based on regular physical examinations ( Table 1). The blood samples were assigned to three sample bands: A (affect females), B (unaffected male) and C (unaffected females). These sample bands were subsequently used for comparative transcriptomic analysis.

RNA-Seq data processing and quality analysis
Blood samples from all six giant pandas were used to prepare a de novo RNA-Seq dataset with an average of 52.05 ± 5.63 million reads per specimen and an average length of 144.35 ± 1.86 bp after quality control and trimming. For each specimen,~87% of the reads mapped to unique sequences in the reference genome (Supplementary Table S1). Homogeneity distribution analysis and the gene coverage ratio indicated the anticipated distribution of reads within each gene and across different genes ( Supplementary Fig. S1a,b) and also confirmed that~44.72% of the reads mapped to exonic regions of the reference sequence ( Supplementary Fig. S1c).

Analysis of gene structure characteristics
Analysis of the RNA-Seq dataset for gene coverage and chromosome distribution revealed differences in mapping density across different chromosomes, with chromosomes GL192341.1 and GL192355.1 showing a particularly low read density and chromosomes GL192348. 1 showing the opposite phenomenon (Fig. 1a). The mapped reads were screened for single-nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (indels) revealing 5.9-6.7 × 10 4 SNPs but < 1 × 10 4 indels in each sample (Fig. 1b). Among the SNPs, the number of transitions vastly exceeded the number of transversions, and the transitions A > G, C > T, G > A and T > C were particularly abundant ( Supplementary Fig. S2). We detected more than 2.5 × 10 4 alternative mRNA processing events, the most abundant of which were alternative transcriptional start sites (8 × 10 3 ) followed by alternative transcriptional stop sites (7 × 10 3 ) (Fig. 2). We also identified at least eight new transcript regions representing previously unknown isoforms of known genes, with seven of these new regions mapping to four genes: GNAO1, OGFOD1, HERPUD1 and NLRC5 (Supplementary Table S2).

Analysis of gene expression
The number of reads mapping to each gene in the reference genome represents the abundance of the corresponding mRNA when corrected for factors such as gene length, sequencing depth and saturation. We also ensured the accuracy of our results by testing for correlation between samples (Supplementary Table S3). We then determined the number of genes that were coexpressed in our samples, revealing a core of 2711 common genes that were universally expressed and a minimum of 172 and a maximum of 263 genes in each animal that were not part of this common set (Fig. 3). PCA indicated that 36.68% of the variation could be explained by PC1, a further 33.06% by PC2, and 13.65% by PC3, primarily separating the affected and unaffected animals but to a lesser extent separating the unaffected male from the unaffected females. Overall, this resulted in the clustering of the affected females together and the dispersion of the other three samples, suggesting the affected females had more in common than any of the unaffected individuals had in common with each other (Fig. 4a). This  outcome was broadly supported by PoCA, which revealed that 50.02% of the variation could be explained by PoC1, a further 13.65% by PoC2 and 5.62% by PoC3, again leading to the formation of an affected cluster with the other samples dispersed (Fig. 4b). The individual two-dimensional plots showing the PCA and PoCA data in more detail are provided in Fig. 5.

Identification and analysis of differentially expressed genes
We observed a much greater number of differentially expressed genes when comparing female pandas with cataracts to the healthy male (A vs B, 705 genes) than when comparing female pandas with cataracts to healthy females (A vs C, 33 genes). However, most of these genes were not differentially expressed when comparing healthy male and female pandas (B vs C, 116 genes), suggesting that the A vs B profile cannot be wholly explained by sex-specific differences in gene expression (Fig. 6). Given that very few genes were differentially expressed when comparing affected and healthy females (A vs C), cataract formation appears to influence a larger number of genes in male than female pandas. Among the 705 differentially expressed genes in the A vs B comparison, 533 were upregulated and 172 downregulated. When visualized as a scatter plot (Fig. 7), it is clear that more genes are upregulated in the healthy male than the affected females and are more likely to show a statistically significant change in expression (Fig. 7b) whereas the 23 upregulated and 10 downregulated genes in the comparison A vs C show limited statistical significance (Fig. 7a). The results are emphasized in the corresponding heat maps (Fig. 8). The comparison of affected females (A) with unaffected pandas of either sex (B + C) revealed 29 genes satisfying q < 0.05 and |log 2 fold change (FC)| > 1 ( Table 2).

Functional annotation of differentially expressed genes
The biological functions and related pathways of the differentially expressed genes were investigated by screening the sequences against the GO and KEGG databases, as well as the Clusters of Orthologous Groups of proteins (COG) and euKaryotic Ortholog Groups (KOG) maintained by the NCBI. GO annotations revealed that the differentially expressed genes represented a wide range of biological processes (111 genes), cellular components (73 genes) and molecular functions (5 genes). Strongly represented biological process categories (accounting for > 10% of the genes) included cellular process, metabolic process, biological regulation/regulation of a biological process (particularly positive regulation), and response to stimulus. With few exceptions, the functional profile of the differentially expressed genes mirrored that of the total gene catalog, although the differentially expressed genes were overrepresented in the categories immune system process and multi-organism process but underrepresented in the categories biological phase and cell aggregation ( Fig. 9). There was little difference in terms of cellular Scatter plots showing the differentially expressed genes when comparing the three sample bands in three pairwise comparisons. The horizontal and vertical axes represent the log2(TPM) value of two group samples. Each point represents a gene, and the closer each point is to the origin, the lower the expression level. Red represents upregulated genes, green represents downregulated genes, and black represents genes with no difference in expression component categories, with the exception of the differentially expressed genes being unrepresented in the category nucleoid. However, one of the most abundant categories among the differentially expressed and complete gene catalog was protein-containing complex, and preliminary protein network analysis revealed that the genes most strongly upregulated in the A vs B comparison also tended to be more likely to interact with other proteins and also tended to have more connections (data not shown). In terms of molecular functions, there was again little difference between the differentially expressed genes and complete gene catalog, with the exception of the differentially expressed genes being unrepresented in the categories involving morphogen, metallochaperone and chemoattractant/chemorepellant activity (Fig. 9). Interestingly, several genes encoding enzymes involved in histidine metabolism were included among the differentially expressed pathways revealed when screening KEGG (Fig. 10). The analysis of GO categories that were enriched among the differentially expressed genes revealed strong hits for immunity and defense-related functions, which showed the highest Rich factors (Table 3, Fig. 11). Visualization of the biological, cellular and molecular functions by means of directed acyclic graphs revealed that 45      Fig. S3).

Discussion
Ageing mammals in captivity often develop cataracts due to the accumulation of oxidative damage [1]. This has been observed in captive giant pandas, which typically live up to 10 years longer than their wild counterparts [5,6]. The prevalence of cataracts in the current population of aged giant pandas is~20% and this is associated with a declining quality of life, as the animals find it difficult to feed and negotiate their surroundings. Although genetic factors have been identified that promote agerelated cataracts, the most important triggers are environmental, particularly oxidative stress and DNA damage [7][8][9][10]. We previously compared the DNA methylation status of giant pandas with and without cataracts in an attempt to identify epigenetic effects that might influence the expression of genes associated with cataract formation [17]. We identified multiple differentially methylated genes with potential roles in cataract-related pathways, including base excision repair, apoptosis and p53 signaling. Certain genes also showed abnormal methylation profiles specifically in the pandas with cataracts, including the cysteineaspartate protease gene CASP3, a pro-apoptotic mediator already linked to cataracts in rats [18,19], and the glutathione S-transferase gene GSTM3, which is expressed in the lens tissues of human patients with age-related cataracts [20].
DNA methylation is an epigenetic mechanism that regulates gene expression by modifying the structure of chromatin, usually leading to the suppression of gene expression. As a logical extension of our previous study, we were therefore interested in the analysis of differential gene expression between giant pandas with cataracts and controls with healthy eyes. Using the same animals as in our previous study, we carried out RNA-Seq analysis to identify panels of genes either upregulated or downregulated in pandas with cataracts. Following the alignment of reads with the reference genome, we identified expression profiles representing genes expressed exclusively or preferentially in the cataractogenic or healthy samples, and identified the corresponding functional annotations by screening the KEGG database and GO categories. Our RNA-Seq data also revealed abundant alternative splicing and novel transcripts within the dataset, as previously reported during murine lens development [21]. Two genes that were shown in our DNA methylation study [17] to be methylated specifically in affected pandas were also shown to be downregulated in the affected pandas by RNA-Seq analysis. CASP3, encoding the apoptotic cysteine-aspartate protease was downregulated with a log 2 FC of − 1.03, whereas GSTM3, encoding the glutathione S-transferase, was downregulated with a log 2 FC of − 1.28.
A link between the glutathione (GSH) pathway and cataract formation was identified in a transcriptomic study in mice [22]. The authors considered the Fig. 9 Gene Ontology categories of differentially expressed genes. The x-axis shows classifications and the y-axes show percentage of total classified genes (left) and total number of genes (right). Different colors represent the three different domains of biological process (green), cellular component (blue) and molecular function (yellow). Light bars represent differentially expressed genes, and dark bars represent all genes consequences of GSH deficiency in naïve and buthionine sulfoximine-treated C57Bl/6 LEGSKO (lens GSHsynthesis knockout) mice compared to wild-type controls, thus providing a more relevant model of oxidative damage. Among 24,415 mapped reads, 441 genes showed significantly modulated expression, including genes involved in epithelial-mesenchymal transition (EMT) signaling, the visual cycle, and lipid metabolism. Several detoxification genes were upregulated, including the aldehyde dehydrogenases Aldh1a1 and Aldh3a1, the metallothioneins Mt1 and Mt2, the carboxylesterase Ces1g, and the urea transporter Slc14a1, whereas genes encoding lens crystallins and other vision-related genes were downregulated [22]. The authors concluded that GSH deficiency in the lens leads to the expression of detoxifying genes and the activation of EMT signaling (providing evidence of adaption to the loss of antioxidant capacity) and also revealing a pathogenetic mechanism of cataract formation.
Several of the most strongly modulated genes we identified are involved in signaling, indicating the activation of cell-cell signaling as a response to lens deterioration. The top-ranking downregulated gene was RCAN1 (16fold repression), which encodes a calcineurin-binding regulator of CNS development. The relevance of this gene in cataracts is unclear, but may be linked to the vascular remodeling discussed above given its role in the suppression of angiogenesis [44]. Other genes we identified appear to be involved in signaling pathways involving G-protein-coupled receptors (GPCRs). For example, CYSLTR1 (induced 1.3-fold) encodes a leukotrienespecific GPCR typically involved in bronchoconstriction, but it also regulates vascular permeability, cell migration and collagen deposition, which may be the more relevant functions here [45]. The UV irradiation of lens epithelial cells was previously shown to increase the expression of LET-7B, the ligand for another GPCR (LGRT), resulting in the induction of apoptosis [46]. A targeted deletion in LGR4 reduced the resistance of rat lens epithelial cells to oxidative stress and accelerated the development of agerelated cataracts [47]. RASD1 (induced 4.3-fold) encodes a member of the Ras family of G-protein regulators acting downstream of GPCRs, and its upregulation in human cell lines has been shown to suppress cell growth and induce apoptosis [48,49]. Similarly RANBP9 (induced 2.7-fold) encodes a Ras protein that influences cytoskeletal organization and interacts with several regulators of cell growth [50]. PDE4A (induced 1.6-fold) encodes a cAMP-specific 3′,5′-cyclic phosphodiesterase that regulates second messenger signaling downstream of GPCRs by cleaving cAMP [51]. Other modulated genes encoding signaling proteins potentially involved in the regulation of cell growth included PTP4A3, encoding a membrane-associated protein tyrosine phosphatase (induced 2.8-fold) [52], ANXA3, encoding annexin 3 (downregulated 1.75-fold) [53], and DUSP22, encoding a dual-specificity protein kinase (induced 1.42-fold) [54]. We also observed the 1.92-fold downregulation of the MYBL1 gene, whose principal function is the regulation of meiosis, so its relevance in the context of cataract formation is unclear.
Mutations in the HSF4 gene have been shown to cause cataracts in humans and other mammals [55], including the recent discovery of a novel HSF4 mutation in pandas [56]. HSF4 is a transcriptional repressor and we proposed that the novel mutation we discovered in this gene is likely to affect its interactions with upstream signaling components, thus disrupting the genetic control of lens functions. Although HSF4 was not among the differentially expressed genes we detected in this study, we found that both FOS and FOSB were strongly upregulated (4.5-fold and 4.9-fold, respectively). These genes encode two leucine zipper proteins that dimerize with members of the JUN family to form transcriptional regulators [57]. Importantly, the FOS family of transcription factors is involved in the regulation of many of the processes discussed above revealed by the functional analysis of other differentially expressed genes, including cell proliferation, differentiation and survival, oxidative stress and angiogenesis [58]. It is therefore possible that FOS/FOSB play a key role in the regulation of the other genes discussed above, or that the FOS/FOSB genes are targets of the modulated signaling pathways we have identified.
The analysis of GO enrichment provided results that were broadly consistent with the functional annotation of the differentially expressed genes, with cellular process, metabolic process, biological regulation/regulation of a biological process as the most overrepresented categories in the differentially expressed gene catalog. The categories immune system process and multi-organism process were also represented, which may reflect the modulation of signaling proteins that are also known to participate in the regulation of immunity responses, including cytokine release and immune cell recruitment. Another enriched category was protein-containing complex, which agreed with our preliminary protein network analysis and is consistent with the tendency of signaling proteins to form complexes. Finally, functional annotation by screening the differentially expressed genes against the KEGG database generated several hits in the histidine metabolic pathway, which is particularly interesting given that dietary histidine and/or carnitine are known to prevent cataract development in salmon, presumably by countering the effect of oxidative stress [59,60]. Taken together, our results confirm that agerelated cataracts in pandas bear many of the same hallmarks as cataracts in other animals, including the modulation of stress response genes, metabolic adaptations, and signaling pathways regulating cell growth and apoptosis. The identification of genes that are overexpressed or suppressed during the formation of cataracts could lead to the development of markers for early diagnosis, preventative strategies and therapies that improve the quality of life for captive giant pandas and other mammals.

Conclusions
Blood samples from six giant pandas with and without cataracts were used for de novo RNA-Seq analysis. This revealed the differential expression of several genes related to oxidative damage, the visual cycle, developmental functions, and lipid metabolism, suggesting that ageing pandas may develop cataracts as the natural capacity for oxidative stress responses begins to diminish. We found that many of the genes encoded regulatory and signaling proteins associated with the control of cell growth, migration, differentiation and apoptosis, supporting previous research indicating a key role for apoptosis in cataract formation. The identification of genes potentially involved in the formation of age-related cataracts could facilitate the development of predictive markers, preventative measures and even new therapies to improve the life of captive animals.

Clinical findings
Routine physical examinations were carried out every month on the living captive animals, including eye, mouth and nose and general physical appearance, abdominal palpation, and general clinical signs. Blood was collected once a month for the analysis of physiological and biochemical indicators in order to exclude risk factors such as injury, diabetes or other diseases that can promote cataract formation [36].

RNA isolation
Peripheral blood samples (2 ml) were collected from all six giant panda specimens (Table 1) because markers for many diseases can be detected in blood using transcriptomics methods [61] including eye diseases [62][63][64]. JN and LL were from Beijing Zoo, XX and YE were from Chongqing Zoo, and BD and YY were from Strait (Fuzhou) Giant Panda Research and Exchange Center. Three of the females were diagnosed with age-related cataracts and the other three donors were healthy controls. Blood samples were stored at 4°C and total RNA was extracted within 2 days using the Trizol Total RNA Extractor kit (Sangon Biotech, Shanghai, China) according to the manufacturer's protocol. The samples were treated with RNase-free DNase I to remove genomic DNA. RNA integrity was evaluated by 1.0% agarose gel electrophoresis, and RNA quality and quantity were assessed using a NanoPhotometer (Implen, Westlake Village, CA, USA) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

Library preparation and sequencing
High-quality RNA samples were sent to Sangon Biotech for library preparation and sequencing. The libraries were generated from 2 μg RNA using the VAHTS mRNA-seq V2 Library Prep Kit for Illumina (Vazyme Biotech, Nanjing, China) and index codes were added to attribute sequences to each sample. Briefly, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. Fragmentation was carried out using divalent cations at 94°C in VAHTS 5× First Strand Synthesis Reaction Buffer, and first-strand cDNA was synthesized using random hexamers and M-MuLV reverse transcriptase RNase H − . Second-strand cDNA was synthesized using DNA polymerase I and RNase H, which also created blunt ends for 3′ polyadenylation. Following adapter ligation, cDNA fragments in the size range 150-200 bp were selected using the AMPure XP system (Beckman Coulter, Brea, CA, USA). The sizeselected and adaptor-ligated cDNA was incubated with 3 μL USER enzyme mix (New England Biolabs, Ipswich, MA, USA) for 15 min at 37°C then 5 min at 95°C. We then carried out PCR using Phusion high-fidelity DNA polymerase, universal PCR primers i7 (5′-CAA GCA GAA GAC GGC ATA CGA GAT-index primer-GTG ACT GGA GTT CAG ACG TGT GCT C-3′) and i5 (5′-A*A TGA TAC GGC GAC CAC CGA GAT CTA CAC-index primer-ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA T*C-3′) and sample-specific index primers ( Table 4). The PCR products were purified (AMPure XP system) and library quality was assessed using the Agilent Bioanalyzer 2100 system. The libraries were then quantified and pooled. Paired-end sequencing was carried out using a HiSeq XTen device (Illumina, San Diego, CA, USA).

Data assessment and quality control
The quality of the sequence data was determined using FastQC v0.11.2. Raw reads were filtered using Trimmomatic v0.36 in five steps. First, the adaptor sequences were removed. Next, low-quality bases (Q < 20) were removed from reads in the 3′ to 5′ direction and then in the 5′ to 3′ direction. Then we used a sliding window method to remove low-quality bases from the read tails (window size = 5 bp). Finally, any reads smaller than 35 nt were removed along with the corresponding paired reads.

Alignment with reference genome
Sequences randomly selected from clean data were used as blastn queries against the NCBI nucleotide database (http://ncbi.nlm.nih.gov/). Hits with an Evalue cutoff of ≤1 × 10 − 10 , similarity > 90% and coverage > 80% of the results were used to calculate the species distribution and pollution detection. The remaining clean reads were mapped to the reference genome using HISAT2 v2.0 with default parameters. RSeQC v2.6.1 was used for the statistical analysis of the alignment results.

Gene structure analysis
The sequence of each chromosome was compared and mapped statistically to show the distribution of sequences using BEDtools v2.26.0. After comparing the reads to the reference genome, the proportion of each gene structure in the reads was counted, including exons, introns and intergenic DNA, using Qualimap v2.2.1. Only 30 chromosomes were displayed. BCFtools v1.5 was used to find SNPs and SnpEff v2.36 was used to determine their effects. The sequences were filtered according to the mass value (> 20) and coverage (> 8).
The sequences were assembled using StringTie v1.3.3b, and compared with existing genomic data in the STRING database (http://string-db.org/) using GffCompare v0.10.1 to find new transcription regions. The predicted transcripts were used as gene models to identify variable splicing. ASprofile v1.0.4 is used to classify variable splicing events according to the predicted gene model of each sample.

Expression analysis
StringTie v1.3.3b was applied to the transcriptionally assembled reference genome to determine gene expression levels by measuring transcript abundances as transcripts per million reads (TPM) for both the protein-coding genes and lncRNAs in each sample. After quality control, the sequence was compared with the reference genome using HISAT2 v2.1.0 and the results were statistically compared using RSEQC v2.6.1, with the feature count used to homogenize the read count matrix (gene length and sequence depth) after each gene count to determine the TPM. Principal component analysis (PCA) and principal co-ordinates analysis (PCoA) were used to determine the distance and difference between samples in vegan v2.0.10. Data were presented as TPM avoid the influence of gene length or sequencing discrepancies during sample comparison. We used DESeq2 v1.12.4 to identify genes that were differentially expressed between two samples. Differential expression was considered significant if the false discovery rate (qvalue) was ≤0.05 and the log2 fold change (|FC| value) was ≥2. If the normalized expression of a gene between two samples was zero, its expression value was adjusted to 0.01 (because 0 cannot be plotted on a log plot). If the normalized expression of a gene in two libraries was < 1, further differential expression analysis was conducted without this gene.

Functional analysis of differentially expressed genes
Given the absence of biological replicate samples, we first standardized all the data with TMM. This removed all the unexpressed genes, identified a sample with average data trends from many samples as the reference sample, calculated the total number of reads for all samples, and divided each sample by its own total number of reads to get the modified number of reads. The Q3 value (the third quartile) of each sample's modified read number was calculated, averaged, and the sample with the smallest difference from the average Q3 was used as the reference sample. To find the representative gene sets in each sample and calculate the standardization factor of the sample, we referred to the fold change of these representative gene sets. Differentially expressed genes were identified by standardizing the read count data in DESeq2 v1.12.4 as described above. TopGO v2.24.0 was then used for GO enrichment analysis and to prepare the significant GO-directed acyclic graph. ClusterProfiler v3.0.5 was used to analyze KEGG pathways and KOG taxonomic enrichment analysis, allowing the construction of a network diagram. Functional enrichment analysis was carried out to identify differentially expressed genes significantly enriched in GO terms (biological, cellular and molecular functions) or KEGG metabolic pathways. Genes were mapped to the GO database (http://www. geeontology.org) and KEGG database (http://www. kegg.jp) [65][66][67], the number of genes representing each term or pathway was calculated, and hypergeometric tests were performed to identify significantly enriched GO terms or KEGG pathways in the gene list. GO terms and KEGG pathways were considered significant if the q-value was ≤0.05. The results of differential gene expression analysis were visualized using DESeq2 v1.12.4 and mapped to the STRING protein interaction network database (http:// string-db.org/) to construct the protein interaction network. Based on these results, Venn diagrams were prepared with VennDiagram v1.6.17 in the R package. The correlation of gene expression levels between