Genome-wide identification and expression analysis of the calmodulin-binding transcription activator (CAMTA) gene family in wheat (Triticum aestivum L.)

Background Plant calmodulin-binding transcription activator (CAMTA) proteins play important roles in hormone signal transduction, developmental regulation, and environmental stress tolerance. However, the CAMTA gene family has not been systematically characterized in wheat. Results In this work, 15 wheat CAMTA genes were identified using a genome-wide search method. Their chromosome location, physicochemical properties, subcellular localization, gene structure, protein domain, and promoter cis-elements were systematically analyzed. Phylogenetic analysis classified the TaCAMTA genes into three groups (groups A, B, and C) containing 7, 6, and 2 members, respectively. Most TaCAMTA genes contained stress-related cis-elements in their promoters. Finally, to obtain tissue-specific and stress-responsive candidates, the expression profiles of the TaCAMTAs in various tissues and under biotic and abiotic stresses were investigated. Tissue-specific expression analysis showed that all 15 TaCAMTA genes were expressed in multiple tissues at different levels, and each TaCAMTA gene responded to at least one abiotic stress. In addition, 584 genes in the wheat genome were predicted to be potential CAMTA target genes, suggesting that CAMTAs may be widely involved in plant growth and development as well as in coping with stresses. Conclusions This work systematically identified the CAMTA gene family in wheat at the genome-wide level, providing important candidates for further functional analysis in developmental regulation and the stress response in wheat.


Background
Ca2+ signals, among the most important second messengers in plants, are widely involved in many adaptive and developmental processes [1]. In plants, three main classes of Ca2+ sensors decode and transmit Ca2+ signals: calmodulins together with calmodulin-like proteins (CaMs/CMLs), calcium-dependent protein kinases (CDPKs), and calcineurin B-like proteins (CBLs) [2]. Most calmodulin/calmodulin-like proteins execute their biological functions by binding to calmodulin-binding proteins (CaMBPs), including transcription factors, protein kinases, ion channels, and enzymes, with the exception of CaM7, which can act as a transcription factor to directly regulate the expression of the HY5 gene [3]. To our knowledge, plant CaMs can regulate at least 90 transcription factors, including calmodulin-binding transcription activators (CAMTAs) [4].
CAMTAs have been shown to be extensively involved in plant growth and developmental regulation, as well as in biotic and abiotic stress tolerance. In Arabidopsis, CAMTA1 and CAMTA2 work in concert with CAMTA3 to directly bind to the promoter of C-repeat binding factor2 (CBF2) and induce its expression, leading to increased plant freezing tolerance [16,17]. AtCAMTA1 also positively regulates drought responses by regulating a few stress-responsive genes, including responsive to dehydration26 (RD26), early response to dehydration7 (ERD7), responsive to ABA18 (RAB18), lipid transfer proteins (LTPs), cold-regulated78 (COR78), CBF1, and heat shock proteins (HSPs) [18], whereas AtCAMTA3 can act as a negative regulator of plant immunity to modulate pathogen defense responses by activating the EDS1-mediated salicylic acid (SA) signaling [19]. A recent study showed that TaCAMTA4 may function as a negative regulator of the defense response against Puccinia triticina, since virus-induced gene silencing (VIGS)-based knockdown of TaCAMTA4 resulted in enhanced resistance to P. triticina race 165 [20]. These findings suggest that a single CAMTA member usually participates in multiple signaling pathways, while multiple CAMTA members often work together in one signaling pathway.
Here, we identified 15 TaCAMTA genes in the wheat genome. Their chromosome location, physicochemical properties, subcellular localization, gene structure, protein domain, promoter cis-elements, and expression profiles in multiple tissues as well as in response to stresses were systematically analyzed. Our work establishes a foundation for the further analysis of wheat CAMTA genes and provides a basic understanding of their roles in development and stress responses.

Results and discussion
Identification of the TaCAMTA gene family in wheat
Using the method described below, a total of 15 TaCAMTA genes were identified in wheat. Since the TaCAMTA genes were clustered into six homoeologous groups, these genes were designated as TaCAMTA1 to TaCAMTA6 according to their homology with rice CAMTA genes, plus a suffix corresponding to the specific wheat genome identifier (A, B, or D) for each gene name (Table 1, Fig. 1). For example, the TaCAMTA1 genes in genomes A, B, and D were named TaCAMTA1-A, TaCAMTA1-B, and TaCAMTA1-D, respectively. TaCAMTA1, 2, 3, and 4 each contained three homoeologous genes (Table 1). The predicted TaCAMTA proteins contain 805 (TaCAMTA1-B) to 1067 (TaCAMTA2-B) amino acid residues, with molecular weights ranging from 90.82 kDa (TaCAMTA1-B) to 119.32 kDa (TaCAMTA2-A) and isoelectric points ranging from 5.14 (TaCAMTA4-B) to 8.96 (TaCAMTA5-A) (Table 1).
The size of the CAMTA gene family in wheat is similar to that of oilseed rape (B. napus) and soybean (G. max) with 18 and 15 members [12,13], respectively, but is higher than that of A. thaliana with six members, citrus (C. sinensis and C. clementina) with nine members, maize (Z. mays) with nine members, and alfalfa (M. truncatula) with seven members [5,11,14,15]. The higher number of CAMTA genes may be due to gene duplication during chromosome polyploidization, since oilseed rape and soybean are tetraploid, whereas wheat is allohexaploid (AABBDD).
The subcellular locations were predicted with Plant-mPLoc. According to the predictions, all 15 wheat CAMTA proteins are located in the nucleus, in agreement with recent studies in which CAMTA proteins have typically been found in the nucleus [4,21]. This is consistent with their main function as transcription factors that regulate the expression of other genes.

Phylogenetic analysis of the TaCAMTAs
To investigate the phylogenetic relationships of the CAMTA gene families, a phylogenetic tree of CAMTAs from five species (wheat, Triticum urartu, Aegilops tauschii, A. thaliana, and rice) was constructed using the neighbor-joining (NJ) algorithm. The CAMTA gene families were highly conserved during the evolution of these species (Fig. 1). All 36 proteins from the five species clustered distinctly into three groups (groups A, B, and C), which contained seven, six, and two wheat CAMTAs, respectively.
Fig. 1 An unrooted phylogenetic tree was constructed using MEGA-X with the NJ algorithm and 1000 bootstrap replicates. The bootstrap values are displayed next to the branches, and the wheat CAMTAs are marked in red.
The conserved domains identified in the TaCAMTA proteins included an IQ motif (Pfam00612) and a calmodulin-binding domain (CaMB) (Fig. 3). Additionally, five TaCAMTA proteins (TaCAMTA1-A/B/D and TaCAMTA5-A/D) contained all of the conserved domains except for the TIG domain, which is consistent with previous reports that CAMTAs can be divided into two groups based on whether the TIG domain is present [22]. It has been confirmed that the IQ motif is able to bind CaM in a Ca2+-independent manner, while the CaMB domain interacts with CaM in a Ca2+-dependent manner [5,7,8]. Notably, all the wheat CAMTAs contain both the IQ motif and a CaMB domain, indicating that wheat CAMTAs may interact with CaM in both a Ca2+-dependent and a Ca2+-independent manner.
Various known stress- and stimulus-related cis-acting elements were found in the promoter regions of the 15 TaCAMTA genes. ABRE, SARE, W-box, and CG-box were present in the promoters of all 15 TaCAMTA genes, and four TaCAMTAs (TaCAMTA1-D, 3-B, 4-A, and 4-D) contained all seven types of cis-elements in their promoter regions (ABRE, SARE, G-box, W-box, P1BS, SURE, and CG-box). The remaining 11 TaCAMTA genes each contained at least five types of cis-elements in their promoter regions (Table 2). More stress-related cis-elements are located in the promoter regions of wheat CAMTA genes than has been reported for other plant species [13,14], indicating that wheat CAMTA genes may be more widely involved in the plant response to stress.

Tissue-specific expression patterns of the TaCAMTA genes
To elucidate the possible functions of the TaCAMTA genes in wheat, a qRT-PCR assay was performed to investigate the spatial expression patterns of the TaCAMTAs. The results showed that all 15 TaCAMTA genes were expressed in multiple tissues at different expression levels. TaCAMTA3-D, 5-A, and 5-D showed their highest expression levels in the shoot at the seedling stage, while the highest expression levels of TaCAMTA1-D and 3-B were observed in the spike at the reproductive stage, suggesting that the CAMTA gene members have different functions in wheat growth and development (Fig. 4).
Fig. 4 Expression of the TaCAMTAs was analyzed by qRT-PCR in the root and shoot of ten-day-old seedlings; in the root, stem, leaf, and spike at flowering in the reproductive stage; and in the grain at 15 DAA (days after anthesis). The relative expression levels were normalized to 1 in roots of ten-day-old seedlings (0 h).

Expression profiles of the TaCAMTA genes during abiotic stress
Previous studies have shown that plant CAMTAs are involved in responses to diverse environmental stresses. AtCAMTA1 and SlSR1L play a positive role under drought stress in Arabidopsis and tomato, respectively [18,30], while plant CAMTAs also respond to salt and cold stress [11,16,31]. However, to date there is no information available on the involvement of wheat CAMTAs in abiotic stresses. We therefore analyzed the expression profiles of the TaCAMTAs under drought, NaCl, cold, and heat stress. Under drought stress, TaCAMTA1 [...] (Fig. 5b). In the cold treatment assay, the expression of TaCAMTA1-A, 1-D, 3-A, and 3-D increased dramatically, while the expression of TaCAMTA2-A, 4-A, 4-B, and 4-D decreased (Fig. 5c). In the heat treatment group, the expression of TaCAMTA1-A, 1-B, 1-D, 2-A, and 4-B increased markedly within one hour; by contrast, the expression of TaCAMTA2-B, 2-D, 3-B, 4-A, 5-A, 5-D, and 6-B was repressed, especially in the late stage of heat treatment (Fig. 5d).
Each TaCAMTA gene responded to at least one abiotic stress, and TaCAMTA1-A and 1-D were upregulated by all abiotic stresses used in this study, including drought, NaCl, cold, and heat stress (Fig. 5), implying that the TaCAMTA gene members are regulated differently and have different functions in coping with various abiotic stresses in wheat. CAMTA genes from the same homoeologous group often showed similar expression patterns, such as TaCAMTA1-A/B/D under drought treatment (Fig. 5a), TaCAMTA5-A/D under NaCl treatment (Fig. 5b), and TaCAMTA1-A/B/D under heat shock stress (Fig. 5d). However, several homoeologous CAMTA genes from the same group showed different expression patterns under stress. For example, TaCAMTA1-A/D and TaCAMTA3-A/D were upregulated by cold treatment, while the expression of TaCAMTA1-B and TaCAMTA3-B remained relatively stable (Fig. 5c). These results suggest that homoeologous CAMTA genes from the same group generally share the same regulation and functions, while functional differentiation may have occurred in some homoeologous CAMTA genes.
Fig. 5 Expression of the TaCAMTAs was analyzed by qRT-PCR in roots of ten-day-old seedlings that had been treated with 16.1% PEG 6000 (drought), 200 mM NaCl, 4°C (cold), or 40°C (heat) for the indicated durations. The relative expression levels were normalized to 1 in unstressed plants (0 h).
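The normalization to 1 at 0 h can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt calculation with a single reference gene (for example TaActin, for which primers are listed in Additional file 2); the exact calculation is not stated in this section, and all Ct values and function names below are invented for illustration.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct over the technical replicates."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample, control):
    """Relative expression by 2^-ddCt (assumed method); control maps to 1.0."""
    ddct = delta_ct(*sample) - delta_ct(*control)
    return 2.0 ** (-ddct)


def mean_relative_expression(samples, controls):
    """Average the fold changes of paired biological replicates."""
    return mean([fold_change(s, c) for s, c in zip(samples, controls)])


# Hypothetical data: one (target Cts, reference Cts) pair per biological
# replicate, each holding three technical replicates.
CONTROL_0H = [([25.0, 25.0, 25.0], [20.0, 20.0, 20.0])] * 3
TREATED_1H = [
    ([23.0, 23.0, 23.0], [20.0, 20.0, 20.0]),
    ([24.0, 24.0, 24.0], [20.0, 20.0, 20.0]),
    ([23.0, 23.0, 23.0], [20.0, 20.0, 20.0]),
]

print(mean_relative_expression(CONTROL_0H, CONTROL_0H))
print(mean_relative_expression(TREATED_1H, CONTROL_0H))
```

Pairing each treated replicate with its own 0 h control is one possible design; averaging the controls first would be an equally reasonable choice.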

Prediction of target genes by CAMTA
CAMTAs bind specifically to the CGCG box in the promoters of their target genes [5]. In this study, a database search revealed that the cis-acting elements ACGCGG/CCGCGT were present in more than two copies in the promoter regions of about 584 genes in the wheat genome, which were considered potential CAMTA target genes (Additional file 1: Table S1). These genes are related to RNA regulation (69 genes), protein degradation (42 genes), signal transduction (30 genes), biotic and abiotic stresses (17 genes), hormone metabolism (17 genes), and lipid metabolism (13 genes), suggesting that CAMTAs may be widely involved in plant growth and development as well as in coping with stresses.
Table 2 Numbers of stress-related cis-elements in the promoter regions of the TaCAMTA genes
Gene ABRE SARE G-box W-box P1BS SURE CG-box
TaCAMTA1-A 11 1 1 4 1 0 17
TaCAMTA1-B 4 3 0 5 1 1 1 1
TaCAMTA1-D 7 7 2 3 1 3 1 0
TaCAMTA2

Conclusions
In conclusion, 15 CAMTA genes were identified in wheat in the present study. Analysis of the gene structure, protein domains, physicochemical properties, and phylogenetic relationships indicated that the CAMTA gene family was highly conserved during plant evolution. Tissue-specific expression analysis showed that all 15 TaCAMTA genes were expressed in multiple tissues at different expression levels, suggesting that the CAMTA gene members have different functions in wheat growth and development. Under abiotic stress, every TaCAMTA gene responded to at least one stress treatment, implying that the TaCAMTA gene members are regulated differently and have different functions in coping with various abiotic stresses in wheat. In addition, 584 genes in the wheat genome were predicted to be potential CAMTA target genes, suggesting that CAMTAs may be widely involved in plant growth and development as well as in coping with stresses. Our findings provide new insight into the CAMTA gene family in wheat and a foundation for further studies on the roles of TaCAMTA genes in wheat growth, development, and stress responses.

Methods
Genome-wide identification of the CAMTA gene family
Protein sequences of Triticum aestivum (IWGSC1.1), Triticum urartu (ASM34745v1), and Aegilops tauschii (ASM34733v1) were obtained from the Ensembl Plants database (http://plants.ensembl.org) to predict the CAMTA genes [32]. The Hidden Markov Model (HMM) profiles of the CG-1 domain (PF03859), the ANK repeat domain (PF00023), and the IQ domain (PF00612) were downloaded from the Pfam database [33] and used to search all wheat protein sequences with the HMMER search tool at an E-value <= 0.0001 [34]. The obtained protein sequences were checked using the National Center for Biotechnology Information (NCBI) Conserved Domain Database (CDD) search (https://www.ncbi.nlm.nih.gov/cdd) with the default parameters to identify the conserved protein domains. Among the sequences containing complete CG-1, ANK repeat, and IQ domains, redundant sequences were removed by alignment, and the remainder were considered putative CAMTA genes. Finally, the biochemical parameters of the TaCAMTA proteins were calculated using the Compute pI/MW tool in the ExPASy database with the default parameters (https://web.expasy.org/compute_pi/). Subcellular localization of the TaCAMTA proteins was predicted online by Plant-mPLoc with the default parameters (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/) [35].
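The domain-based filtering step can be sketched as follows. This is a minimal illustration, assuming the HMMER results have already been reduced to (protein, domain, E-value) records; the protein IDs and E-values are invented, and the function name is hypothetical rather than part of any tool used in this study.

```python
# Hypothetical search hits: (protein_id, domain, e_value).
# IDs and E-values are invented for illustration only.
HITS = [
    ("prot1", "CG-1", 1e-40),
    ("prot1", "ANK", 2e-8),
    ("prot1", "IQ", 5e-6),
    ("prot2", "CG-1", 3e-35),
    ("prot2", "ANK", 1e-2),  # fails the E-value cutoff
    ("prot2", "IQ", 4e-7),
    ("prot3", "ANK", 1e-10),  # no CG-1 hit at all
    ("prot3", "IQ", 1e-9),
]

REQUIRED_DOMAINS = {"CG-1", "ANK", "IQ"}
E_VALUE_CUTOFF = 1e-4  # threshold stated in the text


def candidate_camtas(hits, required=REQUIRED_DOMAINS, cutoff=E_VALUE_CUTOFF):
    """Return sorted IDs of proteins with a passing hit for every required domain."""
    domains_by_protein = {}
    for protein_id, domain, e_value in hits:
        if e_value <= cutoff:
            domains_by_protein.setdefault(protein_id, set()).add(domain)
    # A protein is kept only if its passing domains cover the required set.
    return sorted(p for p, found in domains_by_protein.items() if required <= found)


print(candidate_camtas(HITS))
```

Only prot1 survives here, because prot2 has no ANK hit below the cutoff and prot3 lacks the CG-1 domain. Redundancy removal by alignment and the CDD check would follow as separate steps.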

Phylogenetic tree construction and sequence analysis
Protein sequences from Arabidopsis and rice were obtained from NCBI (http://www.ncbi.nlm.nih.gov/) and Ensembl Plants (http://plants.ensembl.org/index.html). The amino acid sequences of all CAMTAs were aligned using the ClustalX program with the default parameters, and a phylogenetic tree was constructed in MEGA-X using the neighbor-joining method with 1000 bootstrap replicates [36]. The display of the phylogenetic tree was optimized using the Interactive Tree Of Life (iTOL) v4 [37]. The schematic structures of the TaCAMTA genes were analyzed online using the Gene Structure Display Server 2.0 based on exon/intron data (GSDS 2.0, http://gsds.cbi.pku.edu.cn/) [38]. The domain structures of the TaCAMTA proteins were analyzed in the Pfam database (http://pfam.janelia.org/) and with the NCBI Conserved Domains Search online tool against database CDD v3.18 with an E-value threshold <= 0.01 (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) [39]. The CaMB domain was specifically analyzed using Calmodulin Binding Site Search in the Calmodulin Target Database (http://calcium.uhnres.utoronto.ca/ctdb/ctdb/).
[...] [42]. All experiments were performed with three technical replicates and three biological replicates, and the data are presented as the mean of the three biological replicates.

Prediction of target genes by CAMTA
Prediction of the CAMTA target genes was performed as described by Yang and Poovaiah (2002) with some modifications [5]. The 1-kb sequences upstream of the initiation codon (ATG) of all genes in the wheat genome were collected as promoter sequences and searched for the cis-acting elements ACGCGG/CCGCGT (CGCG box). Genes with more than two copies of the CGCG box were considered potential CAMTA target genes. The MapMan tool was used to assign the gene sets to functional categories (BINs). A MapMan mapping file, which maps genes into BINs via hierarchical ontologies by searching a variety of reference databases, was generated using the Mercator tool (http://mapman.gabipd.org/web/guest/app/mercator) [43].
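The promoter scan described above can be sketched in a few lines. This is an illustrative sketch, not the pipeline used in this study: the promoter sequences and gene names are invented, the function names are hypothetical, and whether overlapping matches were counted is not stated, so the lookahead pattern is an assumption. Because ACGCGG and CCGCGT are reverse complements of each other, scanning one strand for both motifs covers both strands.

```python
import re

# Both CGCG-box variants from the text; the lookahead lets matches overlap.
CGCG_BOX = re.compile(r"(?=(ACGCGG|CCGCGT))")


def count_cgcg_boxes(promoter):
    """Count occurrences of ACGCGG or CCGCGT in a promoter sequence."""
    return len(CGCG_BOX.findall(promoter.upper()))


def predict_targets(promoters, min_copies=3):
    """Return gene IDs whose promoter carries more than two CGCG boxes."""
    return sorted(
        gene for gene, seq in promoters.items()
        if count_cgcg_boxes(seq) >= min_copies
    )


# Hypothetical promoters, shortened far below the 1-kb windows in the text.
PROMOTERS = {
    "geneA": "TTACGCGGAACCGCGTTTACGCGGAT",  # three boxes
    "geneB": "ACGCGGTTTTCCGCGT",            # two boxes
    "geneC": "ATATATAT",                    # none
}

print(predict_targets(PROMOTERS))
```

With these toy sequences only geneA passes, since the criterion of more than two copies excludes geneB.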
Additional file 2: Table S2. Primer sequences of TaCAMTA and TaActin genes used for qRT-PCR analysis.