Genome-wide analysis of plant-specific YABBY transcription factor gene family in carrot (Daucus carota) and its comparison with Arabidopsis

The YABBY gene family encodes plant-specific transcription factors with a DNA-binding domain that are involved in various functions, i.e. regulation of style length in flowers and polarity development of lateral organs in flowering plants. Computational methods were used to identify members of the YABBY gene family, with the carrot (Daucus carota) genome as the foundational reference. Gene structure, chromosomal location, protein motifs, phylogeny, synteny, transcriptomic profiles, and miRNA targets were analyzed to uncover the structural and functional characteristics of the YABBY gene family in carrot. We identified 11 YABBY genes, unevenly distributed across all 9 chromosomes, whose proteins grouped into five subgroups, i.e. AtINO, AtCRC, AtYAB5, AtAFO, and AtYAB2, following the well-known classification of Arabidopsis. The expansion of YABBY genes in carrot was driven mainly by segmental duplication, which was more prevalent than tandem duplication. Transcriptomic analysis showed that one of the DcYABBY genes was highly expressed during anthocyanin pigmentation in carrot taproots. Analysis of cis-regulatory elements (CREs) revealed elements that respond to light, cell cycle regulation, drought inducibility, and the hormone ABA, as well as elements for seed and meristem expression. Furthermore, a comparative study of carrot and Arabidopsis genes of the YABBY family indicated 5 sub-families sharing common characteristics. The comprehensive evaluation of YABBY genes in the genome provides a direction for cloning these genes and understanding their functional properties in carrot. Our investigations revealed the genome-wide distribution and role of YABBY genes in carrot, with a best-fit comparison to Arabidopsis thaliana. Supplementary Information: The online version contains supplementary material available at 10.1186/s12863-024-01210-4.


Introduction
The YABBY plant-specific transcription factor (PSTrF) gene family plays an important role in plant development, i.e. regulation of style length in flowering plants [1], resistance against abiotic stresses [2], polarity development in lateral organs [3], developmental processes of vegetative and reproductive organs [4], initiation of signals responsible for plant hormonal reactions [5], development of vascular organs [6], development of the nectary [7], and seed germination and post-germination processes [8,9]. The DcYABBY genes are members of the YABBY superfamily and have functionally important domains, i.e. Hmg_box and Hmg_box2. These two domains and the YABBY domain contain highly conserved amino acid residues that function in specific DNA binding [10].
The carrot (D. carota L.) is a vital biennial vegetable in the Apiaceae family. The family Apiaceae also includes several other members, i.e. fennel (F. vulgare), celery (A. graveolens), parsley (P. crispum), cilantro (C. sativum) and dill (A. graveolens) [11,12]. Carrot is a cool-season biennial crop used for domestic, commercial, and medicinal purposes and has been cultivated for over 2000 years. It contains sufficient vitamins and amino acids and helps improve eyesight, lower cholesterol and improve digestion [13,14]. Antioxidants such as carotenoids and phenolic compounds are found in sufficient amounts in carrot and are beneficial in several biological processes of the human body [15]. The amount of carotenoids differs noticeably between carrot genotypes, which could be due to the physiological and evolutionary distribution of genomic features [16,17]. Carrots contain phenolic compounds with only one aromatic ring (phenolic acids), such as 3-O-caffeoylquinic acid [18]. For new marketable carrot varieties, sweetness is considered a significant factor for acceptance [19]. There is a need to develop highly productive varieties of crops like carrot with richer nutritional value to enhance the production of healthful foods across the globe [20]. For a balanced, secure, and healthy diet, these foods must be accessible worldwide [21]. Carrot suffers several kinds of physiological damage due to drought [22,23]. Therefore, we also examine whether the YABBY transcription factor gene family may help address this problem.
This research aims to discover and describe the genes belonging to the YABBY PSTrF gene family in the carrot genome using various bioinformatics tools [24]. In brief, an efficient approach was followed to find the YABBY gene family in carrot. This study identified YABBY genes and determined their chromosomal locations, exon structures, cis-regulatory elements, and conserved domains. A broad genome-wide assessment of the YABBY PSTrF gene family in carrot provides insights into its functional and structural properties, which can be used to strengthen the nutritional and food value of other horticultural crops.

Database search and sequence retrieval
The experimental data collection complied with relevant institutional, national, and international guidelines and legislation, with appropriate permissions from the authorities of the Department of Horticulture, University of the Punjab, Lahore 54300, Pakistan. The amino acid sequences of plant-specific transcription factors (PSTrFs), specifically YABBY, were obtained from the Arabidopsis thaliana proteome through the Pfam database (Pfam accession: PF04690). The 164-amino-acid YABBY sequence was taken from Arabidopsis thaliana (Accession No. A0A1P8APE2). This sequence was used as a BLASTP (protein Basic Local Alignment Search Tool) query for a heuristic search against the carrot genome using the proteome database at Ensembl Plants (https://plants.ensembl.org/index.html) [25][26][27]. Gene IDs, chromosomal positions, and the sequences of genes and proteins were retrieved. DcYABBY amino acid sequences were submitted to Motif Finder (https://www.genome.jp/tools/motif/) [28,29] and to the Conserved Domain Database (CDD) of the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov) [30,31] with customized parameters. Protein sequences lacking the conserved domain of YABBY proteins were excluded from further analysis.

Gene structure analysis
To predict the genomic architecture of carrot YABBY genes, the CDS and genomic sequences of DcYABBY genes were retrieved from Ensembl Plants [26,27]. These sequences and the carrot phylogenetic tree in Newick format were submitted to the Gene Structure Display Server (GSDS) (http://gsds.gao-lab.org) [35].

Duplication and syntenic gene analysis
Protein sequences were aligned using Molecular Evolutionary Genetics Analysis (MEGA) with default parameters. The Ka/Ks ratio was estimated using TBtools, and the genetic divergence time was calculated using the equation T = Ks/2r, where "r" is the neutral substitution rate (5.2 × 10⁻⁹ substitutions per site per year) [36,37].
Duplication events of DcYABBY genes were checked with the Multiple Collinearity Scan toolkit (MCScanX) with default settings [38,39]. Dual synteny analysis of carrot was performed against three crops, i.e. Arabidopsis, cucumber, and musk melon. A synteny plot of paralogous DcYABBY genes was created with the Circos module of TBtools [40].
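As a worked illustration of the divergence-time equation, the following minimal Python sketch computes T = Ks/(2r) and converts the result from years to million years ago (Mya). Only the formula and the rate r = 5.2 × 10⁻⁹ come from the text; the function name and the example Ks value are hypothetical and are not results of this study.

```python
def divergence_time_mya(ks: float, r: float = 5.2e-9) -> float:
    """Estimate divergence time in Mya as T = Ks / (2r).

    ks: synonymous substitutions per synonymous site for a gene pair.
    r:  neutral substitution rate (substitutions per site per year).
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if r <= 0:
        raise ValueError("substitution rate r must be positive")
    years = ks / (2.0 * r)  # divergence time in years
    return years / 1e6      # convert years to million years


# Hypothetical Ks value, chosen only to show the arithmetic.
example_ks = 0.52
print(f"Ks = {example_ks} -> T = {divergence_time_mya(example_ks):.2f} Mya")
```

Because T scales linearly with Ks, a pair with twice the Ks value is estimated to have duplicated twice as long ago under the same rate.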

Transcriptomic analysis
To check the specific expression of DcYABBY genes, RNA-Seq data were downloaded from NCBI GEO (https://www.ncbi.nlm.nih.gov/geo) [41][42][43]. Reads Per Kilobase per Million (RPKM) values were log2-transformed to assess the expression levels of the different DcYABBY genes. A heat map was generated with TBtools to display the expression levels of the different genes [40,44,45].
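The log2 transformation of RPKM values can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name, the sample labels, and the RPKM values are hypothetical, and the pseudocount of 1 (added so that zero RPKM maps to zero instead of being undefined) is an assumption, since the text does not state whether one was used.

```python
import math


def log2_transform(rpkm_by_sample, pseudocount=1.0):
    """Return log2(RPKM + pseudocount) for each sample of one gene."""
    transformed = {}
    for sample, rpkm in rpkm_by_sample.items():
        if rpkm < 0:
            raise ValueError(f"negative RPKM for sample {sample!r}")
        transformed[sample] = math.log2(rpkm + pseudocount)
    return transformed


# Hypothetical RPKM values for a single gene across three samples.
example_rpkm = {"dP_POP": 7.0, "dP_NPIP": 3.0, "pP_POP": 0.0}
transformed = log2_transform(example_rpkm)
for sample, value in transformed.items():
    print(sample, value)
```

The transformed values compress the dynamic range of expression, which makes differences between samples easier to read on a heat map colour scale.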

Analysis of microRNA target sites
The PmiREN web server (https://academic.oup.com) was used to acquire mature miRNA sequences for carrot [46,47]. To identify miRNAs targeting DcYABBY genes in carrot, the CDS sequences of DcYABBY genes were submitted to the miRNA and target section of the psRNATarget website (https://bio.tools/psrnatarget). The complementary miRNAs and their targets were then retrieved from this analysis [48,49].

Identification of the YABBY genes in carrot
In total, 22 DcYABBY proteins were identified from proteome BLAST searches of the carrot genome, and the sequences possessing the complete domain were subjected to further investigation. A total of 11 DcYABBY sequences were selected for analysis. The length of DcYABBY proteins ranged from 105 to 229 amino acids, while their molecular weight ranged from 12.17 to 25.23 kDa. DcYABBY8 is the shortest and DcYABBY1 is the longest protein (Table 1). The pI values of the identified proteins ranged from 6.82 to 9.16, which might be due to the increasing number of hydrophobic amino acids. Subcellular localization of these 11 YABBY genes showed that most were localized to the nucleus, a few to the chloroplast, and the fewest to the cytoplasm, as shown in Fig. 1.

Gene architecture and conserved motifs analysis
Seven of the eleven genes comprised 7 exons and 6 introns, two genes contained 6 exons and 5 introns, one gene comprised 4 exons and 3 introns, and the last gene contained 3 exons and 2 introns (Table S5, Fig. 2). This consistency in intron and exon numbers suggests that these genes share common ancestors as well as structural and functional features. The genomic architecture showed that DcYABBY8 contained 3 introns (27.27%), DcYABBY7 contained 4 introns (36.36%), and DcYABBY10 had 5 introns (45.45%), while DcYABBY1, DcYABBY2, DcYABBY3, DcYABBY4, DcYABBY5, DcYABBY6, DcYABBY9 and DcYABBY11 contained 6 introns (54.54%), as shown in Fig. 2. Motif identification revealed 10 conserved motifs in the 11 DcYABBY proteins. The YABBY domain was conserved in all the DcYABBY proteins, with several mutations. The motif … (Table S3, Fig. 3).

Phylogenetic analysis
A phylogenetic tree was constructed from the YABBY genes of D. carota, A. thaliana, C. sativus, and C. maxima. D. carota YABBY genes are highlighted with a small red triangle. The figure shows the division of 37 YABBY genes from four different crops. The grouping is based on the typical Arabidopsis phylogenetic grouping system. The phylogenetic analysis showed that the 11 DcYABBY proteins were distributed among 5 subgroups named AtINO, AtCRC, AtYAB5, AtAFO/AtYAB3 and AtYAB2 (Fig. 4, Table S4). Group AtINO consists of 6 YABBY proteins in total: 1 from Arabidopsis, i.e. AtINO, and the remaining ones are DcYABBY9, CmYABBY9, CmYABBY10, CsYABBY4, and CsYABBY8. The AtCRC group consists of 7 YABBY-like proteins: AtCRC, DcYABBY10, DcYABBY11, CmYABBY12, CmYABBY11, CmYABBY7 and CsYABBY3. The AtYAB5 group

Evaluation of gene duplication and gene mapping of carrot YABBY genes
The duplication dates of DcYABBY genes were calculated using TBtools v1.098669 (Fig. 6). The Ka/Ks ratio ranged from 0.08888631 for the DcYABBY7_DcYABBY8 pair to 0.1821759 for the DcYABBY4_DcYABBY2 pair. The estimated date of segmental duplication ranged from 51.0916678 million years ago (Mya) for the paralogous pair DcYABBY3_DcYABBY1, the most recent, to 463.797915 Mya for the paralogous pair DcYABBY7_DcYABBY8, the oldest.
The Ka/Ks ratios of all 5 paralogous pairs were greater than 0.05 and less than 1, indicating that these pairs diverged under purifying selection (Fig. 6).
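The interpretation of Ka/Ks ratios can be illustrated with a short Python sketch. It follows the usual convention (Ka/Ks < 1 purifying selection, Ka/Ks > 1 positive selection, Ka/Ks = 1 neutral evolution); the function name and the tolerance used for the neutral case are hypothetical choices, and the two ratios are the extremes reported above.

```python
def selection_mode(ka_ks: float, tol: float = 1e-9) -> str:
    """Classify the selection regime implied by a Ka/Ks ratio."""
    if ka_ks < 0:
        raise ValueError("Ka/Ks must be non-negative")
    if abs(ka_ks - 1.0) <= tol:
        return "neutral"
    return "purifying" if ka_ks < 1.0 else "positive"


# Lowest and highest Ka/Ks ratios reported for the paralogous pairs.
reported = {
    "DcYABBY7_DcYABBY8": 0.08888631,
    "DcYABBY4_DcYABBY2": 0.1821759,
}
modes = {pair: selection_mode(ratio) for pair, ratio in reported.items()}
for pair, mode in modes.items():
    print(pair, mode)
```

Since both extremes fall below 1, every pair in between is classified the same way under this convention.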

Analysis of Cis-regulatory elements
Various cis-regulatory elements with different physiological and biological functions were observed. These include light-responsive elements; elements responsive to abscisic acid, salicylic acid and gibberellins; and elements for anaerobic induction, meristem expression, seed-specific regulation, zein metabolism, and defence responses (Fig. … have the RY-element, which is mainly associated with seed regulation. In addition, 1 DcYABBY gene contained the GCN4_motif, which is involved in endosperm expression, and 2 DcYABBY genes possessed AT-rich elements, the binding site of the DNA-binding protein ATBP-1. The MSA-like element, which regulates the cell cycle, was present in 1 DcYABBY gene. One DcYABBY gene contained CCAT, a common binding site for MYBHv1, while 5 DcYABBY genes showed the CGTCA-motif, which is involved in methyl jasmonate responsiveness. Four DcYABBY genes had the G-box, which helps in responding to light; 5 DcYABBY genes contained the GT1-motif and 3 possessed MRE, both of which are light-responsive elements. The O2-site, which has an important role in zein metabolism, was possessed by 3 DcYABBY genes. The P-box, TATC-box and TCA-element were each contained by only 1 DcYABBY gene; the first two are gibberellin-responsive elements and the last is a salicylic acid-responsive element. The TCCC-motif, TCT-motif, and TGACG-motif were found in 3 and 5 DcYABBY genes, with varying functions (Figs. 7 and 8).
The physiological and biochemical functions of DcYABBY genes, together with their orthologues in Arabidopsis, were studied by gene ontology analysis (Table 3).

Transcriptomic analysis of carrot YABBY genes
Among all 11 DcYABBY genes, only 1 was involved in anthocyanin pigmentation in carrot taproots. DcYABBY9 (DCAR_007074) was expressed in dP2 POP and dP2 NPIP (Fig. 9). The level of gene expression varied slightly among these replicates. It was therefore concluded that DcYABBY9 contributes to the dark purple colour in the outer phloem of the carrot taproot by promoting anthocyanin pigmentation [41,42] (Fig. 9).

Putative miRNA targets in carrot
In total, 5 miRNAs targeted three of the 11 DcYABBY genes, i.e. DcYABBY2, DcYABBY3 and DcYABBY5. DcYABBY2 was targeted by 3 mature miRNAs with different PmiREN IDs, whereas DcYABBY3 and DcYABBY5 were targeted by the same single mature miRNA (Table 4). None of the mature miRNAs targeted the remaining 8 DcYABBY genes. DcYABBY2 was thus the individual gene targeted by the largest number of mature miRNAs. At the group level, AtAFO was targeted by 4 mature miRNAs, whereas AtYAB5 was targeted by the fewest, only 1 miRNA (Table 5).
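The per-gene tally can be reproduced with a short Python sketch. The miRNA names used here (MIR168a, MIR168b, MIR168c, MIR408a) follow the family members named in the Discussion; pairing them with genes in exactly this way is an assumption for illustration, since Table 4 is not reproduced in this text, and the variable names are hypothetical.

```python
from collections import defaultdict

# Assumed miRNA-target pairs, based on the names given in the Discussion.
pairs = [
    ("MIR168a", "DcYABBY2"),
    ("MIR168b", "DcYABBY2"),
    ("MIR168c", "DcYABBY2"),
    ("MIR408a", "DcYABBY3"),
    ("MIR408a", "DcYABBY5"),
]

# Collect the distinct miRNAs that target each gene.
mirnas_per_gene = defaultdict(set)
for mirna, gene in pairs:
    mirnas_per_gene[gene].add(mirna)

counts = {gene: len(mirnas) for gene, mirnas in mirnas_per_gene.items()}
most_targeted = max(counts, key=counts.get)

print(counts)
print("most targeted gene:", most_targeted)
```

Under this assumed pairing there are five miRNA-target pairs across three genes, with DcYABBY2 receiving the most.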

Discussion
Plant-specific transcription factors (PSTrFs) are important molecules with spatio-temporal functions that support plant development and growth. PSTrFs are key in defining the course of biological development and biochemical actions [22]. YABBY genes in carrot and other species act as TrFs and provide basic support during the developmental cycle. Phylogenetic and conserved sequence analyses of YABBY TrFs in Arabidopsis thaliana and eggplant span five families: AtINO, AtCRC, AtYAB5, AtAFO/AtYAB3, and AtYAB2.
Fig. 8 Graphical representation of the cis-regulatory elements of DcYABBY genes, with the intensity of their function at various levels in each gene's promoter region. Functional intensity is shown from red (high) to blue (low) during biochemical and physiological plant development
The genomic identification of DcYABBY genes was completed by comparing recently released genomic features from the comprehensive plant repository Ensembl Plants [26,27,82] (Table 1). Phylogenetic findings classify the 11 DcYABBY genes into the five A. thaliana-based groups AtINO, AtCRC, AtYAB5, AtAFO/AtYAB3, and AtYAB2 (Fig. 5, Table S4). This distribution suggests lower sequence-level conservation among carrot YABBY genes. The number of YABBY TFs in carrot is lower than in other domesticated and model plants, i.e. rice has 30 OsYABBY, Arabidopsis 36 AtYABBY, tomato 34 SiYABBY [83], banana 74 MaYABBY [84], and Chinese cabbage 76 BrATYABBY [85,86]. The weakly correlated numbers of introns and exons in these families reflect purifying selection and evolutionary instability with divergent evolution. A higher number of introns in a plant genome provides information regarding its evolutionary and genomic stability. The genomic architecture and its correlation with phylogeny give a clear picture of the evolutionary relationships among various YABBY gene families [87,88].
Genes with similar characteristics had the same number of introns and exons at the genomic level (Table S2). Members of the same DcYABBY clade have an almost identical number of exons and introns (Fig. 2), while clades of different families, i.e. in Arabidopsis, rice and soybean, have different numbers of introns and exons, suggesting conservation of characteristic sequences among them [89,90].
The conservation from sequence to function was assessed by identifying motif sequences (Fig. 3) in all DcYABBY genes at the protein level, ranging from 15 to 167 amino acids (Table S3), along with the frequently occurring HMG box domain (Table S5, S9). All DcYABBY proteins contained Motif 1, and motifs named Hmg_box and Hmg_box2 are also present; at the functional level, the HMG box is responsible for binding DNA. Similarities at the sequence level point to functional and structural correlation. The preservation of evolutionary traits leads to the rearrangement and structuring of domains while maintaining consistent functionality. To confirm these functional similarities, gene ontology (GO) annotation of AtAFO genes in Arabidopsis thaliana was undertaken. Evolutionary gene expansion might cause arrangements of the YABBY domains to have similar motif patterns in different groups. To recognize the possible function of Group AtAFO, which contained five DcYABBY genes and several similar motifs, GO annotations of the Group AtAFO genes in Arabidopsis were examined, revealing similarities between DcYABBY genes and AtAFO in transcriptional functions, cis-regulatory region binding, DNA binding, protein binding and ion channelling (Table 3) [91,92]. The structural arrangement of the DcYABBY genes was conserved among all five groups across species, i.e. Arabidopsis, cucumber, and musk melon [93]. Furthermore, subcellular localization of DcYABBY proteins was investigated using the online web tool WoLF PSORT [34]; some DcYABBY proteins were predicted in the cytoplasm and chloroplast, while all were commonly present in the nucleus (Table S1). Segmental and tandem duplications were observed in the YABBY gene family on various chromosomes, which gives a clear picture of genomic rearrangements during the evolutionary process. These rearrangements at the genome level lead to the development of new characters, i.e. conserved sequences and domains that sustain the functional characteristics of plants [94]. The most notable tandem and segmental duplications in carrot YABBY genes, on chromosome 2 (Fig. 7A) and between DcYABBY1 and DcYABBY3, DcYABBY2 and DcYABBY4, and DcYABBY10 and DcYABBY11 (Fig. 7B), were found in this research. Segmental duplications are dominant in chickpea [93], pigeon pea [15,92], and in the YABBY gene family. These results indicate that the expansion of genes and conserved regions at the genomic level is mainly due to duplications of YABBY genes throughout the evolution of eukaryotic plants [95,96]. Selection at the amino acid level and the substitution rates, i.e. Ka and Ks (Fig. 5), support the finding that YABBY genes have evolved while retaining their function. A Ka/Ks ratio < 1 indicates purifying selection, while a Ka/Ks ratio > 1 indicates positive selection. This selection pressure from the biological clock and the environment leads to the rearrangement of specific blocks and domains, resulting in the origination of new characteristics across species [97]. In the current investigation, the variation among the Ka/Ks ratios of DcYABBY genes is small, and the predicted Ka/Ks values range from 0.09 to 0.29, which are less than 1.
Cluster analysis and Gene Expression Omnibus (GEO) data at NCBI were used to uncover the spatio-temporal function of carrot YABBY genes, and 1 of the 11 DcYABBY genes was involved in anthocyanin pigmentation in the carrot taproot [41,42]. DcYABBY9 (DCAR_007074) was highly expressed in dP2 POP and dP2 NPIP (Fig. 9) [98]. Except for DcYABBY9, none of the genes showed expression or function related to anthocyanin pigmentation. The cis-regulatory analysis also predicts that DcYABBY9 has a role in light responsiveness, regulation of zein metabolism, meristem expression, drought inducibility, anaerobic metabolism during abiotic stress, and defence responsiveness (Fig. 8, Table 2). The Arabidopsis orthologues of DcYABBY9 are AT1G23420 and AtINO, which are in the same group and have a role in DNA and metal ion binding. The orthologues of these three aforementioned Arabidopsis proteins are DcYABBY 10, 11 and 12, which suggests that they have functions in carrot similar to those of their orthologues in Arabidopsis.
MicroRNAs are important in plant growth regulation processes ranging from development to defence against pathogens and maintenance of internal immunity [99][100][101][102]. MiRNAs are present in most plant species in a conserved manner with specified functions. Most DcYABBY genes have transcription-associated functions, which may limit their targeting by miRNAs. This may explain why only three of the 11 DcYABBY genes were targeted, by members of the MIR408 and MIR168 families (Table 5). MIR408 targeted two DcYABBY genes (DcYABBY3 and DcYABBY5), while MIR168 targeted one (DcYABBY2). DcYABBY2 was targeted by three miRNAs, i.e. MIR168a and MIR168b, which reside on chromosome 1, and MIR168c on chromosome 9 of carrot. Meanwhile, DcYABBY3 and DcYABBY5 were both targeted by MIR408a, located on chromosome 1. This suggests that most of these miRNAs originate from, and act from, chromosome 1. MiR408 is abundant in different plant species and specifically targets mRNAs encoding copper-binding proteins. Overexpression of MIR408 was shown to improve phenotypic properties of Arabidopsis by increasing leaf area, plant height, petiole length, flower size, and silique length, which ultimately enhances seed yield and biomass [103]. MiR408 has diverse roles in Arabidopsis, from which we can assume that this miRNA, by targeting DcYABBY genes, may also play an important role in enriching carrot nutrients. Overexpression of miR408 enhanced drought tolerance in chickpea by suppressing plantacyanin transcripts, which regulates DREB and other genes related to drought response [104]. In response to miR168, Argonaute (AGO1) is upregulated, activating the RNA-induced silencing complex (RISC) in tomato to modulate the small RNA regulatory pathway [105]. The suppression of miR168 by a target mimic (MIM168) not only improves grain yield and shortens flowering time in rice but also enhances immunity to Magnaporthe oryzae, the causal agent of rice blast disease.

Conclusion
This study comprehensively analyzed DcYABBY PSTrF genes in the carrot genome. The 11 DcYABBY genes were classified into five groups, and some of the structural and functional properties of each DcYABBY member were characterized. Some of the DcYABBY genes were involved in taproot pigmentation. The miRNAs targeting DcYABBY genes suggest a role for these genes in growth and development, including anthocyanin pigmentation in carrot. The in-depth computational analysis of carrot YABBY proteins presented in the current study is a first step towards uncovering the hidden properties of YABBY proteins in carrot in comparison with other crops. Complex interaction and cooperation at the functional level of YABBY proteins reflect their expression level and interaction with different transcription factors. The presence of an almost similar number of YABBY genes, i.e. 33 in tomato, 34 in pepper, and 35 in potato, and a relatively higher number in other plants, 78 in soybean and 51 in carrot, suggests variation in YABBY genes at the genomic, structural and functional levels.

Fig. 1 Heat map representing the subcellular localization of all 11 DcYABBY genes to various regions of the plant cell, including nucleus, cytoplasm and chloroplast. Grey represents absence of the respective gene in a specific region, white shows the minimum functional presence of the corresponding gene, and red represents the maximum value of the functionally important gene in that particular region

Fig. 3 The distribution of 10 motifs along the 11 YABBY proteins in carrot. Motifs are conserved throughout the YABBY protein family and are basic structural and functionally important regulators during transient interaction and activation of transcription factors

Fig. 9 Heat map of carrot YABBY genes responsible for anthocyanin pigmentation; expression intensity from high to low is shown from red to blue. dP POP (Dark Purple Outer Phloem), dP NPIP (Dark Purple Inner Phloem), pP POP (Pale Purple Outer Phloem) and pP NPIP (Pale Purple Inner Phloem)

Table 1
Details of 11 non-redundant YABBY genes identified from the genome of carrot. AA: amino acid; MW: molecular weight; PI: isoelectric point; Chr: chromosome

Table 2
The spatio-temporal functional distribution of the cis-regulatory elements of YABBY genes among various tissues and organs during plant biological development

Table 3
Physiological and biochemical functions of DcYABBY genes with their orthologues in Arabidopsis

Table 4
miRNAs with their target genes, lengths, start positions and aligned sequence details

Table 5
Functions of miRNAs and their role in gene regulation during the developmental stages