Genetic diversity and population structure analyses of Plectranthus edulis (Vatke) Agnew collections from diverse agro-ecologies in Ethiopia using newly developed EST-SSRs marker system
BMC Genetics volume 19, Article number: 92 (2018)
Plectranthus edulis (Vatke) Agnew (locally known as Ethiopian dinich or Ethiopian potato) is one of the most economically important edible tuber crops indigenous to Ethiopia. Evaluating the extent of genetic diversity within and among populations is one of the first and most important steps in breeding and conservation measures. Hence, this study was aimed at evaluating the genetic diversity and population structure of this crop using collections from diverse agro-ecologies in Ethiopia.
Twenty polymorphic expressed sequence tag-based simple sequence repeat (EST-SSR) markers were developed for P. edulis based on EST sequences of P. barbatus deposited in GenBank. These markers were used for genetic diversity analyses of 287 individual plants representing 12 populations, and a total of 128 alleles were identified across all loci and populations. Different parameters were used to estimate the genetic diversity within populations; the gene diversity index (GD) ranged from 0.31 to 0.39 with an overall mean of 0.35. Hierarchical analysis of molecular variance (AMOVA) showed significant but low population differentiation, with only 3% of the total variation attributable to variation among populations. Likewise, cluster and STRUCTURE analyses did not group the populations into sharply distinct clusters, which could be attributed to historical and contemporary gene flow and the reproductive biology of the crop.
These newly developed EST-SSR markers are highly polymorphic within P. edulis and hence are valuable genetic tools that can be used to evaluate the extent of genetic diversity and population structure of not only P. edulis but also various other species within the Lamiaceae family. Among the 12 populations studied, populations collected from Wenbera, Awi and Wolaita showed a higher genetic diversity as compared to other populations, and hence these areas can be considered as hot spots for in-situ conservation as well as for identification of genotypes that can be used in breeding programs.
Ethiopia is one of the countries with the greatest crop genetic diversity and is considered a primary gene center for several crops [1,2,3], including edible roots and tubers. However, the food potential of horticultural crops, particularly that of indigenous edible roots and tubers, has not been fully exploited despite their significant contributions to the livelihood of subsistence farmers. Such crops are overlooked in terms of research and breeding, and their production and management systems are restricted to local farmers’ varieties maintained by farmers using local knowledge [5,6,7].
Plectranthus edulis (Vatke) Agnew (Lamiaceae), locally known as Ethiopian potato or Ethiopian dinich, is one of the most economically important indigenous tuber crops [2, 8], with wide distribution in parts of Africa, largely in wild form. It produces edible stem tubers on stolons below ground. The crop can thrive and set tubers under significant environmental constraints, including degraded and poor soil, and can produce a reasonable yield without intensive management practices.
In Ethiopia, this edible tuber crop is commonly cultivated by smallholder farmers around homesteads, alone or mixed with cereals, fruits and pulses, largely for household consumption and rarely for marketing of its tubers. It was one of the most common Ethiopian staple crops and is usually called a ‘hunger crop’ because it fills the food supply gap that occurs from August to November, the period before the harvest of cereal crops. It has also been widely used as a folk medicine in several parts of Africa, including Ethiopia, and is commonly visited by honeybees for its nectar.
The local gene pool of the crop is now under threat of severe genetic erosion: the crop is disappearing from several areas where it used to be widely cultivated, and in other areas it is restricted to highly marginalized land and grown by only a few elderly farmers. The lack of improved planting material, limited awareness of the significance of this crop among younger farmers, the introduction of other high-yielding tuber crops such as Irish potato, recurrent drought and environmental degradation have all contributed to the genetic erosion of this crop [5, 14, 15].
Assessing the extent of genetic diversity of crop species is a vitally important step for their effective conservation and improvement. Molecular markers, such as simple sequence repeats (SSRs), are among the most important genetic tools for this purpose [16, 17]. SSRs derived from expressed sequence tags (ESTs), known as EST-SSRs, have been widely used in many plant species, including root and tuber crops, because they are relatively fast and cost-effective to develop, have well-conserved flanking sequences among phylogenetically closely related species (and hence are highly transferable among related taxa), and are less susceptible to null alleles. Such cross-genera and cross-species transferable molecular markers are highly important genomic resources for studying plant species with little or no DNA sequence information, such as P. edulis. Until recently, very few transferable EST-SSR markers had been developed within the family Lamiaceae [20, 21] and, so far, there is no report of such markers for the genus Plectranthus. Moreover, there is no publication on molecular marker-based genetic diversity of P. edulis, which shows that researchers and plant breeders have not given enough attention to the conservation and improvement of this crop. Consequently, its production, utilization, and improvement are highly restricted. Hence, this study was initiated with the aim of developing and validating EST-SSR markers for use in population structure and genetic diversity analyses of the crop, which will eventually serve as a basis for its improvement and sustainable conservation.
A total of 174 tuber samples of cultivated P. edulis, representing 12 populations, were randomly collected from farmers’ fields with permission from individual farmers (Fig. 1, Table 1). The identity of the samples was confirmed based on the species description provided in Flora of Ethiopia and Eritrea. The collected samples were planted during the regular crop growing season (end of April) in 2016 at the Holeta Agricultural Research Centre (part of the Ethiopian Institute of Agricultural Research) field site, located 40 km west of Addis Ababa. This field site is located at 09°04′N, 38°29′E and an altitude of 2400 masl.
Leaf sample collection and DNA extraction
Young leaf tissue from 287 individual plants (one to three individual plants per tuber sample) (Table 1, Additional file 1) was separately collected in ziplock bags, dried with silica gel and transported to the Swedish University of Agricultural Sciences (SLU), Alnarp, Sweden. Genomic DNA was extracted from these samples using a modified cetyltrimethylammonium bromide (CTAB) protocol as described in Geleta et al. DNA quality was assessed using 1% agarose gel electrophoresis, whereas a NanoDrop® ND-1000 Spectrophotometer (Saveen Werner, Sweden) was used to determine the quantity of extracted DNA.
Data mining, designing and screening EST-SSR primers
Initially, 3263 P. barbatus EST sequences were retrieved from the National Centre for Biotechnology Information (NCBI) public DNA sequence database (https://www.ncbi.nlm.nih.gov/nucest/?term=Plectranthus+barbatus), and sequences containing SSRs were screened using WebSat, a web-based software for microsatellite marker development (http://purl.oclc.org/NET/websat/). After excluding redundant, overlapping and short sequences, about 300 sequences containing two to six SSR motifs were identified. Of these, 40 sequences were selected for designing primers using the Primer3 primer design program [25, 26] (http://bioinfo.ut.ee/primer3-0.4.0/).
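The SSR screening step can be illustrated with a minimal Python sketch that scans a DNA sequence for perfect repeats of di- to hexanucleotide motifs. This is not the WebSat algorithm; the function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen only for illustration.

```python
import re

# Minimum repeat counts per motif length. These thresholds are illustrative
# assumptions, not the WebSat settings used in the study.
MIN_REPEATS = {2: 6, 3: 4, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter repeated
            # unit (for example "AAA" or "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the P. barbatus ESTs.
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

A real pipeline would also need to handle imperfect and compound repeats and to leave enough flanking sequence for primer design, which this sketch ignores.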
The 40 newly designed primer-pairs were tested for amplification of their target genomic regions using 36 DNA samples representing the 12 populations of P. edulis. The polymerase chain reaction (PCR) products were electrophoresed on 1.5% agarose gel containing gel-red and visualized using Saveen Werner AB UV camera equipped with SSM930CE Sony Black and white Monitor. The size of amplified products was estimated by loading GeneRuler 50 bp DNA ladder together with the samples on separate lanes. Under optimized PCR conditions, 20 of the 40 primer-pairs consistently amplified their polymorphic loci, and hence were selected for use on all samples included in the present study (Table 2).
Pre-amplification and capillary electrophoresis
For cost effectiveness and improved quality of amplified products, the sequences of the 20 primer-pairs were modified as follows: (1) an 18-bp universal M13 primer sequence (5′-TGTAAAACGACGGCCAGT-3′) was added as a common tail to the 5′-end of all forward primers following Oetting et al.; and (2) a PIG-tail sequence of 5′-GCTTCT-3′ was added to the 5′-end of the reverse primers to prevent non-templated addition of nucleotides to amplified products as described in Brownstein et al. An M13 primer 5′-end labeled with HEX™ or 6-FAM™ fluorophores was used as a third primer in each PCR amplification.
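The two primer modifications amount to simple 5′ concatenations, as the following minimal Python sketch shows. The tail sequences are those given in the text; the function name `tail_primers` and the example locus-specific primers are hypothetical.

```python
M13_TAIL = "TGTAAAACGACGGCCAGT"  # 18-bp universal M13 sequence from the text
PIG_TAIL = "GCTTCT"              # PIG-tail sequence from the text


def tail_primers(forward, reverse):
    """Add the M13 tail to the 5' end of the forward primer and the
    PIG-tail to the 5' end of the reverse primer (all sequences 5'->3')."""
    return M13_TAIL + forward.upper(), PIG_TAIL + reverse.upper()


if __name__ == "__main__":
    # Hypothetical locus-specific primers, not taken from Table 2.
    fwd, rev = tail_primers("ACCTTGGAGCTTCAACCA", "TGGTTCCGATGAGAAGGT")
    print(fwd)
    print(rev)
```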
PCR was carried out in 96-well plates with a 25 μl reaction volume [1× reaction buffer, 1.5 mM MgCl2, 0.3 mM dNTPs, 0.08 μM M13-tailed forward primer, 0.3 μM PIG-tailed reverse primer, 0.3 μM 6-FAM or HEX labeled M13 primer, 1 U Dream Taq DNA Polymerase, and 25 ng template DNA]. A mixture of all the components except genomic DNA was included as a negative control. Amplification was performed using a GeneAMP PCR 9700 thermocycler (Applied Biosystems Inc. USA) according to the following five-stage PCR protocol: (1) initial 15 min denaturation at 95 °C, (2) 35 cycles of 30 s denaturation at 94 °C, 30 s annealing at the optimized annealing temperature for each primer-pair (see Table 1), and 30 s primer extension at 72 °C, (3) eight cycles of 30 s denaturation at 94 °C, 45 s annealing at 53 °C and 45 s primer extension at 72 °C, (4) additional primer extension at 72 °C for 10 min, and (5) a final 30 min primer extension at 60 °C. The PCR products were stored at 4 °C until they were electrophoresed. The PCR products were multiplexed into panels based on the fragment sizes of the SSRs and the fluorescence label of the M13 primer, and diluted 25× using Millipore water. Finally, 0.7 μl of multiplexed and diluted PCR products, 1.9 μl Hi-Di formamide and 0.3 μl size standard (GeneScan™ 600 LIZ® size standard) were mixed, and capillary electrophoresis was conducted using a Genetic Analyzer 3500 (Applied Biosystems) at SLU, Department of Plant Breeding, Alnarp, Sweden.
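As a quick arithmetic check of the five-stage programme, the sketch below encodes the stages as data and sums the programmed hold times. The stage labels and function names are illustrative assumptions, and the total excludes temperature ramping, so the actual run time on a thermocycler would be longer.

```python
# Stages of the PCR programme described above:
# (label, number of cycles, hold times in seconds within one cycle).
# The labels are illustrative names only.
PCR_STAGES = [
    ("stage 1: initial denaturation at 95 C", 1, [15 * 60]),
    ("stage 2: 35 cycles", 35, [30, 30, 30]),
    ("stage 3: 8 cycles", 8, [30, 45, 45]),
    ("stage 4: extension at 72 C", 1, [10 * 60]),
    ("stage 5: final extension at 60 C", 1, [30 * 60]),
]


def total_hold_seconds(stages):
    """Sum of all programmed hold times, ignoring ramp times."""
    return sum(cycles * sum(holds) for _, cycles, holds in stages)


def loading_mix_volume_ul(product=0.7, formamide=1.9, size_standard=0.3):
    """Volume per capillary electrophoresis sample, from the volumes in the text."""
    return product + formamide + size_standard


if __name__ == "__main__":
    print(total_hold_seconds(PCR_STAGES) / 60)  # 123.5 minutes of hold time
    print(round(loading_mix_volume_ul(), 1))    # 2.9 microlitres
```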
Allele scoring and statistical analysis
Peak identification and fragment sizing were done using GeneMarker V2.6.0 (SoftGenetics®) with default settings and a threshold intensity of 200. The allele size (bp) data at each locus were then exported to Excel for statistical analyses. Locus-based diversity indices, namely major allele frequency (MAF), number of alleles (NA), gene diversity (GD), and polymorphic information content (PIC), were determined using PowerMarker ver. 3.25 (Table 3). The number of effective alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He), Shannon’s Information Index (I), and the deviation from Hardy-Weinberg equilibrium (HWE) over all populations, as well as the population genetic diversity indices (Ne, percentage of polymorphic loci (PPL), Ho, He, I, and Nei’s gene diversity over all loci), were computed using GenAlEx ver. 6.501 (Table 4). To determine the correlation between observed allelic diversity and sample size of populations, rarefied allelic richness (Ar) and private rarefied allelic richness (Ap) were estimated using HP-Rare 1.1 software. To identify genetic groups, the number of distinct multi-locus genotypes (MLGs), G, present in each sample was evaluated using GenClone 2.0. To evaluate the global clonality rate of the sample, the index of clonal diversity was computed as G/N, where G is the number of MLGs and N is the total number of genotyped individuals. To analyze the distribution of genetic variation and to estimate the variance components of the populations, analysis of molecular variance (AMOVA) was computed using Arlequin ver. 3.5.2.2. Population differentiation tests, namely Wright’s fixation index (FST) and pairwise FST, were computed using GenAlEx ver. 6.501, and significance was tested based on 1000 bootstraps.
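The locus-level indices named above have standard textbook definitions, sketched below in Python for a single locus given its allele frequencies. This is a minimal illustration, not the PowerMarker or GenAlEx implementation: those programs may apply sample-size corrections that are omitted here, and the function name `locus_diversity` is an assumption.

```python
import math


def locus_diversity(freqs):
    """Diversity indices for one locus from allele frequencies summing to 1.

    Uncorrected textbook formulas; software packages may apply
    sample-size corrections that are not included here.
    """
    n = len(freqs)
    sum_sq = sum(p * p for p in freqs)
    gene_diversity = 1.0 - sum_sq            # expected heterozygosity
    effective_alleles = 1.0 / sum_sq         # Ne
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)  # I, natural log
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(n) for j in range(i + 1, n))
    return {
        "MAF": max(freqs),
        "NA": n,
        "GD": gene_diversity,
        "Ne": effective_alleles,
        "I": shannon,
        "PIC": gene_diversity - cross,
    }


if __name__ == "__main__":
    # Hypothetical allele frequencies, not data from the study.
    print(locus_diversity([0.6, 0.3, 0.1]))
```

For two equally frequent alleles the function gives GD = 0.5, Ne = 2 and PIC = 0.375, which is a convenient sanity check.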
A simple matching dissimilarity coefficient-based Neighbor-Joining tree and a Nei’s standard genetic distance (DST, corrected) based Unweighted Pair Group Method with Arithmetic Mean (UPGMA) tree were constructed using DARwin ver. 6.0.14 and POPTREE2, respectively, and significance was tested based on 1000 bootstraps. The resulting trees were displayed using the TreeView (Win 32) 1.6.6 program and FigTree ver. 1.4.3. Gene flow (Nm) among populations was estimated using the formula Nm = 0.25(1 − FST)/FST.
A Bayesian model-based clustering algorithm in STRUCTURE ver. 2.3.4 [43, 44] was applied to infer the pattern of population structure and to detect admixture. To determine the most likely number of populations (K), a burn-in period of 50,000 was used in each run, and data were collected over 500,000 Markov Chain Monte Carlo (MCMC) replications for K = 1 to K = 12 using 20 iterations for each K. The optimum K value was predicted following the simulation method of Evanno et al. using the web-based STRUCTURE HARVESTER ver. 0.6.92. The bar plot for the optimum K was generated using Clumpak beta version.
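The gene flow formula can be written as a one-line Python function; the name `gene_flow` is an assumption. Note that Nm values averaged over loci or over population pairs, as software typically reports them, will generally differ from the value obtained by plugging a single overall FST into the formula.

```python
def gene_flow(fst):
    """Estimate Nm from FST using Nm = 0.25 * (1 - FST) / FST."""
    if not 0 < fst < 1:
        raise ValueError("FST must lie strictly between 0 and 1")
    return 0.25 * (1 - fst) / fst


if __name__ == "__main__":
    # Nm = 1 corresponds to FST = 0.2 under this formula.
    print(gene_flow(0.2))
```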
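The ΔK statistic of Evanno et al. can be sketched as follows, assuming the log-likelihood Ln P(D) of each replicate run has already been extracted from the STRUCTURE output. The sketch uses the mean Ln P(D) per K for the second-order difference and the sample standard deviation across runs; both choices, the function name `delta_k` and the input layout are assumptions, and STRUCTURE HARVESTER may differ in detail.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of Ln P(D) values from replicate runs.
    Delta K = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), using mean L per K.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


if __name__ == "__main__":
    # Hypothetical Ln P(D) values from two replicate runs per K.
    runs = {
        1: [-1000.0, -1002.0],
        2: [-900.0, -904.0],
        3: [-800.0, -802.0],
        4: [-790.0, -810.0],
    }
    scores = delta_k(runs)
    print(max(scores, key=scores.get))  # K with the largest Delta K
```

Because ΔK needs both neighbours, it is undefined for the smallest and largest K tested, so K = 1 can never be selected by this criterion.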
Validation of the EST-SSRs marker and evaluation of their levels of polymorphism
The 20 newly developed EST-SSR markers (Table 2) are predominantly trinucleotide and dinucleotide repeats. Trinucleotide SSRs accounted for 55% of the loci, with the number of repeats ranging from four to ten, whereas dinucleotide SSRs accounted for 35% of the loci, with the number of repeats ranging from six to ten. The remaining two loci (10%) were pentanucleotide and hexanucleotide repeats with four and five repeats, respectively. All 20 loci were polymorphic and produced a total of 128 alleles (an average of 6.4 alleles per locus) (Table 3), of which 62 (48.4%) were rare (frequency < 0.01) and 22 (17.2%) were scarce (frequency between 0.01 and 0.05). The frequency of seven alleles (5.5%) was between 0.05 and 0.1, whereas 37 alleles (28.9%) had a frequency of 0.1 or higher (Additional file 2).
The maximum number of alleles detected per locus was 12 (PE_16; Table 3). The lowest major allele frequency (MAF; 0.46) and the largest effective number of alleles (Ne) (3.17), allelic richness (4.97), Nei’s gene diversity (GD) (0.70), polymorphic information content (PIC) (0.66) and Shannon information index (I) (1.33) were recorded for PE_06. The highest MAF (0.97) and private allelic richness (0.46), and the lowest Ne (1.06), I (0.12), GD (0.05) and PIC (0.05), were recorded for PE_07. In terms of the overall PIC, one SSR locus (PE_06) was found to be highly informative (PIC ≥ 0.5), 12 loci (60%) were moderately informative (0.25 ≤ PIC < 0.5), and the remaining seven were less informative (PIC < 0.25) (Table 3). The highest observed heterozygosity (Ho) (0.97), the lowest fixation index (−0.75) and the highest value for gene flow (Nm) (35.1) were recorded for PE_02 (Table 3). Ten of the twenty loci showed a highly significant deviation from Hardy-Weinberg equilibrium over all populations (Table 3).
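The allele frequency classes used above can be expressed as a small Python classifier, followed by a check that the reported counts add up to 128 and reproduce the reported percentages. The class labels "intermediate" and "common" and the handling of values exactly on a boundary are assumptions, since the text names only the rare and scarce classes.

```python
def classify_allele(freq):
    """Assign an allele to a frequency class.

    'rare' and 'scarce' follow the text; 'intermediate' and 'common' are
    labels introduced here, and boundary values go to the higher class.
    """
    if freq < 0.01:
        return "rare"
    if freq < 0.05:
        return "scarce"
    if freq < 0.1:
        return "intermediate"
    return "common"


# Allele counts per class as reported in the text.
REPORTED_COUNTS = {"rare": 62, "scarce": 22, "intermediate": 7, "common": 37}


def class_percentages(counts):
    """Percentage of alleles in each class, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print(sum(REPORTED_COUNTS.values()))
    print(class_percentages(REPORTED_COUNTS))
```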
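The PIC classes and the fixation index mentioned above can be sketched in Python as follows. The fixation index is taken here as F = 1 − Ho/He, the usual definition, so a negative F indicates an excess of heterozygotes; the function names are assumptions, and the heterozygosity values in the example are hypothetical.

```python
def pic_class(pic):
    """Informativeness class of a locus based on its PIC value."""
    if pic >= 0.5:
        return "highly informative"
    if pic >= 0.25:
        return "moderately informative"
    return "less informative"


def fixation_index(ho, he):
    """F = 1 - Ho/He; negative values indicate heterozygote excess."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


if __name__ == "__main__":
    print(pic_class(0.66))   # PIC reported for PE_06
    print(pic_class(0.05))   # PIC reported for PE_07
    # Hypothetical Ho and He values, not taken from Table 3.
    print(round(fixation_index(0.6, 0.4), 2))
```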
Genetic variation within and among populations
Among the 12 populations studied, few differences were observed in genetic diversity parameters such as effective number of alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He), gene diversity (GD) and Shannon diversity index (I). However, the Wen, WS and Aw populations scored higher values for Ne, GD and I, while the SwSh and HKT populations scored a higher Ho than the other populations. Five populations (Aw, Gur, EW, WSh, and YeL, in order of magnitude) scored slightly less than the mean Ho, whereas the IAB, Jim, and GG populations had a Ho equal to the mean. Although the Gur population has the lowest Ho, it ranks first in terms of allelic richness (Ar), including richness in private alleles (Arp), with an Ar of 3.33 and an Arp of 0.28 (Table 4). The Jim and WSh populations ranked second and third in terms of overall Arp. The mean number of distinct multi-locus genotypes (MLGs) and the index of clonality (G/N) over all samples (populations) were 13.08 and 0.55, respectively (Table 4). The analysis of the percentage of polymorphic loci (PPL) showed that at least 90% of the loci were polymorphic in each population studied, with a mean PPL of 94.2% (Table 4).
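The G/N index can be illustrated with a minimal Python sketch that counts distinct multi-locus genotypes in a sample. It treats two individuals as the same MLG only if they match exactly at every locus; GenClone may additionally account for scoring errors or missing data, which is not modeled here, and the function name and example genotypes are hypothetical.

```python
def clonal_diversity(genotypes):
    """Return (G, G/N) for a list of multi-locus genotypes.

    Each genotype is a sequence of (allele1, allele2) pairs, one per locus.
    The order of the two alleles within a locus is ignored.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    distinct = {
        tuple(tuple(sorted(locus)) for locus in individual)
        for individual in genotypes
    }
    g = len(distinct)
    return g, g / len(genotypes)


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) at two loci for four plants.
    sample = [
        [(150, 156), (201, 201)],
        [(156, 150), (201, 201)],  # same MLG as the first plant
        [(150, 150), (201, 204)],
        [(153, 156), (204, 204)],
    ]
    print(clonal_diversity(sample))  # (3, 0.75)
```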
Population genetic differentiation and gene flow
The hierarchical AMOVA was conducted without grouping the populations as well as by grouping the populations according to administrative regions (AR) and geographical regions (GR). In all cases, variation within individuals accounted for at least 97% of the total variation. The variation among populations, AR and GR accounted for only 3, 2 and 2% of the total variation, respectively. Hence, most of the within-population variation is due to the heterozygosity of the individuals within each population. The overall FST value was very small (0.03) (Table 5, Additional file 3). The overall gene flow among the populations was estimated to be high (Nm = 5.84) (Table 3), whereas the pairwise population differentiation computed according to Weir, with exclusion of null alleles, ranged from 0.01 to 0.07.
Genetic distance between the populations
Nei’s standard genetic distance between populations ranged from 0.02 to 0.05. The highest pairwise genetic distance (0.05) was observed for four pairs of populations (Aw vs WS, and Wen vs HKT, WS and YeL). The mean genetic distance of each population from the other populations ranged from 0.02 to 0.04 and, in terms of this parameter, Wen is the most distantly related population with a mean genetic distance of 0.04 (Table 6). Similarly, Weir’s estimate of population differentiation (FST) [30, 48] ranged from 0.01 to 0.07, with the highest value recorded between the Wen and YeL populations.
Cluster analysis, PCoA, and population genetic structure
The neighbor-joining cluster analysis of 60 individual samples randomly selected across the 12 populations (five samples per population) resulted in four major clusters (C1, C2, C3, and C4), with the first three clusters further divided into two sub-clusters. Each of the four clusters comprised individual plants from different collection zones (geographic regions). However, samples were more or less grouped according to their geographic region of origin at the sub-cluster level (designated as i and ii on each major cluster), although there was considerable intermixing (Fig. 2). UPGMA cluster analysis with bootstrap tests was conducted using Nei’s standard genetic distance at the population level. The 12 populations formed four major clusters (I, II, III and IV), with cluster IV comprising eight populations that formed two sub-clusters (i and ii) (Fig. 3).
A PCoA analysis revealed that the majority of samples were placed at the center of a two-dimensional coordinate plane (Fig. 4) forming roughly three groups (C1, C2 and C3), showing poor population structure. The first three axes explained 32% of the total variation.
The Bayesian assignment of the 287 individual plants to different populations and the determination of their population structure, using STRUCTURE outputs, predicted K = 3 to be the most likely number of clusters (Fig. 5a). Based on this value, the Clumpak bar plot showed wide admixture, and hence there was no clear geographic origin-based structuring of populations (Fig. 5b).
EST-SSR markers validation and polymorphism evaluation
P. edulis is an indigenous tuber crop of Ethiopia that plays an indispensable role in the food security of subsistence farmers in areas where it is cultivated. As the first step in the development of genomic tools and resources that can promote the conservation and breeding of this crop, we developed and validated 20 polymorphic EST-SSR markers. This work reports the transferability of SSR markers from P. barbatus to P. edulis and hence adds to the limited reports on cross-genera and cross-species transferability of molecular markers, and on molecular marker-based genetic diversity analyses of Lamiaceae species. In the present study, the screening of 3263 P. barbatus ESTs resulted in 301 sequences (9.2%) containing SSRs. This proportion suggests that EST-SSRs are less abundant in P. barbatus than in Salvia miltiorrhiza (14.7%) and Lavandula species (18.8%), and slightly more abundant in P. barbatus than in Mentha piperita (8.4%), which are all Lamiaceae species. The results also suggest that EST-SSRs are more abundant in P. barbatus than in cereal crops such as rice (4.7%), sorghum (3.6%), barley (3.4%) and maize (1.4%). However, other factors such as the SSR search criteria and the size of the dataset might have partly contributed to the differences.
A maximum of two alleles is expected per individual plant at single-copy microsatellite loci in diploid species. In the present study, a maximum of two alleles per plant was detected at each of the 20 loci, indicating that they are all single copy and have disomic inheritance, regardless of the basic chromosome numbers and ploidy levels reported so far for several members of the genus Plectranthus. However, the chromosome number of P. edulis has not been reported and, hence, further in-depth cytogenetic analysis is needed to confirm the finding of this study.
The EST-SSR markers developed in the present study contained di-, tri-, penta- and hexanucleotide repeats. Studies have shown that di-, tri- and tetranucleotide repeat SSRs are the most commonly used motifs in molecular genetic studies. Trinucleotide repeat motifs were relatively abundant. This is attributed to the fact that they are more frequent in the coding regions of ESTs than in non-coding regions in almost all taxa studied [54, 57,58,59], because of positive selection for specific amino-acid stretches or prevalent selection against frameshift mutations caused in these regions by dinucleotide and other non-triplet repeat motifs. As the length and total size of a perfect microsatellite array increase, the frequency of repeats decreases and hence the informativeness increases [59, 62], owing to the higher mutation rates in longer microsatellites. In agreement with this, the average number of repeats is higher in dinucleotide SSRs (8.7) than in trinucleotide SSRs (5.7) among the SSRs developed in the present study. Similarly, the informativeness in terms of number of alleles, GD and PIC was higher for dinucleotide repeats than for trinucleotide repeats, suggesting a higher rate of evolution for SSRs with shorter repeat motifs than for SSRs with longer repeat motifs.
The use of molecular markers for efficient selection of genotypes with desirable traits, and for enhancing the efficiency of breeding by allowing effective simultaneous selection of various desirable traits, is a well-established approach [64, 65]. Hence, the large number of alleles detected in the present study suggests the suitability of microsatellites in general, and those developed in this study in particular, for genetic linkage and QTL mapping of desirable traits followed by marker-assisted selection (MAS) in breeding programmes. However, most of the alleles were rare or scarce, suggesting minimal selection pressure against these alleles; otherwise, clonally propagating crops would be expected to bear a smaller proportion of such alleles than seed-propagating crops. Moreover, the higher number of private alleles observed at several loci (for example, PE_07, PE_16, PE_18 and PE_08) could offer a good opportunity to evaluate P. edulis genetic materials for the association of particular alleles with traits of interest and for conservation. Such alleles are useful in comparing diversity between species or populations and also for measuring the genetic distinctiveness of individuals in a population.
The average percent polymorphism per population revealed in the present study across the 20 loci (94%) is far greater than the level reported by Kumar et al. for 13 accessions of M. piperita (61%), which belongs to the same family as P. edulis. However, it was similar to that reported for 28 alfalfa accessions (97%) and 37 opium poppy accessions (96%). Such high percent polymorphism, together with the PIC values obtained (which provide an estimate of the discriminatory power of a locus) and the allelic diversity, suggests great potential of the markers for use in future genetic studies. However, the informativeness of a considerable number of loci is low, and hence there is a need to develop more highly informative EST-SSRs or other types of DNA markers that are suitable for characterizing P. edulis genetic resources for efficient conservation and breeding.
The loci studied displayed differences between Ho and He in which half of them showed excess heterozygosity that led to a significant departure from HWE across populations. Such excess heterozygosity is expected in historically outcrossing species that maintain their heterozygosity through vegetative propagation, or if other factors such as natural and artificial selection pressure favor heterozygosity. Similar results have been reported in sweet potato [71, 72].
Population genetic diversity
Higher genetic diversity is expected in larger and older populations than in small and newly established ones because of higher levels of accumulated and maintained genetic variation, which is important in increasing fitness and therefore reduces the likelihood of local extinction. However, the mean observed heterozygosity (0.39), Shannon’s information index (0.61) and Nei’s gene diversity (0.35) obtained in the present study showed a medium level of genetic variation within populations. This could be mainly due to a relatively narrow genetic base of the populations that resulted from the limited germplasm resources accessible to farmers, or due to a reduction in population size caused by natural as well as human factors, such as the replacement of P. edulis by other crops. In addition, farmers’ preferences for selected traits of economic importance, such as tuber size, tuber skin color, tuber texture and maturity, and the asexual mode of reproduction of the crop (clonal propagation), which was evidenced by the considerably high global clonality index (G/N), could have contributed to limited genetic variation in P. edulis populations. There are similar reports on potato cultivars from Yunnan province, China.
The Wen, WS, and Aw populations are genetically more diverse than the other populations as estimated by parameters such as gene diversity, heterozygosity and Shannon diversity index, and hence the areas representing these populations could be considered genetic diversity hot spots and potential in-situ conservation sites for P. edulis. Among the populations, EW has the least genetic diversity, which might suggest current rapid genetic erosion in the area (population bottleneck) or intensive artificial selection pressure to maximize tuber yield. In terms of allelic richness, the Gur, Jim, WS and WSh populations are the top four, in that order, and hence are of particular interest for genetic and evolutionary studies on this crop, because allelic richness is more informative in this regard: it is more sensitive than other parameters, such as expected heterozygosity, to the presence of rare alleles (which is prominent in this study) and to population bottlenecks. Moreover, these populations, except WS, bear a relatively high proportion of private alleles, which may indicate a certain level of independent evolution of their gene pools that allowed maintenance of private alleles at the population level.
Population genetic differentiation
AMOVA revealed that P. edulis has very low genetic differentiation among populations, which accounted for only 3% of the total genetic variation. The result is in line with previous reports on clonally propagating crops, such as Ensete (Ensete ventricosum) [78, 79], as such species tend to be more diverse within populations (but largely less so than sexually reproducing species) than among populations. Likewise, FST averaged across all loci (FST = 0.03) and pairwise FST for all pairs of populations (highest value = 0.07) were generally low to moderate according to the criteria suggested by Wright and by Hartl and Clark. Wright indicated that genetic differentiation among populations can be considered high if the value of FST is greater than 0.25. This could be partly attained if gene flow, which is a powerful force that decreases differentiation among populations, is low (Nm < 1). Hence, the present study showed that P. edulis has very little population sub-structuring. The low population differentiation is supported by high gene flow (mean Nm = 18.29) owing to step-wise pollen movement across populations and germplasm exchange in the form of tubers and seeds through common markets shared by several of the adjacent areas where different populations were collected. This study also showed the minimal effects of regions or geographic origins of populations on genetic variation in P. edulis. This could be partly explained by the extensive exchange of tubers as planting material among farmers (gene flow), the common origin of the populations, and the clonally propagating nature of the crop, in which only a limited number of individuals contribute tubers to the next generation, which gradually leads to recent or old population bottlenecks and hence facilitates genetic drift.
A pair-wise population genetic differentiation analysis resulted in a seven-fold variation in FST value among pairs of populations (ranging from 0.01 to 0.07). The highest population differentiation was observed between Wen and HKT, Wen and WS, Wen and YeL as well as between Aw and WS populations. Wen population showed the highest (0.05) pairwise Nei’s standard genetic distance with WS and HKT populations and is the most genetically distinct population with a mean Nei’s standard genetic distance 0.04 (Table 6). This can be partly explained by the fact that Wen and Aw populations were collected from a relatively pocket location and are separated from the other populations with a relatively longer geographic distance that probably restricted recent seed and tuber exchange. Hence, these populations may serve as potential sources of new genetic variation of important traits that can be used in breeding programs.
Population genetic relationship and structure analysis
Neighbour joining cluster analysis in which each population is represented by five individual plants revealed a weak clustering pattern confirming low genetic differentiation among the populations and suggesting that the genetic background of P. edulis populations does not always correlate with their geographical origin. Although UPGMA and PCoA analyses also showed a certain level of populations clustering according to their geographical regions, the clustering pattern is weak to support the concept of “isolation by distance” . Similarly, structure analysis revealed a close relationship (weak sub-division) among the samples from the 12 collection zones, and in general, three inferred groups (K = 3) with potential admixtures and have been observed. It is interesting to indicate that all individual plants analyzed have alleles originated from the three clusters, which supports the presence of a strong gene flow that led to poor population differentiation.
In this study, we developed 20 EST-SSR markers for P. edulis based on EST sequences of P. barbatus deposited in GenBank. All the markers were polymorphic in the populations studied and are valuable genetic tools for evaluating the extent of genetic diversity and population structure of not only P. edulis but also various other species in the family Lamiaceae. These markers detected a large number of alleles, some of which were private alleles that may be linked to important agronomic traits. The study also showed the potential of EST-SSR markers in defining how P. edulis genetic diversity is structured, and hence in contributing to the development of better in-situ and ex-situ management strategies as well as selection criteria for the germplasm to be used in breeding programs for the improvement of various desirable traits in this crop. Among the 12 administrative zones and woredas considered, Wenbera, Awi and Wolaita have populations with relatively high genetic diversity, and hence can be considered as hot spots for in-situ conservation of P. edulis as well as sources of desirable alleles for breeding. Further studies that include germplasm from the remaining administrative zones and combine molecular characterization with agro-morphological analysis would be important to reveal additional potential sites for conservation and to develop best-performing varieties. Overall, this study provides baseline information that promotes further studies to exploit the high economic and endogenous values of P. edulis and to stop and reverse its current rapid genetic erosion.
Abbreviations
AMOVA: Analysis of molecular variance
CTAB: Cetyltrimethyl ammonium bromide
EST: Expressed sequence tags
NCBI: National Centre for Biotechnology Information
PCoA: Principal coordinates analysis
SNNPR: Southern Nations, Nationalities, and Peoples’ Region
SSR: Simple sequence repeats
UPGMA: Unweighted pair group method with arithmetic mean
Vavilov NI. The origin, variation, immunity, and breeding of cultivated plants. Cambridge: Cambridge University Press; 1951. p. 1–387.
Harlan J. Ethiopia: a centre of diversity. Econ Bot. 1996;23:124–32.
Westphal E. Agricultural systems in Ethiopia. Wageningen: Centre for Agricultural Publishing and Documentation; 1975. p. 1–299.
Zohary D. Centers of diversity and centers of origin. In: Frankel OH, Bennett E, editors. Genetic resources in plants: their exploration and conservation. Oxford: Blackwell; 1970. p. 33–42.
Mekbib Y, Deressa T. Exploration and collection of root and tuber crops in east Wollega and Ilu Ababora zones: rescuing declining genetic resources. Indian J Trad Know. 2016;15:86–92.
Asfaw Z. Survey of indigenous food plants, their preparations, and home gardens in Ethiopia. Indigenous African food crops and useful plants. Bede N. Okigbo ed. NU/INRA. Assessments Series No.: B6. ICIPE Science Press: Nairobi; 1997. p. 1–16.
Nebiyu A, Garedew W, Tofu A, Abebe W, Kifle A, and Etissa E. Crop management for other root and tuber crops (taro, cassava, and yam). In: Root and tuber crops: the untapped resources (Gebremedihin Woldegiorgis, Endale Gebre and Berga Lemaga, eds.). EIAR: Addis Ababa Ethiopia; 2008. p. 317–322.
Hedge IC. A global survey of biogeography of the Lamiaceae. In: Harley RM, Reynolds J, editors. Advances in labiate science, royal bot. Garden: Kew; 1992. p. 7–8.
Codd LE. Plectranthus (Labiatae) and allied genera in southern Africa. Bothalia. 1975;11:371–442.
Edward S. Crops with wild relatives found in Ethiopia. In: Engels, JMM, Hawkes, JG, Melaku W. (ed): Plant genetic resource of Ethiopia. Cambridge: Cambridge University Press; 1991. p. 42–74.
Asfaw Z, Tadesse M. Prospects for sustainable use and development of wild food plants in Ethiopia. Econ Bot. 2001;55:47–62.
Hora A. Anchote -an endemic tuber crop. Oromiya: Jimma College of Agriculture; 1995. p. 1–47.
Reinhard F, Adi A. Honeybee flora of Ethiopia. Ethiopia: The National Herbarium, A.A.; 1994. p. 1–510.
Demissie A. Potentially valuable crop plants in a Vavilovian center of diversity: Ethiopia. In: Attere F, Zedan H, Ng NQ, Perrino P, editors. Proceedings of the international conference on crop genetic resources of Africa, Nairobi, Kenya. Rome: International Board for Plant Genetic Resources; 1988.
Nebiyu A, and Awas T. Exploration and collection of root and tuber crops in Southwestern Ethiopia: Its implication for conservation research. Proceeding of the 11th Conference of the Crop Sciences Society of Ethiopia, Addis Ababa, Ethiopia; 2004.
Singh RP, Huerta-Espino J, William HM. Genetics and breeding for durable resistance to leaf and stripe rusts in wheat. Turkish J Agri Fores. 2005;29:121–7.
Leal AA, Mangolin CA, Do-Amaraljunior AT, Goncalves LSA, Scapim CA, Mott AS, Eloi IBO, Cordoves V, MFP DS. Efficiency of RAPD versus SSR markers for determining genetic diversity among popcorn lines. Genet Mol Res. 2010;9:9–18.
Ellis JS, Knight ME, Darvill B, Goulson D. Extremely low effective population sizes, genetic structuring, and reduced genetic diversity in a threatened bumblebee species, Bombus sylvarum (Hymenoptera: Apidae). Mol Ecol. 2006;15:4375–86.
Uchiyama K, Iwata H, Moriguchi Y, Ujino-Ihara T, Ueno S, Taguchi Y, et al. Demonstration of genome-wide association studies for identifying markers for wood property and male strobili traits in Cryptomeria japonica. PLoS One. 2013;8:798–812.
Mehme K, Ayse GI, Adnan A, Saadet TA. Cross-genera transferable e-microsatellite markers for 12 genera of the Lamiaceae family. J Sci Food Agric. 2012;93:1869–79.
Guojie X, Chunsheng L, Luqi H, Xueyong W, Yuanyuan Z, Siqi L, Caili L, Qingjun Y, Xiaoli Z. Development of new EST-derived SSRs in Salvia miltiorrhiza (Labiatae) in China and preliminary analysis of genetic diversity and population structure. Biochem Syst Ecol. 2013;51:308–13.
Ryding O, Iwasson M, Morton JK, Persson E, Sebald O, Seybold S, Demissew S. Lamiaceae. In: Hedberg I, Kelbessa E, Edwards S, Demissew S, editors. Flora of Ethiopia and Eritrea, vol. 5. Sweden: The national herbarium, Addis Ababa University, Ethiopia and Department of Systematic Botany, Uppsala University; 2006. p. 592–5.
Geleta M, Herrera I, Monz A, Bryngelsson T. Genetic diversity of arabica coffee (Coffea arabica L) in Nicaragua as estimated by simple sequence repeat markers. Sci World J. 2012;10:1–11.
Martins WS, César D, Lucas S, de Souza Neves KF, Bertioli DJ. WebSat- a web software for microsatellite marker development. Bioinformation. 2009;3:282–3.
Koressaar T, Remm M. Enhancements and modifications of primer design program, Primer3. Bioinformatics. 2007;23:1289–91.
Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG. Primer3 new capabilities and interfaces. Nuc Acids Res. 2012;40(15):e115.
Oetting WS, Lee HK, Flanders DJ, Wiesner GL, et al. Linkage analysis with multiplexed short tandem repeat polymorphisms using infrared fluorescence and M13 tailed primers. Genomics. 1995;30:450–8.
Brownstein MJ, Carpten JD, Smith JR. Modulation of non-templated nucleotide addition by Taq DNA polymerase: primer modifications that facilitate genotyping. BioTechniques. 1996;20:1004–6.
Liu KJ, Muse SV. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics. 2005;21:2128–9.
Weir BS. Genetic Data Analysis II. Sunderland: Sinauer; 1996. p. 1–445.
Peakall PR, Smouse PP. GenAlEx 6.5: genetic analysis in excel. Population genetics software for teaching and research– an update. Bioinformatics. 2012;28:2537–9.
Kalinowski ST. HP-RARE 1.0: a computer program for performing rarefaction on measures of allelic richness. Mol Ecol Notes. 2005;5:187–9.
Arnaud-Haond S, Belkhir K. GENECLONE 2.0: a computer program to analyse genotypic data, test for clonality and describe spatial clonal organization. Mol Ecol Notes. 2007;7:15–7.
Excoffier L, Lischer HEL. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and windows. Mol Ecol Resour. 2010;10:564–7.
Nei M. Genetic distance between populations. Am Nat. 1972;106:283–92.
Sneath PHA, Sokal RR. Numerical taxonomy. San Francisco: WH Freeman and Company; 1973. p. 1–573.
Perrier X, Jacquemoud-Collet JP. DARwin software. 2006. http://darwin.cirad.fr/. Accessed 6 Feb 2018.
Takezaki N, Nei M, Tamura K. POPTREE2: software for constructing population trees from allele frequency data and computing some other population statistics with windows interface. Mol Biol Evol. 2010;27:747–52.
Felsenstein J. Phylogenies and the comparative methods. Am Nat. 1985;125:1–15.
Page RDM. TREEVIEW: an application to display phylogenetic trees on personal computers. Comp App Bioscie. 1996;12:357–8.
Andrew R. FigTree: Tree figure drawing tool, Version 1.4.3. Institute of Evolutionary Biology. United Kingdom: University of Edinburgh; 2016. http://tree.bio.ed.ac.uk/software/figtree/.
Slatkin M, Barton NH. A comparison of three indirect methods for estimating average levels of gene flow. Evolution. 1989;43:1349–68.
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multi-locus genotype data. Genetics. 2000;155:945–59.
Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164:1567–87.
Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14:2611–20.
Earl DA, Von Holdt BM. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Cons Genet Resour. 2012;4:359–61.
Kopelman NM, Mayzel J, Jakobsson M, Rosenberg NA, Mayrose I. CLUMPAK: a program for identifying clustering modes and packaging population structure inferences across K. Mol Ecol Res. 2015;15:1179–91.
Chapuis MP, Estoup A. Microsatellite null alleles and estimation of population differentiation. Mol Biol Evol. 2007;24:621–31.
Karaca M, Ince AG, Aydina A, TAyb S. Cross-genera transferable e-microsatellite markers for 12 genera of the Lamiaceae family. J Sci Food Agric. 2013;93:1869–79.
Adal MA, Demissie AZ, Mahmoud SS. Identification, validation and cross-species transferability of novel Lavandula EST-SSRs. Planta. 2015;241:987–1004.
Xu G, Liu C, Huang L, Wang X, Zhang Y, Liu S, Liao C, Yuan Q, Zhou X. Development of new EST derived SSRs in Salvia miltiorrhiza (labiate) in China and preliminary analysis of genetic diversity and population structure. Bioch Syst Ecol. 2013;51:308–13.
Kumar B, Kumar U, Yadav HK. Identification of EST–SSRs and molecular diversity analysis in Mentha piperita. Crop J. 2015;3:335–42.
Kantety RV, La Rota M, Matthews DE, Sorrells ME. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum, and wheat. Plant Mol Biol. 2002;48:501–10.
Varshney RK, Graner A, Sorrells ME. Genomics-assisted breeding for crop improvement. Trends Plant Sci. 2005;10:621–30.
de Wet JMJ. Chromosome numbers in Plectranthus and related genera. S Afri J Sci. 1958;34:153–6.
Selkoe KA, Toonen RJ. Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecol Lett. 2006;9:615–29.
Toth G, Gaspari Z, Jurka J. Microsatellites in different eukaryotic genomes survey and analysis. Genome Res. 2000;10:967–81.
Eujayl I, Sledge MK, Wang L, May GD, Chekhovskiy K, Zwonitzer JC, Mian MA. Medicago truncatula EST-SSR reveal cross-species genetic markers for Medicago spp. Theor Appl Genet. 2004;108:414–22.
Wang Z, Li J, Luo Z, Huang L, Chen X, Fang B, et al. Characterization and development of EST-derived SSR markers in cultivated sweet potato (Ipomoea batatas). BMC Plant Biol. 2011;11:139.
Metzgar D, Bytof J, Wills C. Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res. 2000;10:72–80.
Li YC, Korol AB, Fahima T, Beiles A, Nevo E. Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol Ecol. 2002;11:2453–65.
Katti MV, Ranjekar PK, Gupta VS. Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol. 2001;18:1161–7.
McConnell R, Middlemist S, Scala C, Strassmann JE, Queller DC. An unusually low microsatellite mutation rate in Dictyostelium discoideum, an organism with unusually abundant microsatellites. Genetics. 2007;177:1499–507.
Dudley JW. Molecular markers in plant improvement: manipulation of genes affecting quantitative traits. Crop Sci. 1993;33:660–8.
Edwards MD, Helentjaris T, Wright S, Stuber CW. Molecular-marker-facilitated investigations of quantitative trait loci in maize. Theor Appl Genet. 1992;83:765–74.
Mahmodi F, Kadir J, Puteh A. Genetic diversity and pathogenic variability of Colletotrichum truncatum causing anthracnose of pepper in Malaysia. J Phytopathol. 2014;162:456–65.
Kalinowski ST. Counting alleles with rarefaction: private alleles and hierarchical sampling designs. Cons Genet. 2004;5:539–43.
Wang Z, Yan H, Fu X, Li X, Gao H. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.). Mol Biol Rep. 2013;40:3291–8.
Selale H, Celik I, Gultekin V, Allmer J, Doganlar S, Frary A. Development of EST-SSR markers for diversity and breeding studies in opium poppy (Papaver somniferum L.). Plant Breed. 2013;132:344–51.
Marulanda ML, López AM, Isaza L, López P. Microsatellite isolation and characterization for Colletotrichum spp, causal agent of anthracnose in Andean blackberry. Genet Mol Res. 2014;13:7673–85.
Zawedde BM, Harris C, Alajo A, Hancock J, Grumet R. Factors influencing diversity of farmer’s varieties of sweet potato in Uganda: implications for conservation. Econ Bot. 2014;68:337–49.
Gao SJ, Han JL, Liu AP, Yan ZJ, Chang XQ. RAPD analysis on genetic divergence among populations of Oedaleus asiaticus B. Bienko and Oedaleus infernalis Saussure. Acta Agri Boreali-Sinica. 2011;26:94–100.
Rampersad SN, Perez-Brito D, Torres-Calzada C, Tapia-Tussell R, Carrington VF. Genetic structure and demographic history of Colletotrichum gloeosporioides sensu lato and C truncatum isolates from Trinidad and Mexico. BMC Evol Biol. 2013;13:130.
Futuyma DJ. Sympatric speciation: norm or exception? In: Tilmon KJ. (ed): Specialization, speciation, and radiation: the evolutionary biology of herbivorous insects. Berkeley: Univ. of California Press. 2008. p. 136–48.
Liao H, Guo C. Using SSR to evaluate the genetic diversity of potato cultivars from Yunnan Province (SW China). Acta Biol Crac Ser Bot. 2014;56:16–27.
Leberg PL. Effects of bottlenecks on genetic divergence in populations of the wild Turkey. Cons Biol. 1991;5:522–30.
Slatkin M. Rare alleles as indicators of gene flow. Evolution. 1985;39:53–65.
Magule TO, Tesfaye B, Pagnotta MA, Enrico Pè M, Catellani M. Development of SSR markers and genetic diversity analysis in enset (Ensete ventricosum (Welw.) Cheesman), an orphan food security crop from southern Ethiopia. BMC Genet. 2015;16:98.
Tobiaw DC, Bekele E. Analysis of genetic diversity among cultivated enset (Ensete ventricosum) populations from Essera and Kefficho, southwestern part of Ethiopia using inter simple sequence repeats (ISSRs) marker. Afr J Biotechnol. 2011;70:15697–709.
Wright S. Isolation by distance. Genetics. 1943;28:114–38.
Hartl DL, Clark AG. Principles of population genetics, third edition. Sunderland: Sinauer Associates, Inc. Publishers; Massachusetts. 1997. p. 519.
Slatkin M. Gene flow and the geographic structure of natural populations. Science. 1987;236:787–92.
This work is part of the first author’s PhD thesis. The authors would like to thank Addis Ababa University and Mada Walabu University for material and technical support for this research, and individual farmers for allowing us to collect the tuber samples from their fields. We would also like to thank the Ethiopian Biodiversity Institute (EBI) for allowing the transfer of the study material to Sweden, and the Swedish University of Agricultural Sciences for providing laboratory facilities to conduct this research.
This work was financially supported by the Swedish Research Council (VR) through the Swedish Research Link project and by Addis Ababa University’s Thematic Research Project. The role of the funding bodies was limited to direct funding of the field and laboratory activities that resulted in this manuscript.
Availability of data and materials
Full passport data of the 174 tuber samples and 287 leaf samples representing the 12 populations used in the present study are provided in Additional file 1. Allele frequency distribution and overall percentage of rare alleles (f ≤ 0.01) across populations are provided in Additional file 2. Estimates of the overall Nei’s heterozygosity, population differentiation measures and proportion of progenies produced by selfing are provided in Additional file 3.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Passport data of the 174 tuber samples and 287 leaf samples representing the 12 populations used in the present study. (DOCX 36 kb)
Allele frequency distribution and overall percentage of rare alleles (f ≤ 0.01) across populations. (XLSX 9 kb)
Estimates of the overall Nei’s heterozygosity, population differentiation measures and proportion of progenies produced by selfing. (DOCX 18 kb)
Cite this article
Gadissa, F., Tesfaye, K., Dagne, K. et al. Genetic diversity and population structure analyses of Plectranthus edulis (Vatke) Agnew collections from diverse agro-ecologies in Ethiopia using newly developed EST-SSRs marker system. BMC Genet 19, 92 (2018). https://doi.org/10.1186/s12863-018-0682-z