Rice of Northeast India harbor rich genetic diversity as measured by SSR markers and Zn/Fe content

Background: Rice (Oryza sativa L.) is one of the most important crops of the world and a major staple food for half of the world's human population. The Northeastern (NE) region of India lies in the Indo-Burma biodiversity hotspot and about 45% of the total flora of the country is found in the region. Local rice cultivars from different states of NE India were analyzed for genetic diversity and population structure using microsatellite markers, and for their zinc and iron content.
Results: A total of 149 bands were detected using twenty-two microsatellite markers comprising both random and trait-linked markers, showing 100% polymorphism and high values of expected heterozygosity (0.6311) and polymorphism information content (0.5895). The Nali dhan cultivar of Arunachal Pradesh possessed the highest genetic diversity (0.3545) among the studied populations, while Moirangphou Khonganbi of Manipur exhibited the lowest (0.0343). The model-based population structure analysis grouped all 65 studied rice cultivars into two clusters. Cluster I was represented by 36 cultivars and cluster II by 29 cultivars. The Badalsali cultivar of Assam possessed the highest Zn content (75.8 μg/g) and Kapongla from Manipur the lowest (17.98 μg/g). The highest and the lowest Fe contents were found in Fazu (215.62 μg/g) and Idaw (11.42 μg/g) of Mizoram, respectively.
Conclusion: The results suggest that rice cultivars of NE India possessing high genetic diversity (Nali dhan), high Zn (Badalsali) and high Fe (Fazu) content can be useful as sources of germplasm for future rice improvement programs.


Background
Rice (Oryza sativa L.) is the staple food for more than 50% of the world's population. Asia alone accounts for more than 90% of global rice production and consumption [1]. It is imperative to develop measures to improve global rice production to ensure food security for a growing human population. Although rice production has increased about twofold in the past few decades with the introduction of improved varieties and proper crop management strategies, the need for better, high-yielding varieties remains. Bouis and Welch [2] suggested that increased rice productivity and the ability to deliver all the essential nutrients are crucial to meet both the energy needs and the nutritional health of people in developing countries. Kennedy et al. [3] reported that more than two billion people are affected by iron (Fe), iodine (I), zinc (Zn), and vitamin A deficiencies, especially in poor families of developing countries, among whom more than five million children die every year due to nutrient malnutrition [2]. Fe and Zn are essential micronutrients for all forms of life due to their functional importance in cell development and gene expression [4,5]. Zn deficiency is known to be one of the most important malnutrition problems [6]. The effects of Zn deficiency include growth retardation, diarrhea, emotional disorders, reduction or absence of hormone secretion in male adolescents, rough skin, poor appetite, mental lethargy, delayed wound healing, weight loss, etc. [7]. Fe deficiency is associated with blood loss, malabsorption, chronic diseases, genetic disorders, etc. [8][9][10][11]. Increased Zn and Fe uptake is required during crucial metabolic periods such as early human growth and pregnancy, so children and pregnant women are at higher risk of deficiency in these nutrients [6,12,13].
It has been suggested by rice workers that the development of rice varieties with higher nutrient content may improve the nutritional health of people whose major diet is rice.
The Northeastern states of India, comprising Arunachal Pradesh, Assam, Manipur, Meghalaya, Mizoram, Nagaland, and Tripura, are bounded by Bhutan and China in the north, Bangladesh in the southwest and Myanmar in the east. This region forms part of the Indo-Burma biodiversity hotspot [14] and is inhabited by various ethnic groups who speak different dialects and follow different cultural practices. The topography and biogeography of the region make it both picturesque and rich in floral and faunal biodiversity. About 45% of the total flora of the country is found in the region [15]. This region harbors the richest reservoir of genetic diversity for agri-horticultural crops. Rice cultivation provides the main source of food and employment for the people of this region, as most of the population is involved in agriculture and allied activities. About 72% of the total cultivated area is under agricultural cultivation in upland, lowland, and water-fed areas [15]. Although a large number of rice cultivars are available, most of the rice cultivated in the region consists of high yielding varieties (HYV) developed using modern genetic engineering tools. This trend implies a possible narrowing of the natural gene pool. However, many indigenous farmers of the hilly areas still cultivate their own landraces or cultivars, inherited from their forefathers, which are adapted to the local microclimate. These local landraces are also culturally important to these communities.
Knowledge of the extent of genetic variation and the relationships among genotypes is necessary for developing more effective breeding and conservation programs [16]. Understanding and utilizing the genetic diversity in crop plants is crucial for sustaining the increasing global and local food demands [17]. Assessment of the genetic diversity of local rice landraces or cultivars provides a valuable resource for crop improvement programs, Integrated Pest Management (IPM) measures and the sustainable development of agriculture. Rice varieties of this region possess unique traits which are of great interest to plant breeders. Some of the useful qualities identified in these landraces include unique adaptive traits for cold tolerance, flooding and salt tolerance, etc. [15]. Many molecular markers have been used to assess genetic diversity within and between populations. Among them, microsatellites or SSRs (simple sequence repeats) are among the most preferred for assessment of genetic diversity because they are reliable, rapid, easy to score, cost-effective and require only a small amount of DNA [18,19]. The present study was performed to assess the genetic diversity of the local rice landraces of the Northeast states of India using SSR markers, with two aims: i) to estimate the Zn and Fe contents and ii) to facilitate conservation and utilization of these landraces.

Table 1 Zn and Fe content, gene diversity and percentage polymorphism in the studied cultivars. For each parameter, the highest value cell is indicated in red and the lowest value cell in green.

Zn and Fe content
Table 1 summarizes the Zn and Fe content of the rice cultivars used in this study. Zn content in the studied cultivars ranged from 17.98 μg/g to 75.8 μg/g with an average of 36.65 μg/g (Table 1). The Badalsali cultivar of Assam possessed the highest Zn content and Kapongla of Manipur the lowest. The mean Zn content of the Northeast rice cultivars (38.55 μg/g) was higher than that of the improved varieties (32.17 μg/g) used in the current investigation.
Fe content ranged from 11.42 μg/g (Idaw) to 215.62 μg/g (Fazu) with an average of 59.29 μg/g (Table 1). As with Zn, the mean Fe content of the Northeast rice cultivars (62.9 μg/g) was higher than that of the improved varieties (39.41 μg/g) in the present study.
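As a quick consistency check, the overall Fe mean can be recovered from the two group means by weighting each by its group size. The sketch below assumes 55 Northeast cultivars and 10 improved varieties (the 5 indica and 5 japonica checks mentioned in the Discussion); the function name `pooled_mean` is hypothetical and not part of the original analysis.

```python
def pooled_mean(group_means, group_sizes):
    """Weighted mean of group means, weighting each group by its size."""
    total = sum(m * n for m, n in zip(group_means, group_sizes))
    return total / sum(group_sizes)

# Fe group means reported in the text (ug/g).
# Group sizes are an ASSUMPTION: 55 Northeast cultivars, 10 improved varieties.
fe_overall = pooled_mean([62.9, 39.41], [55, 10])
print(round(fe_overall, 2))
```

Under these assumed group sizes, the pooled value rounds to the 59.29 μg/g average reported for Fe.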

SSR polymorphism
The agarose gels showing banding patterns of some rice cultivars are presented in Figs. 1 and 2. Table 2 shows a summary of the genetic markers used in the current study. A total of 149 bands were detected using twenty-two SSR primers. All twenty-two SSR markers were found to be polymorphic (100% polymorphism). The average number of alleles per locus was 6.7727; the maximum number of bands (12) was generated by RM223 and the minimum (2) by RM315. The mean number of effective alleles was found to be 3.  Table 2.
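The summary statistics above follow from simple counts. The sketch below reproduces the mean number of alleles per locus and the percentage polymorphism from the reported totals, and shows the standard formula for the effective number of alleles at one locus, 1 / Σp². The allele frequencies passed to `effective_alleles` are hypothetical illustration values, not data from this study.

```python
def effective_alleles(freqs):
    """Effective number of alleles at one locus: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

# Totals reported in the text.
total_bands = 149
n_loci = 22
polymorphic_loci = 22

mean_alleles_per_locus = total_bands / n_loci
percent_polymorphism = 100.0 * polymorphic_loci / n_loci

# HYPOTHETICAL allele frequencies for a single three-allele locus.
example_ne = effective_alleles([0.5, 0.3, 0.2])

print(round(mean_alleles_per_locus, 4), percent_polymorphism)
print(round(example_ne, 4))
```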

Population structure analysis
The model-based population structure analysis using STRUCTURE showed that the highest value of ΔK was at K = 2 (Fig. 3), grouping all 65 studied rice cultivars into two clusters (Fig. 4), designated here as cluster I and cluster II. Principal Coordinates Analysis (PCoA) also separated the cultivars into two groups. Cluster I was represented by 36 cultivars and cluster II by 29 cultivars. In the UPGMA tree, cluster I was subdivided into four groups comprising rice cultivars of Manipur, Assam and Arunachal Pradesh and the japonica varieties. Cluster II could also be subdivided into four groups comprising rice cultivars of Mizoram, Nagaland and Meghalaya and the indica varieties. Analysis of molecular variance (AMOVA) showed that 73% of the genetic variation in the 65 rice cultivars was distributed among populations and 27% within populations. The average distance (expected heterozygosity) between individuals in the same cluster was 0.5197 in cluster I and 0.5686 in cluster II. The Fst values of cluster I and cluster II were 0.2635 and 0.2107, respectively, with an average of 0.2371. The mean alpha value was 0.0663.
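For readers unfamiliar with how ΔK selects the number of clusters, the following is a minimal sketch of the calculation, assuming the form ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)) computed from the mean log-likelihood of replicate runs at each K. The log-likelihood values are hypothetical and chosen only to illustrate a peak at K = 2; they are not output from this study, and the function name `delta_k` is an illustrative choice.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: deltaK} from {K: [ln P(D) of replicate runs]}.

    deltaK is only defined for K values that have both neighbours
    (K-1 and K+1) and a non-zero standard deviation across runs.
    """
    ks = sorted(lnp_by_k)
    m = {k: mean(lnp_by_k[k]) for k in ks}
    s = {k: stdev(lnp_by_k[k]) for k in ks}
    result = {}
    for k in ks:
        if (k - 1) in m and (k + 1) in m and s[k] > 0:
            result[k] = abs(m[k + 1] - 2 * m[k] + m[k - 1]) / s[k]
    return result

# HYPOTHETICAL ln P(D) values for three replicate runs per K.
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4200.0, -4201.0, -4199.0],
    3: [-4150.0, -4160.0, -4140.0],
    4: [-4120.0, -4150.0, -4090.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The K with the largest ΔK is taken as the most likely number of clusters; the first and last K tested cannot be evaluated because they lack a neighbour.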
Comparative analysis of gene diversity, Zn and Fe content showed that there was no significant correlation among the three parameters. However, a few cultivars with high gene diversity also showed higher Zn and Fe content (Table 3, Fig. 7).

Discussion
In this study, the genetic structure and diversity of 55 indigenous rice cultivars of Northeast India, together with 5 indica and 5 japonica test varieties, were analyzed using twenty-two SSR markers comprising 9 random and 13 trait-linked markers, and their Zn and Fe contents were estimated. Assessments of the genetic diversity of NE rice using molecular markers have been reported previously [20][21][22][23][24][25]. Though high genetic diversity was previously shown in NE rice accessions, reports on micronutrient diversity are scarce. Deficiencies of Zn and Fe constitute the two most common nutrient deficiencies in humans [23,26,27], especially in developing countries [28]. Although rice is a major staple food for a large part of the world, especially in Asia, it has been reported to be a poor source of essential micronutrients and vitamins [29]. In the current study, relatively high Zn and Fe contents were detected in some of the cultivars. The Fe content in the present study was higher than that of rice cultivars of West Bengal and adjoining areas, though the Zn content was lower [23]. High Fe content was also previously reported in Indian cultivars by Brar et al. [30]. The Zn content was higher and the Fe content lower than in a previous report on local rice germplasm of Tripura state [31]. Average Zn and Fe contents in the present study were comparable with previous reports [30,32]. In another report by Verma and Srivastav [33], among some aromatic and non-aromatic Indian rice cultivars, aromatic rice had higher Zn and Fe contents. Interestingly, Zn and Fe contents in the current study were higher than those reported by Verma and Srivastav [33].
The present study will therefore be helpful for designing crop improvement programs aimed at overcoming micronutrient deficiencies, though more investigations are still needed to identify cultivars with even higher Zn and Fe contents, since these micronutrients are essential for human health and development.
The NE rice cultivars contain considerable genetic diversity and variable traits which might be good sources for various improvement programs [20]. All SSR markers used in the present study were found to be polymorphic. A combination of random and trait-linked markers was utilized because Yadav et al. [34] reported that trait-linked markers gave higher values of genetic diversity and Polymorphism Information Content (PIC) than random markers in some Indian rice germplasm, whereas several other workers have shown high genetic diversity in NE rice cultivars using random markers [20,21,24]. The number of alleles per locus found here (6.7727) was higher than that reported earlier by Upadhyay et al. [35] (3.96 alleles per locus) and lower than that reported by Choudhury et al. [20] (13.57 alleles per locus). However, it was comparable with the 7.9 alleles per locus reported by Das et al. [21]. The mean expected heterozygosity (HE) and PIC found in the present study were both high. The mean Fst values for all loci and between the two clusters were found to be 0.7786 and 0.1987, respectively, indicating very high genetic differentiation among loci and among the clusters. Based on SSR analysis, there were seventeen highly informative markers (PIC > 0.50), viz., RM1, RM154, RM131, RM135, RM72, RM171, RM287, RM3825, RM246, RM260, RM525, RM219, RM223, RM8094, RM493, RM3412 and RM169; three informative markers (PIC between 0.25 and 0.50), RM153, RM125 and RM302; and two slightly informative markers (PIC < 0.25), RM315 and RM443 [36,37].
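The PIC-based classification used above can be written as a small rule. The sketch below uses the thresholds stated in the text; how values falling exactly on 0.25 or 0.50 are handled is an assumption, since the text does not specify it, and the function name `classify_pic` is hypothetical.

```python
def classify_pic(pic):
    """Classify a marker by its PIC value using the thresholds in the text.

    ASSUMPTION: values exactly equal to 0.25 or 0.50 are placed in the
    middle ("informative") class; the source does not define the boundaries.
    """
    if pic > 0.50:
        return "highly informative"
    if pic >= 0.25:
        return "informative"
    return "slightly informative"

# The mean PIC reported in the abstract (0.5895) falls in the top class.
print(classify_pic(0.5895))
```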
Population structure analysis using STRUCTURE showed the highest ΔK value at K = 2, revealing that the 65 studied rice cultivars were grouped into two clusters. The number of clusters agrees with previous studies: two clusters among 29 varieties of cultivated rice of NE India [20] and two clusters among 6 landraces of the North-Western Indian Himalayas [38]. Roy et al. [24] also reported a similar result of K = 2 among hill rice of Arunachal Pradesh, NE India, belonging to indica and japonica. In the current study, the two main clusters identified can also be divided into sub-clusters corresponding to state-wise grouping. A similar state-wise grouping was also observed in aromatic rice germplasm from North Eastern India [13]. According to Evanno et al. [39], an alpha value close to zero indicates that most individuals are from one population or another, whereas an alpha value greater than 1 indicates that most individuals are admixed. The small alpha value observed in this study (0.0663) might therefore indicate that most of the individuals originated from one population or another.
In some areas of NE India, rice has been cultivated in shifting or jhum lands which depend only on the monsoon rain. These cultivars survive long spells of rainless weather and may be good candidates in which to look for such adaptive traits. Other important traits include dark color and aroma in Chakhao rice of Manipur, resistance against blast, resistance to gall midge, deep water tolerance in Baon of Assam, drought resistance in Hmawrhang of Mizoram, etc. [15,40,41]. As is evident from the current study, the genetic diversity of indigenous rice cultivars was higher than that of agronomically improved varieties. These results agree with a similar pattern observed for rice varieties of the Eastern Himalayan region of Northeast India [20]. The use of such genetic variability in breeding programs is a key factor for crop improvement [42]. Among the studied rice cultivars, the Nali dhan cultivar of Arunachal Pradesh possessed the highest genetic diversity, followed by the Vak and Boleng ammo cultivars. These cultivars with high genetic diversity are promising candidates as sources for effective breeding or future rice improvement programs. However, some cultivars such as Moirangphou khonganbi and Moirangphou possessed a low level of genetic diversity, suggesting that action should be taken to conserve these landraces. Cultivars such as Vak, Bogaahoo and Tsulu tsuk possessed high genetic diversity and high Zn concentration. Similarly, Kawnglawng, Jakjatsuk, Yarte, Mezamew, etc. possessed high Fe content and high genetic diversity. Nali dhan, the cultivar with the highest genetic diversity, also possessed Zn and Fe contents higher than the average observed for the studied populations. Lumre also possessed high genetic diversity, high Zn and average Fe content. Badalsali, the cultivar with the highest Zn content, possessed a lower genetic diversity than the average of all the studied populations.
Similarly, the Fazu cultivar with the highest Fe content showed lower genetic diversity than the average of all the studied populations. The present investigation showed that the majority of the cultivars with high genetic diversity had high Zn contents and many cultivars also exhibited high genetic diversity along with high Fe content.

Conclusion
The current study provides a better understanding of genetic structure, diversity, and micronutrient (Zn and Fe) richness in the indigenous rice cultivars of NE India. The cultivars possessing high genetic diversity (Nali dhan), high Zn (Badalsali) and Fe (Fazu) contents are promising candidates as parental lines for future rice breeding programs. These findings will further facilitate the conservation strategies and utilization of these landraces for developing sustainable rice improvement programs.

Plant material, collection and planting
Rice landraces were collected from six states of NE India. Details of the collection sites are shown in Table 4. Indica and japonica check varieties were kind gifts from ICGEB, New Delhi, ABF, Hyderabad and ICAR, Kolasib. For isolation of DNA, individual cultivars were planted in polypots at the Department of Botany, Mizoram University, India.

Estimation of Zn and Fe content
Dehusked rice seeds were crushed into a fine powder using a mortar and pestle. The powdered sample (0.1 g) was placed in a 100 ml conical flask and 20 ml of nitric acid (HNO3) was added to it. The mixture was kept on a hot plate until the fuming of nitrogen dioxide ceased. Another 20 ml of HNO3 was added and the samples were kept on the hot plate at a high temperature until the solution turned colourless. Then hydrogen peroxide (H2O2) was added to make the solution colourless. The mixture was heated until the solution was reduced to 3-5 ml. This extract was diluted to 20 ml with de-ionized water and then filtered through Whatman No. 1 filter paper.
The extract was then analyzed in an Atomic Absorption Spectrophotometer (Shimadzu AA-7000, Japan) and the results were expressed in μg/g.
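The conversion from an instrument reading to a content per gram of sample can be sketched as follows. This assumes the spectrophotometer reports the element concentration of the diluted extract in μg/ml from a calibration curve, and that no blank correction or further dilution is applied; neither detail is stated in the text. The reading of 0.25 μg/ml is a hypothetical value, and `content_ug_per_g` is an illustrative name.

```python
def content_ug_per_g(reading_ug_per_ml, final_volume_ml=20.0, sample_mass_g=0.1):
    """Convert an extract concentration (ug/ml) to content in the sample (ug/g).

    Defaults follow the protocol in the text: 0.1 g of rice powder,
    extract made up to 20 ml. ASSUMPTION: no blank correction and no
    additional dilution factor.
    """
    return reading_ug_per_ml * final_volume_ml / sample_mass_g

# HYPOTHETICAL instrument reading of 0.25 ug/ml.
zn_content = content_ug_per_g(0.25)
print(zn_content)
```

With this sample mass and final volume, each 1 μg/ml in the extract corresponds to 200 μg/g in the rice powder.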

Genomic DNA isolation and PCR amplification
Genomic DNA was isolated from 15-day-old seedlings following Edwards et al. [43]. A single leaflet of a 15-day-old seedling was used for isolation of DNA. The leaflet was macerated using a micropestle in a 1.5 ml centrifuge tube. After maceration, 400 μl of extraction buffer (200 mM Tris HCl pH 7.5, 250 mM NaCl, 25 mM EDTA, 0.5% SDS) was added to the tube. The sample was then vortexed vigorously for 1 min and centrifuged at 13,000 rpm for 5 min. Then, 300 μl of the supernatant was transferred to a fresh centrifuge tube and an equal volume of isopropanol was added. The samples were kept at room temperature for 2 min and then centrifuged at 13,000 rpm for 5 min. The resulting pellets were air dried at room temperature and dissolved in 100 μl TE buffer (10 mM Tris, 1 mM EDTA). Twenty-two simple sequence repeat (SSR) primers (Table 5) were used for amplification of genomic DNA. Amplification was performed in an ABI Veriti 96-well thermal cycler (ABI, USA) in a 25 μl reaction containing 1X PCR buffer, 100 μM dNTP mixture, 3 mM MgCl2, 1 U Taq polymerase (Genie, India), 50 ng of each primer and 50 ng template DNA. The amplification conditions were: initial denaturation at 94°C for 5 min; 35 cycles of denaturation at 94°C for 30 s, annealing for 30 s and extension at 72°C for 1 min; followed by a final extension at 72°C for 7 min. The amplified products were electrophoresed on 2.5% agarose gel and visualized by standard ethidium bromide staining [43,44].
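The reagent volumes for one 25 μl reaction follow from C1V1 = C2V2. The sketch below is illustrative only: the stock concentrations (10X buffer, 10 mM dNTP mix, 25 mM MgCl2) are assumptions, since the text gives final concentrations but not stock concentrations, and it further assumes that the PCR buffer itself contributes no MgCl2.

```python
def stock_volume(final_conc, stock_conc, final_volume_ul):
    """Volume of stock (ul) needed so that C_stock * V_stock = C_final * V_final.

    final_conc and stock_conc must be in the same unit.
    """
    return final_conc * final_volume_ul / stock_conc

REACTION_UL = 25.0  # reaction volume stated in the text

# Final concentrations are from the text; stock concentrations are
# HYPOTHETICAL and should be replaced with those of the reagents in hand.
volumes_ul = {
    "buffer": stock_volume(1.0, 10.0, REACTION_UL),   # 1X from 10X stock
    "dNTP": stock_volume(0.1, 10.0, REACTION_UL),     # 100 uM from 10 mM stock
    "MgCl2": stock_volume(3.0, 25.0, REACTION_UL),    # 3 mM from 25 mM stock
}

for reagent, vol in volumes_ul.items():
    print(f"{reagent}: {vol:.2f} ul")
```

The remaining volume is made up of primers, template DNA, Taq polymerase and water, whose volumes depend on stock concentrations not given in the text.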

Genetic data analysis
Bands were scored using Alpha View software (Alpha Imager, Protein Simple, USA). The total number of alleles, number of effective alleles, number of polymorphic loci, observed and expected heterozygosity, Nei's genetic diversity [45], Fst, and population-wise diversity were calculated using the genetic analysis package POPGENE 1.31 [46]. Major allele frequency (MAF) and polymorphism information content (PIC) were calculated using PowerMarker 3.25 [47]. Analysis of molecular variance (AMOVA) and principal coordinates analysis (PCoA) were performed in GenAlEx 6.5 [48]. The unweighted pair group method with an arithmetic mean (UPGMA) … [50]. For the model-based population structure analysis, the length of the burn-in period was set to 100,000 and the number of Markov Chain Monte Carlo (MCMC) repeats after burn-in was set to 100,000. The possible number of subpopulations (K) was tested from K = 1 to K = 10. Structure Harvester [51] was used to find the final K value. Then, the relationships among genetic diversity (gene diversity), Zn and Fe contents were assessed using STATISTICA 5.0 (Statsoft Inc., USA, 1995).
Chr. no., chromosome number; Ta, annealing temperature
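For reference, the two per-locus statistics most often quoted from such packages can be computed directly from allele frequencies. The sketch below uses the textbook definitions, expected heterozygosity HE = 1 − Σp_i² and PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j²; it omits any small-sample correction that a given package may apply, so it should not be taken as the exact computation performed by POPGENE or PowerMarker. The allele frequencies are hypothetical.

```python
from itertools import combinations

def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity (gene diversity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross

# HYPOTHETICAL allele frequencies at one locus (must sum to 1).
freqs = [0.5, 0.3, 0.2]
he_value = expected_heterozygosity(freqs)
pic_value = pic(freqs)
print(round(he_value, 4), round(pic_value, 4))
```

PIC is always at most equal to HE, and the two values converge as the number of equally frequent alleles grows.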