De novo genome assembly and analysis unveil biosynthetic and metabolic potentials of Pseudomonas fragi A13BB

Objectives: The role of the rhizosphere microbiome in supporting plant growth under biotic stress is well documented. Rhizobacteria ward off phytopathogens through various mechanisms, including antibiosis. We sought to recover novel antibiotic-producing bacterial strains from soil samples collected from the rhizosphere. Pseudomonas fragi A13BB was recovered as part of this effort, and its whole genome was sequenced to facilitate mining for potential antibiotic-encoding biosynthetic gene clusters.
Data description: Here, we report the complete genome sequence of P. fragi A13BB obtained from de novo assembly of Illumina MiSeq and GridION reads. The 4.94 Mb genome consists of a single chromosome with a GC content of 59.40%. Genomic features include 4410 CDSs, 102 RNAs, 3 CRISPR arrays, 3 prophage regions, and 37 predicted genomic islands. Two β-lactone biosynthetic gene clusters were identified; the metabolic products of such clusters are known to show antibiotic and/or anticancer properties. A siderophore biosynthetic gene cluster was also identified, even though P. fragi is considered a non-siderophore-producing pseudomonad. Other gene clusters of broad interest include those associated with bioremediation, biocontrol, plant growth promotion, or environmental adaptation. This dataset unveils various unexplored or underexplored metabolic and biosynthetic potentials of P. fragi and provides insight into the molecular mechanisms underpinning these attributes.

against phytopathogens through a variety of mechanisms, including the ability to gain a competitive advantage for nutrients and trace elements and/or to produce one or more antibiotics effective against such pathogens [1,2]. Whilst the latter characteristic (which is common to many soil-dwelling bacteria) has been exploited to develop many clinically useful antibiotics, less than 1% of all known bacterial species have had their metabolic capabilities exploited in this way [3]. We therefore sought to recover potential novel antibiotic-producing bacterial strains from soil samples collected from the rhizosphere of various plants. Pseudomonas fragi strain A13BB was isolated as part of this effort.
P. fragi is a Gram-negative, rod-shaped, aerobic psychrophile. It is widely distributed in nature and commonly associated with meat and dairy spoilage [4,5]. It is rarely reported as a plant growth-promoting rhizobacterium (PGPR); the exceptions are Selvakumar et al. [5] and Fahr et al. [6], who reported its phosphate solubilisation activity and its ability to improve tolerance against aluminium stress in acidic soils, respectively. To the best of our knowledge, however, it has not previously been reported as an antibiotic producer. Because the species is not readily associated with antibiotic production, the genome of P. fragi A13BB was sequenced to facilitate mining for potential antibiotic-encoding secondary metabolite biosynthetic gene clusters (smBGCs) and other gene clusters that may be responsible for its environmental adaptation and plant growth promotion.

Data description
P. fragi A13BB was isolated from the rhizosphere of a plant in Aberdeen, Scotland (57.101 N 2.078 W) using an ultra-minimal substrate medium (data file 1) [7]. The purified strain was cultivated in nutrient broth (Oxoid, UK) at 28°C for 24 h before gDNA was extracted from pellets with the DNeasy® Ultraclean® Microbial Kit for DNA Isolation (Qiagen, UK). The extract was used as template to amplify the 16S rRNA gene by PCR using the 27F and U1510R universal primers, with thermocycler parameters set as follows: initial denaturation at 95°C for 2 min, followed by 30 cycles of denaturation at 95°C for 30 s, primer annealing at 45°C for 30 s and elongation at 72°C for 105 s. A final elongation was carried out at 70°C for 5 min. The amplified DNA fragment was sequenced using the 27F primer. The isolate was subsequently identified by 16S rRNA gene comparison as P. fragi, with a 99% identity score.
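The thermocycler programme described above can be written down as a small data structure, which makes it easy to check the total programmed hold time. The following is a minimal sketch only: the class and variable names are illustrative assumptions, the temperatures and durations are copied from the text, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold of the thermocycler (illustrative helper type)."""
    name: str
    temp_c: int
    seconds: int


# Values as stated in the text; names are hypothetical.
INITIAL = Step("initial denaturation", 95, 2 * 60)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("primer annealing", 45, 30),
    Step("elongation", 72, 105),
)
FINAL = Step("final elongation", 70, 5 * 60)
N_CYCLES = 30


def total_hold_seconds(initial, cycle, final, n_cycles):
    """Sum of programmed hold times; ramping between steps is not modelled."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total = total_hold_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"{total} s = {total / 60:.1f} min")  # 5370 s = 89.5 min
```

Under these assumptions the programme holds for 120 s + 30 × 165 s + 300 s = 5370 s, i.e. roughly an hour and a half before ramp times are added.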
Illumina reads were trimmed with Trimmomatic [8] v0.36 operated in sliding-window mode with a Q25 quality cut-off and a minimum read length of 100. The quality of the trimmed reads was assessed with FastQC [9] v0.11.8 and the results were aggregated with MultiQC [10] v1.8 (data file 2) [11]. The mean quality score at each base position was ≥31. Quality assessment of GridION reads was performed with NanoPlot [12] v1.28.2. Quality statistics are summarised in data file 3 [13], while the average read quality plot is displayed in data file 4 [14].
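To illustrate the idea behind sliding-window trimming, the sketch below scans a read's Phred scores from the 5' end and cuts the read where the mean quality of a window first drops below the threshold, then applies the minimum-length filter. It is a simplified illustration, not Trimmomatic's own implementation. The Q25 cut-off and the minimum length of 100 come from the text; the window size of 4 is an assumption, since the text does not state it, and the function names are hypothetical.

```python
from statistics import mean


def sliding_window_trim(quals, window=4, min_mean_q=25):
    """Return the number of leading bases to keep.

    The read is cut at the start of the first window whose mean quality
    falls below min_mean_q. window=4 is an assumed value.
    """
    if len(quals) < window:
        # Treat a read shorter than the window as a single window.
        return len(quals) if quals and mean(quals) >= min_mean_q else 0
    for start in range(len(quals) - window + 1):
        if mean(quals[start:start + window]) < min_mean_q:
            return start
    return len(quals)


def passes_length_filter(kept_len, min_len=100):
    """Keep a trimmed read only if it is at least min_len bases long."""
    return kept_len >= min_len


# Toy read: 120 high-quality bases followed by a low-quality tail.
toy_quals = [35] * 120 + [10] * 30
kept = sliding_window_trim(toy_quals)
print(kept, passes_length_filter(kept))  # 118 True
```

In the toy read the cut falls at base 118 rather than 120 because the window mean already drops below 25 once two low-quality bases enter the window; a smaller or larger window would shift that point.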
The complete genome of P. fragi A13BB comprises a single chromosome 4,940,458 bp in size with a GC content of 59.40%. Genomic features include 4410 CDSs, 25 rRNAs, 73 tRNAs, 4 ncRNAs, 3 CRISPR arrays, 3 prophage regions and 37 predicted genomic islands (data file 8) [30]. In addition, 353 subsystems comprising various gene clusters, including those associated with bioremediation, environmental adaptation, biocontrol, and plant growth promotion, were identified (data file 9) [31]. Two β-lactone smBGCs, both showing low homology (20%) to known smBGCs, were identified. β-lactones are known for their antibiotic, anticancer and antiobesity properties [32]. A siderophore smBGC was identified even though P. fragi is considered a non-siderophore-producing member of the genus Pseudomonas [33]. Arylpolyene and NAGGN smBGCs were also identified; along with the siderophore smBGC, these are likely to contribute to the environmental fitness of the strain [34][35][36]. Table 1 provides the links to data files 1-9.
We believe the dataset presented in Pseudomonas fragi strain A13BB chromosome, complete genome [39] and in this data note forms a sound basis for further in-depth study of the metabolic and biosynthetic capabilities of this strain, and indeed of other closely related species. The dataset also provides useful insights into the molecular mechanisms that underpin these capabilities.
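As a small worked illustration of the summary statistics above, the sketch below shows one common way to compute GC content from a nucleotide string and checks that the three RNA classes listed here add up to the 102 RNAs quoted in the abstract. The function name and the toy sequence are illustrative assumptions; this is not the tool the authors used to derive the 59.40% figure.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence for demonstration only (not from the A13BB genome).
print(gc_content("ATGC"))  # 50.0

# RNA counts as reported in the text.
rna_counts = {"rRNA": 25, "tRNA": 73, "ncRNA": 4}
print(sum(rna_counts.values()))  # 102
```

Ambiguous bases such as N are excluded from the denominator in this sketch; other tools may count them differently, which can shift the reported percentage slightly.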
Furthermore, being only the fourth publicly available complete genome sequence of P. fragi, the data will enrich the comparative genomics study of the species.

Limitations
IslandViewer 4 was run with default parameters. Crucially, IslandPick was run with the default comparison genomes; choosing different comparison genomes at different phyletic distances may influence the output of the analysis, i.e., the number of predicted genomic islands.

Authors' contributions
The project was conceived and designed by OKA and AJL. Data acquisition was performed by OKA. Data analysis and interpretation were performed by OKA, NHO, ADS and AJL. The project was jointly supervised by NHO, ADS and AJL. AJL was the principal investigator. The manuscript was written by OKA and revised by NHO, ADS and AJL. All authors read and approved the final manuscript.

Funding
The project was supported by Tenovus Scotland (grant number G16.04). Tenovus Scotland played no role in the design of the study or the collection, analysis, and interpretation of data, or in writing the manuscript.

Availability of data and materials
Data files 1-9 described in this data note can be freely and openly accessed on Figshare (https://figshare.com/) [7,11,13,14,17,19,21,30,31]. Datasets 1 and 2 can be freely and openly accessed on the NCBI database. The Illumina and GridION reads generated have been deposited in the Sequence Read Archive under accession number SRP251948 (Dataset 1) [37]. The genome assembly of P. fragi A13BB has been deposited in GenBank under accession number GCA_015767515.1 (Dataset 2) [38]. The BioProject accession number for the entire project is PRJNA610978. Please see Table 1 and the references for details and links to the data.

Declarations
Ethics approval and consent to participate
Soil sampling was undertaken on private land in Aberdeen, Scotland, UK with full landowner permission.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.