Introduction

1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid (C6H10N2O2, molecular weight 142.16), generally termed 'ectoine', is a natural compatible solute that accumulates in the cytoplasm of salt-loving bacteria (e.g. the genera Ectothiorhodospira and Halomonas), where it serves an osmoregulatory function [6]. The moderately halophilic members of the family Halomonadaceae display osmoadaptation facilitated by solutes such as betaine and ectoine [3] and hydroxyectoine [12]. The family Halomonadaceae comprises a total of 18 child taxa. Of these, the names of 14 child taxa are validly published with their correct name. Another 16 child taxa have validly published names, including synonyms, under the International Code of Nomenclature of Prokaryotes (ICNP). Similarly, the genus Halomonas is currently represented by 114 type strains, with 112 candidates having a validly published and correct name and 10 candidates with synonyms. In addition, three species have orthographic (misspelled) variants, and 18 species names have not been validated under the ICNP [7]. Descriptions of all Halomonas species are available on the LPSN portal managed by the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Germany. Halomonas species are known producers of biotechnologically important ectoine. Ectoine and hydroxyectoine synthesized by Halomonas species accumulate in the cytoplasm and benefit the cell: they act as stress-tolerance chaperones and compatible solutes, stabilize the cell membrane and reduce cell damage [10]. Moreover, ectoine and hydroxyectoine are high-value chemicals exploited in cosmetics, for immune protection, for stabilization of antibodies, as anti-inflammatory and tissue-protective agents, for co-production with the bioplastic polyhydroxybutyrate [8], and as skin anti-aging agents and protectants against harsh conditions such as radiation and extreme temperatures.
Intracellular ectoine protects whole cells and macromolecules under hostile conditions against freezing, drying, high salinity, heat stress, oxygen radicals, radiation and denaturing agents [10]. The varied applications of ectoine produced by Halomonas species reflect the presence of diverse gene profiles and other conserved genes in their genomes. It is therefore important to evaluate the signature genes that code for ectoine synthesis and govern vital biological functions under extreme environmental conditions within the genus Halomonas.

The present study provides a blueprint of the ectoine-coding genes identified in H. elongata. Genome annotations of existing Halomonas spp. have uncovered some common ectoine-coding genes shared among members of the genus. We therefore carried out a genome-wide evaluation of ectoine-coding genes. We also analyzed 32 closely related Halomonas spp., including Halomonas elongata 1H9, whose phylogenetically related ectoine-coding child taxa were inferred using the identified single-copy genes.

Main text

Methods

16S rRNA genes of 128 type strains and genomes of 94 Halomonas spp.

One hundred twenty-eight 16S rRNA gene sequences of type strains and 94 complete genomes and reference sequences of Halomonas spp., deposited between 2006 and 2020, were obtained from LPSN and the NCBI genome database.

Radar chart

Halomonas spp. present multiple quantitative variables per species, in particular variable genome lengths, as data points for visualization. A radar chart makes it easy to compare genome length across species, to spot similar values, and to identify high- or low-scoring outliers within the genus.
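As an illustration of the radar-chart layout, the following minimal sketch places one spoke per genome and closes the outline. The strain names and genome lengths are hypothetical placeholders, not values from this study, and the tool actually used to draw the chart is not stated in the text; matplotlib is assumed here only for the optional drawing step.

```python
import math


def radar_polygon(values):
    """Return (angles, radii) of a closed radar outline, one spoke per value."""
    n = len(values)
    angles = [2 * math.pi * i / n for i in range(n)]
    # Repeat the first point so the outline closes on itself.
    return angles + angles[:1], list(values) + list(values[:1])


def plot_radar(labels, values, path="radar.png"):
    """Draw the chart to a file; matplotlib is imported lazily (assumed installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt

    angles, radii = radar_polygon(values)
    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    ax.plot(angles, radii)
    ax.fill(angles, radii, alpha=0.25)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(labels)
    fig.savefig(path)
    plt.close(fig)


# Hypothetical genome lengths in Mb, for illustration only.
genome_mb = {"strain_A": 4.1, "strain_B": 3.6, "strain_C": 5.2, "strain_D": 3.9}
angles, radii = radar_polygon(list(genome_mb.values()))
```

Calling plot_radar(list(genome_mb), list(genome_mb.values())) would write the figure; genomes whose spoke is much longer or shorter than its neighbours stand out as outliers.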

RAST genome analysis

Complete genome sequences of Ectothiorhodospira haloalkaliphila ATCC 51935 (CP007268), H. elongata 1H9 (NC_014532), Halorhodospira halochloris DSM 1059 (AP017372) and Halorhodospira halophila SL1 (CP000544) were analyzed using RAST v2.0 (https://rast.nmpdr.org/) [11]. The RAST server is a SEED-based prokaryotic genome annotation service of the National Microbial Pathogen Database Resource (NMPDR); it was used to predict subsystem coverage, subsystem category distribution and subsystem feature counts [2].

Identification of protein families and single copy genes

Protein families and single-copy genes in 93 Halomonas spp. were identified using PATRIC 3.6.9 (https://www.patricbrc.org/). Genus-specific protein families (PLfams) were computed with MCL inflation = 3.0 to obtain higher sequence similarity and better specificity for close intra-genus and intra-species comparisons.

Selecting single copy number genes

PLfams of the genes coding for 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid in the 93 Halomonas spp. were extracted. Genes common to the Halomonas species were selected for analysis. The topology of the phylogenetic tree generated from the concatenated sequences was compared with that of the 16S rRNA-based tree of Halomonas child taxa.
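The selection step can be sketched as follows: keep only genes that occur exactly once in every genome, then join the per-gene sequences of each genome in a fixed order. This is a minimal illustration under stated assumptions, not the PATRIC workflow itself; the genome names, gene lists and short sequences are hypothetical, and in practice each gene would be aligned across genomes before concatenation.

```python
from collections import Counter


def single_copy_core(annotations):
    """Return the sorted gene names that occur exactly once in every genome."""
    shared = None
    for genes in annotations.values():
        counts = Counter(genes)
        once = {gene for gene, n in counts.items() if n == 1}
        shared = once if shared is None else shared & once
    return sorted(shared or [])


def concatenate(sequences, gene_order):
    """Join the per-gene sequences of each genome in the given gene order."""
    return {
        genome: "".join(per_gene[gene] for gene in gene_order)
        for genome, per_gene in sequences.items()
    }


# Hypothetical annotations: TeaA is duplicated in genome_1 and absent in genome_3.
annotations = {
    "genome_1": ["EctC", "EctD", "TeaA", "TeaA"],
    "genome_2": ["EctC", "EctD", "TeaA"],
    "genome_3": ["EctC", "EctD", "UspA"],
}
core = single_copy_core(annotations)

# Hypothetical (already aligned) sequences for the core genes.
sequences = {
    "genome_1": {"EctC": "ATGA", "EctD": "CCAT"},
    "genome_2": {"EctC": "ATGC", "EctD": "CCAT"},
    "genome_3": {"EctC": "ATGA", "EctD": "CGAT"},
}
supermatrix = concatenate(sequences, core)
```

Fixing the gene order before concatenation keeps homologous positions in the same columns for every genome, which a tree built from the concatenated sequences requires.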

Phylogeny reconstruction and topology analysis

The evolutionary history of the 128 16S rRNA genes and of 33 Halomonas single-copy genes was inferred using the standalone tool MEGA X with 1000 bootstrap replicates, followed by selection of the best-scoring ML, NJ and ME trees. Evolutionary distances were computed using the Jukes-Cantor method and are in units of the number of base substitutions per site. The closest child taxa of the biotechnologically important ectoine producer H. elongata 1H9 were identified. This supports phylogenetic analysis and topology comparison to delineate the nearest species and the species carrying genes that code for 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid.
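For reference, the Jukes-Cantor distance corrects the observed proportion of differing sites p for multiple substitutions at the same site: d = -(3/4) ln(1 - (4/3) p). The sketch below computes it for two aligned sequences. It is a minimal illustration rather than the MEGA X implementation; skipping gapped sites (pairwise deletion) is an assumption, and the example sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance in base substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once sequences look saturated.
        raise ValueError("distance undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments differing at one of ten sites.
d = jukes_cantor("ACGTACGTAC", "ACGTACGTAA")
```

The corrected distance is always at least as large as p, and the gap between the two grows as sequences diverge.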

Results

Phylogenetic analysis of 16S rRNA genes in the genus Halomonas

H. elongata 1H9 is a bacterium that prefers saline environments and is known as a producer of 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid (ectoine) under extreme environmental conditions.

RAST genome analysis of H. elongata 1H9 shows that its subsystem features comprise various pathways encoded by the bacterium (Additional file 1: Figure S1). In addition, members of the genus Halomonas encode and produce molecular variants of 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid. The ectoine-coding Halomonas species might therefore form a distinct cluster with similar Halomonas species. Accordingly, phylogenetic analysis of the 16S rRNA sequences of type strains in the genus Halomonas revealed that type strains AJ261, 1H9, M8, 5-3, RS-16, AAD6, SS20, 11S, NTU-107, TBZ21, 5CR, F8-11, SL014B-69, TBZ202, KCTC 42685, Z-7009, SL014B-85, CIP 105456, 204, KMM 1376, 10-C-3, Hwa etc. (Additional file 2: Figure S2) formed a discrete cluster among the extracted sequences. This suggests that species with a similar gene pool were grouped in one cluster regardless of their genome length. Variation in some branches may result from the use of the 16S rRNA gene alone for phylogenetic analysis. Hence, members of the genus Halomonas might share similar single-copy ectoine-coding genes in addition to the 16S rRNA gene.

Identification of protein families, single copy genes and Pearson correlation

Whole-genome analyses and annotation have resolved the mystery of the unique genes distributed among Halomonas spp. The radar chart shows that the existing genomic data for Halomonas spp. comprise complete genome sequences, reference genomes and some scaffolds (Additional file 3: Figure S3, Additional file 6: Table S1). The available genomic sequence data show a similar gene pool, but not all of the 93 type strains carry the full set of ectoine-coding genes. To resolve this issue and find the relevant species in the genus Halomonas, we annotated all genomes and identified the single-copy genes that code for ectoine. A few Halomonas species were found to possess more than 11 single-copy ectoine-coding genes. The inferred ML tree (Additional file 4: Figure S4) shows that the type strains carrying the ectoine biomarker (1H9, F9-6, AJ261, SP4, ACAM 71, 62, Hb3, DSM 15911, N12, NTU-107, G-16.1, ZJ2214, TBZ3, M29, 79, BJGMM-B45, LCB169, CFH 9008, AIR-2, DQD2-30, 4A, SL014B-69, TBZ202, DX6, 9-2 and MC28) are more or less the same representative species recovered with the concatenated sequences of 32 Halomonas species (Fig. 1). Of the 93 annotated genome sequences, 31 + 1 (32) species have 11 ectoine-coding genes (DoeA-DoeC-DoeX-EctC-EctD-EutB-EutC-TeaA-TeaB-TeaC-UspA) as single-copy genes (Additional file 5: Figure S5; Table 1). The heatmap of the 11 ectoine-coding genes shows a high degree of Pearson correlation (Fig. 2), with values lying between 0.50 and ±1 (0 = no correlation, 1 = high degree of correlation).
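The Pearson coefficient behind such a heatmap can be sketched as below: each gene is described by a numeric profile across genomes, and every pair of profiles is correlated. This is a minimal sketch under assumptions; the profile values are hypothetical, and the exact quantity correlated in the heatmap of this study is not specified in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def correlation_matrix(profiles):
    """Pairwise Pearson coefficients as a nested dict keyed by gene name."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical per-gene profiles across four genomes, for illustration only.
profiles = {
    "EctC": [98.0, 95.0, 90.0, 85.0],
    "EctD": [97.0, 93.0, 91.0, 84.0],
    "UspA": [80.0, 82.0, 88.0, 91.0],
}
matrix = correlation_matrix(profiles)
```

Values near +1 or -1 indicate strongly coupled profiles and values near 0 indicate no linear relationship; rows of this matrix are what an average-linkage clustering would then group.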

Fig. 1

Maximum-likelihood (ML) analysis of concatenated sequences of 11 genes (DoeA-DoeC-DoeX-EctC-EctD-EutB-EutC-TeaA-TeaB-TeaC-UspA) from 32 Halomonas species in MEGA X. The evolutionary distances were computed using the Jukes-Cantor method and are in units of the number of base substitutions per site. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches

Table 1 Functions of ectoine-coding genes in the Genus Halomonas under different scenarios
Fig. 2

Heatmap of the 11 ectoine-coding genes in Halomonas spp., showing genome and protein pairwise average linkage using Pearson correlation. The heatmap was inferred from the annotated genes of the Halomonas species investigated in this study

Novel Universal stress protein in Ectoine TRAP cluster (UspA) and resistance mediated by UspA gene

Genome sequence analyses and examination of the various ectoine-coding genes in Halomonas spp. revealed that the type strains H. aestuarii Hb2 (NZ_CP018139), H. anticariensis DSM 16096 (GCF_000409775), H. azerbaijanica TBZ202 (GCF_004551485), H. bachuensis DX6 (GCA_011742165), H. beimenensis NTU-111 (NZ_CP021435), H. campisalis SS10-MC5 (NZ_CP065435), H. caseinilytica DSM 18067 (GCF_001662285), H. cerina CECT 7282 (GCF_014192215), H. cupida (GCF_900142755), H. daqingensis CGMCC 1.6443 (GCF_900108215), H. denitrificans DSM 18045 (GCF_003056305), H. endophytica MC28 (GCF_002879615), H. eurihalina MS1 (GCF_008274785), H. gudaonensis (GCF_900100195), H. halmophila NBRC 15537 (GCF_006540005), H. heilongjiangensis 9-2 (GCF_003202165), H. huangheensis BJGMM-B45 (NZ_CP013106), H. kenyensis DSM 17331 (GCF_013697085), H. korlensis CGMCC 1.6981 (GCF_900116705), H. lactosivorans KCTC 52281 (GCF_003254665), H. litopenaei SYSU ZJ2214 (GCF_003045775), H. niordiana ATF 5.4 (GCF_004798965), H. organivorans CECT 5995 (GCF_014192055), H. pacifica (GCF_007989625), H. qijiaojingensis KCTC 22228 (GCF_014651875), H. saliphila LCB169 (GCF_002930105), H. stenophila CECT 7744 (GCF_014192275), H. taeanensis (GCF_900100755), H. urmiana TBZ3 (GCF_005780185), H. ventosae (GCF_004363555), H. xinjiangensis TRM 0175 (GCF_000759345) and H. zincidurans B6 (GCF_000731955) possess the conserved UspA superfamily gene, suggesting that the UspA gene/domain has been inherited from an ancient protein family found in primitive bacteria. The UspA protein supports Halomonas species in functioning and producing ectoine in saline environments under stressful conditions such as high salt, low water activity and low temperature. The finding of the stress protein UspA in 32 species is therefore a new report for the genus Halomonas.

Moreover, ectoine and its derivatives have been investigated by various groups worldwide for their biotechnological applications. For instance, a few reports suggest that ectoine or ectoine derivatives have been used in oral care, for vulvovaginal conditions and in some cosmetic formulations to protect against cell damage and prevent microbial infections. Reports also suggest that ectoine and ectoine derivatives in combination with natural essential oils were employed as an effective solution against pathogenic Pseudomonas aeruginosa [1] and antifungal-resistant Candida strains causing candidiasis [4, 5]. From a biotechnological perspective, ectoine and its derivatives may therefore have applications against antimicrobial-resistant and multidrug-resistant microorganisms.

Conclusion

Ectoine signatures can be found in the 93 publicly available Halomonas genome sequences. Thirty-two Halomonas species carry 11 distinct ectoine genes as single copies in their genomes, which help Halomonas spp. produce ectoine under stressful conditions. Based on the existing genomic data, H. elongata 1H9 was found to have ectoine-producing machinery distinct from that of other Halomonas species. The existence of 11 distinct genes in 32 species, including the UspA gene, suggests that Halomonas species evolved directly from their primitive ancestor, shedding light on their evolutionary significance.

Limitations

A possible limitation is that biomarkers other than the known ectoine-coding genes may contribute to the production of 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid by Halomonas spp.