Introduction

Synthetic biology aims to engineer new, or modify existing, cellular functions by designing, editing, and assembling the underlying genetic material. To achieve this goal, knowledge-based designs have traditionally been used, in which genes of known molecular function are pieced together to build the targeted cellular function. However, identifying all the necessary genetic components is not trivial and requires large-scale analyses such as genome-wide knockout studies and Tn-Seq1,2. This approach also calls for serial steps of genetic transformation with separate testing of each individual gene of the targeted function3. For example, the development of the biosynthetic pathway of the antimalarial drug artemisinin demanded extensive research and engineering efforts4,5,6. To circumvent these difficulties, computational tools have been developed to supplement experimental procedures and aid design7,8,9. More importantly, recent advances in data analytics applied to large transcriptomic datasets have enabled an alternative approach. Using Independent Component Analysis (ICA), we can now identify sets of independently modulated genes (called iModulons) that constitute a particular cellular function10.

This advancement opens up the possibility to identify targeted functions in a particular strain and transfer them into alternate hosts. Successful cross-species transfer of desired functions would thus require: (i) the identification of the full genetic basis for the trait, (ii) the use of recombinant or DNA synthesis methods to capture these genes into a plasmid, (iii) the transfer of the plasmid into the target host, and (iv) making any needed changes to the new host that are critical to accommodate the transferred function. All these capabilities now exist, with (ii) and (iii) representing known approaches, while (i) and (iv) require novel transcriptomic analysis and the use of automated adaptive laboratory evolution.

ICA can be applied to find source regulatory signals in bacterial transcriptomes10,11,12,13. iModulons are fundamental units of bacterial transcriptomes that have been found to represent the genetic basis for various cellular functions and are associated with particular transcriptional regulator(s)14,15,16. Many identified iModulons include genes that are distantly located on the genome, genes of unknown functions, or contain accessory genes that augment the targeted cellular function. iModulons thus represent an advanced scale of synthetic biology to transfer naturally evolved traits across species.
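The idea of iModulon membership can be illustrated with a small sketch. ICA assigns each gene a weight in every independent component, and genes whose weights lie beyond a threshold are treated as members (the gray threshold lines in the gene-weight plots). The example below is only an illustration under stated assumptions: the weights are hypothetical placeholder values rather than measurements from this study, and the fixed cutoff of 0.1 is assumed for demonstration, since the text does not state how the thresholds were derived.

```python
def imodulon_members(weights, threshold):
    """Return genes whose absolute weight exceeds the threshold, strongest first."""
    members = [gene for gene, w in weights.items() if abs(w) > threshold]
    return sorted(members, key=lambda gene: abs(weights[gene]), reverse=True)


# Hypothetical gene weights for one independent component (NOT data from the study).
example_weights = {
    "vanA": 0.41,
    "vanB": 0.38,
    "vanK": 0.29,
    "galP-IV": 0.22,
    "geneX": 0.03,   # placeholder non-member genes
    "geneY": -0.02,
    "geneZ": 0.01,
}

# The cutoff is an assumed value chosen only for this illustration.
members = imodulon_members(example_weights, threshold=0.1)
print(members)
```

Using the absolute value keeps genes with strongly negative weights as members, because a gene can be co-regulated with an iModulon in the opposite direction.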

Here, we use iModulons to create cellular functions in a new host. We show that transferring iModulons is superior to using operons or single genes identified by genome annotation algorithms and rational design approaches9,17,18. In some cases, the transferred function may not work optimally in the new host. In such cases, we can use adaptive laboratory evolution (ALE) to enable the host to optimally use the new function under selection pressure. We thus demonstrate that cross-species iModulon transfer is a versatile tool for synthetic biology.

Results

Cross-species transfer of Pseudomonas iModulons into E. coli

To initiate the project and prior to implementing cross-species iModulon transfer, we refactored a known cellular function within the original host as a proof of concept. Successful homologous refactoring and complementation of E. coli’s branched-chain amino acid (BCAA) metabolism was achieved (Supplementary Note, section 1 and Supplementary Fig. 1), demonstrating identification, reconstruction, and transfer of the genetic constituents of a biological function based on an iModulon (steps i, ii, and iii). This motivated us to investigate the potential for transferring biological functions across species. Among the species with iModulon structures available in iModulonDB12, Pseudomonas is well known for its versatile metabolism, which degrades and utilizes diverse compounds, including aromatics19,20,21. First, we chose to reconstruct and transfer a simple bioconversion process from Pseudomonas putida15 to E. coli in order to examine the capability of iModulons to rapidly identify genes associated with specific functions.

The VanR iModulon, which is responsible for vanillate (VA) transport and conversion into protocatechuate (PCA), was chosen for our first cross-species iModulon transfer (Fig. 1A). It comprises three genes with annotated functions, vanA, vanB, and vanK, and the predicted porin-like galP-IV (Fig. 1B) in two converging operons (Fig. 1C). Notably, the iModulon exactly matches the genes for vanillate transport and metabolism22,23. The four genes, vanA and vanB, galP, and vanK, are functionally annotated to encode the vanillate O-demethylase oxidoreductase complex, an outer-membrane porin, and a major facilitator superfamily transporter, respectively22. Although the function of the outer membrane OprD-domain-containing galP-IV has never been addressed, it is hypothesized to facilitate the diffusion of the ligand through the outer membrane23,24. Since the mechanism of VanR regulation has not been established, the four genes constituting the VanR iModulon were cloned and heterologously expressed under the control of the IPTG-inducible Trc promoter on a plasmid, pVanR_iM (Fig. 1D). When refactoring iModulons for heterologous expression, we preserved the native genetic arrangement where possible, for VanR and the subsequent iModulons, to ensure optimal expression levels of the gene members, as demonstrated elsewhere25,26.

Fig. 1: Cross-species transfer of the Pseudomonas putida VanR iModulon into E. coli.

A Vanillate transport and conversion in P. putida. OM outer membrane. CM cytoplasmic membrane. B iModulon weights of genes in P. putida. Four genes (green circles) with high weighting constitute the VanR iModulon. Gray lines indicate thresholds for determining iModulon membership. Gray circles identify genes not in the iModulon. C Graphical representation of the vanR locus on the P. putida chromosome. D The VanR iModulon was refactored into a single operon under the control of the trc promoter (PTrc), resulting in the pVanR_iM plasmid. Shades show genetic rearrangement for cloning purposes. E Vanillate (VA) conversion into protocatechuate (PCA) by E. coli carrying the empty or pVanR_iM plasmid. Gray circles, green diamonds, and orange triangles indicate cell density, VA, and PCA levels of the culture, respectively. Measurements from E. coli carrying the empty or pVanR_iM plasmid are represented by hollow or filled symbols, respectively. Data are presented as mean values ± SD. Error bars indicate the SD of three replicate cultures. Source data are provided as a Source Data file.

E. coli carrying pVanR_iM converted VA into PCA, which passively diffused into the supernatant27,28 and reached up to 15.34 mg/l during 48 h of fermentation in M9 glucose (4 g/l) medium supplemented with 100 mg/l VA, while the negative control carrying the empty plasmid did not metabolize any VA (Fig. 1E). This first cross-species iModulon transplantation illustrates the rapid identification of enzymes required for biotransformation by ICA. Furthermore, iModulon engraftment provided a rapid way to biochemically verify a predicted pathway in a heterologous host.
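To put the reported titer in perspective, the mass concentrations can be converted to molar terms. The following sketch computes the molar yield of PCA relative to the VA supplied, using the 100 mg/l VA and 15.34 mg/l PCA figures from the text and standard molar masses. Note the assumption: the yield is expressed against VA supplied rather than VA consumed, because the amount consumed is not stated here.

```python
# Standard molar masses (g/mol)
VA_MOLAR_MASS = 168.15   # vanillic acid, C8H8O4
PCA_MOLAR_MASS = 154.12  # protocatechuic acid, C7H6O4


def molar_yield(product_mg_per_l, substrate_mg_per_l,
                product_molar_mass, substrate_molar_mass):
    """Moles of product formed per mole of substrate supplied."""
    product_mm = product_mg_per_l / product_molar_mass        # mmol/l
    substrate_mm = substrate_mg_per_l / substrate_molar_mass  # mmol/l
    return product_mm / substrate_mm


# Values taken from the text: 15.34 mg/l PCA from 100 mg/l VA supplied.
pca_yield = molar_yield(15.34, 100.0, PCA_MOLAR_MASS, VA_MOLAR_MASS)
print(f"Molar yield on supplied VA: {pca_yield:.1%}")
```

Because PCA is lighter than VA, the molar yield (about 17%) is slightly higher than the simple mass ratio of 15.34%.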

Auxiliary genes may be needed for optimal function of cross-species transferred iModulons

Next, we chose to transfer an ampicillin resistance function of Pseudomonas aeruginosa to E. coli. P. aeruginosa displays beta-lactam resistance through its endogenous beta-lactamase, AmpC, and has an iModulon involved in inducible ampicillin resistance16. Activity levels of the AmpC iModulon are highly induced upon beta-lactam challenge, but not under other antibiotic treatments (Supplementary Fig. 2). In the previous iModulon engraftment examples, the genes comprising an iModulon matched the genes predicted to be necessary for building the desired function. However, identifying all the genes necessary to build a biological function may not be trivial, even with the benefit of previous characterization efforts. Many iModulons contain genes whose functions are unknown or are seemingly unrelated to the overall function being transferred.

The AmpC iModulon comprises the class C beta-lactamase encoded by the ampC gene29, which serves as the core of the function, and six less-characterized auxiliary genes, carO (PA0320), creD (PA0465), PA0466, PA0467, PA4111, and PA4112 (Fig. 2A). The seven iModulon genes are distributed across three genomic loci separated by over 4 Mb. P. aeruginosa readily becomes resistant to ampicillin by transcriptional activation of ampC30. However, it is not known whether the resistance trait is carried by this single gene. To examine whether this resistance function is transferable across species, the constituent genes were refactored into a single operon (Fig. 2B). In addition, we constructed a plasmid that contained the beta-lactamase gene alone to address any involvement of auxiliary factors in the function.

Fig. 2: E. coli carrying the Pseudomonas aeruginosa AmpC iModulon confers better ampicillin resistance than cells expressing beta-lactamase alone.

A iModulon weights of genes in P. aeruginosa. Seven genes constitute the AmpC iModulon (blue circles). Gray lines indicate thresholds for determining iModulon membership. Gray circles identify genes not in the iModulon. B Refactoring the P. aeruginosa AmpC iModulon on a bacterial artificial chromosome (BAC). Genes are expressed with the trc promoter (PTrc). Shades show genetic rearrangement for cloning purposes. C Dose-kill curves of P. aeruginosa and E. coli carrying empty BAC, BAC_ampC, or BAC_AmpC_iM. Data are presented as mean values ± SD. Error bars indicate the SD of biological replicates (n = 3). Note that the ranges of ampicillin concentration (Amp) differ because of the large difference in ampicillin tolerance. D Cell density of cultures treated with different ampicillin concentrations after 10 h of incubation. Data for P. aeruginosa and E. coli carrying empty BAC, BAC_ampC, or BAC_AmpC_iM are in orange, gray, light blue, and blue, respectively. Arrows indicate the minimum inhibitory concentration (MIC). Data are presented as mean values ± SD. Error bars indicate the SD of biological replicates (n = 3). Source data are provided as a Source Data file.

An ampicillin disc diffusion assay revealed that E. coli carrying the AmpC iModulon or the ampC gene were resistant to ampicillin, while E. coli carrying the empty plasmid were not (Supplementary Fig. 3). The source of the AmpC iModulon, P. aeruginosa, showed ampicillin resistance with a minimum inhibitory concentration (MIC) of 2048 µg/ml (Fig. 2C). The MIC of ampicillin for the laboratory E. coli strain MG1655 with the empty plasmid was 16 µg/ml, which is comparable to previous reports31,32 (Fig. 2C, D). The E. coli strain with the P. aeruginosa beta-lactamase showed a dramatic increase in ampicillin resistance, with an MIC of 1024 µg/ml, although this was lower than that of the original host (Fig. 2D). Strikingly, E. coli harboring the entire AmpC iModulon, that is, six auxiliary genes in addition to ampC, had an MIC of 4096 µg/ml, which was four times higher than that with ampC alone (Fig. 2D).

Although little is known about the molecular function of auxiliary genes, they were required to completely replicate the ampicillin resistance characteristics of P. aeruginosa. Previous reports have shown a decrease in beta-lactam resistance of the inner membrane protein creD knockout mutant of P. aeruginosa33 and growth enhancement of E. coli by endogenous creD overproduction (shares 37.4% sequence identity; BLOSUM62)34. Although the function of CreD is still elusive, reports indicate its relevance in biofilm development in P. aeruginosa35 and envelope integrity in Stenotrophomonas maltophilia36. Additionally, calcium-regulated oligonucleotide/oligosaccharide binding (OB)-fold protein CarO has been reported to be related to susceptibility to various stresses in bacteria37. Also, it shares similarity with Salmonella enterica stress-related protein VisP (38% sequence identity), which binds to peptidoglycan and inhibits the lipid A modifying enzyme LpxO38. Since lipid A is an anchor of lipopolysaccharide to the outer membrane and affects the properties of the outer membrane, expression of carO might be beneficial for cells to maintain structural integrity under cell wall deficient conditions induced by beta-lactam39.

Engrafting Pseudomonas iModulons to E. coli highlighted critical properties of iModulon gene membership. Harnessing only core genes for transferred cellular function may not be sufficient, as auxiliary genes may be needed to reconstruct an optimal function. Full iModulon gene membership helps to recreate the targeted cellular function, even without a complete understanding of the molecular function of all the genes involved.

Complete iModulon gene membership is needed for successful cross-species transfer

Building on the AmpC case, we further investigated the iModulon-based transfer of cellular traits and compared it to conventional alternative methods. The 2,3-butanediol (2,3-BDO) iModulon was chosen to examine the role of iModulon genes of unknown function. 2,3-BDO is a byproduct of bacterial fermentation processes that can be produced by a variety of microorganisms, including Pseudomonas species40,41,42. In Pseudomonas, 2,3-BDO can serve as a carbon and energy source and is degraded by enzymes in the 2,3-BDO catabolic pathway42. This catabolic pathway involves the conversion of 2,3-BDO into acetoin, which is further converted into acetaldehyde and acetyl-CoA, by butanediol dehydrogenase and acetoin dehydrogenase, respectively (Fig. 3A).

Fig. 3: Cross-species transfer of 2,3-butanediol (2,3-BDO) utilization iModulon of Pseudomonas putida in E. coli.

A A pathway responsible for 2,3-BDO utilization. B Scatter plot showing the weights of P. putida genes in the AcoR iModulon. Gray lines indicate thresholds for determining iModulon membership. Five genes constitute the AcoR iModulon (orange circles). Gray circles identify genes not in the iModulon. Black circles are three neighboring genes. C Genomic structure of the AcoR iModulon. Orange shade shows the predicted operonic structure. Genes in the iModulon are in orange. Arrows indicate three different plasmid constructs for cross-species transfer. D 2,3-BDO degradation by P. putida. The formation of acetoin was negligible. Blue and yellow boxes represent 2,3-BDO and acetoin in the culture medium. Red circles show cell density. Dots indicate individual data points. Data are presented as mean values ± SD. Error bars indicate the SD of the three biological replicates. E 2,3-BDO and acetoin degradation by E. coli carrying the empty plasmid or one of the three constructs. 2,3-BDO was added at the start of the culture, and the remaining amount and acetoin formation were measured. Blue and yellow boxes represent 2,3-BDO and acetoin in the culture medium. Red circles show cell density. Dots indicate individual data points. Data are presented as mean values ± SD. Error bars indicate the SD of the three biological replicates. Source data are provided as a Source Data file.

We transferred the 2,3-BDO iModulon of P. putida (called the AcoR iModulon15) to E. coli. The AcoR iModulon comprises acoABC (encoding the acetoin dehydrogenase complex), bdhA (encoding 2,3-BDO dehydrogenase), and the gene acoX (Fig. 3B). acoX encodes a protein of unknown function and co-occurs with acetoin-utilizing genes in various bacteria41,43. Operon prediction also suggests that the transcriptional unit contains acoX and two genes encoding hypothetical proteins (PP_0550 and PP_0551) in addition to the genes for the characterized metabolic enzymes, acoABC-bdhA (Fig. 3C)18,44.

To examine which genes are required for recreating the 2,3-BDO catabolic pathway, we built three different plasmids based on (1) the operonic structure (Op353; acoXABC-bdhA-PP_0551-PP_0550), (2) the iModulon structure (acoXABC-bdhA), and (3) the four genes encoding enzymes predicted to be sufficient for converting 2,3-BDO into acetaldehyde and acetyl-CoA based on current gene annotations (pathway; acoABC-bdhA) (Fig. 3C). The 2,3-BDO dehydrogenase activities of the source organism and of E. coli strains carrying each of the three plasmids were examined during 96 h of batch cultivation in LB medium supplemented with 2 g/l of 2,3-BDO. The original strain, P. putida KT2440, showed 2,3-BDO utilization with a negligible level of acetoin (Fig. 3D). The negative control, E. coli MG1655 carrying an empty plasmid, converted 0.77 g/l of 2,3-BDO into acetoin, possibly due to endogenous promiscuous alcohol dehydrogenase activity (Fig. 3E). In contrast, the strains carrying the plasmids based on the pathway, operonic structure, and iModulon showed higher conversion of 2,3-BDO, with amounts of 1.36, 1.75, and 1.96 g/l, respectively (Fig. 3E).
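The conversion amounts above can be restated as fractions of the 2 g/l supplied and as the acetoin that would accumulate if none were degraded further. The sketch below does this arithmetic with standard molar masses; the dictionary keys are illustrative labels, and the equimolar calculation assumes one acetoin per 2,3-BDO oxidized with no downstream consumption.

```python
BDO_MOLAR_MASS = 90.12      # g/mol, 2,3-butanediol (C4H10O2)
ACETOIN_MOLAR_MASS = 88.11  # g/mol, acetoin (C4H8O2)
SUPPLIED_BDO = 2.0          # g/l, from the text

# 2,3-BDO converted by each E. coli strain (g/l), values from the text.
consumed = {
    "empty plasmid": 0.77,
    "pathway": 1.36,
    "operon": 1.75,
    "iModulon": 1.96,
}


def fraction_consumed(consumed_g_per_l, supplied_g_per_l=SUPPLIED_BDO):
    """Fraction of the supplied 2,3-BDO that was converted."""
    return consumed_g_per_l / supplied_g_per_l


def equimolar_acetoin(consumed_bdo_g_per_l):
    """Acetoin (g/l) expected if every consumed 2,3-BDO ends up as acetoin."""
    return consumed_bdo_g_per_l / BDO_MOLAR_MASS * ACETOIN_MOLAR_MASS


fractions = {name: fraction_consumed(g) for name, g in consumed.items()}
acetoin_if_not_degraded = equimolar_acetoin(consumed["empty plasmid"])

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of supplied 2,3-BDO converted")
print(f"Equimolar acetoin from 0.77 g/l 2,3-BDO: {acetoin_if_not_degraded:.2f} g/l")
```

Comparing the measured acetoin with this equimolar expectation is one simple way to judge whether a strain has any acetoin dehydrogenase activity.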

Interestingly, the strains showed varying levels of acetoin dehydrogenase activity. First, all the 2,3-BDO consumed by the negative control resulted in a roughly equimolar amount of acetoin, which is not surprising since no acetoin dehydrogenase was introduced. The strain carrying the functional gene annotation-based pathway plasmid did not further convert acetoin into downstream products, even though it contained genes encoding the acetoin dehydrogenase complex. Second, strains with the full operon or AcoR iModulon not only consumed more than 1.7 g/l of 2,3-BDO, but also left only a small amount of acetoin in the medium, indicating conversion of acetoin by acetoin dehydrogenase. The difference between the annotation-based and iModulon-based plasmids is the presence of acoX (Fig. 3C), a gene encoding a predicted small molecule kinase that has been reported to have no acetoin, NAD, or pyruvate kinase activity

HPLC

Culture supernatant was collected by filtering 200 µl of the crude culture through a 96-Well PVDF Filtration Plate (0.2 µm; Agilent, 203980-100). Exo-metabolites were analyzed from 10 µl of the sample using a 1260 Infinity II HPLC System (Agilent) equipped with a Multisampler (G7167A), HIP Degasser (G4225A), Binary Pump (G1312C), Refractive Index Detector (G1362A), Thermostatted Column Compartment (G1316A), and Aminex HPX-87H HPLC Column (300 × 7.8 mm; BioRad, 1250140). Five millimolar sulfuric acid was used as the mobile phase, and the detector temperature was maintained at 30 °C. Vanillate, protocatechuate, acetoin, and 2,3-butanediol were detected at a column temperature of 65 °C with a mobile phase flow rate of 0.6 ml/min. For detection of malonate, the column temperature was maintained at 45 °C.

Quantitative PCR

To measure the plasmid-to-chromosome ratio and the expression level of the mdcR iModulon construct, cells were grown in 15 ml M9 malonate medium and sampled at the mid-log phase. Total DNA was extracted from 1 ml of the culture using a Quick-DNA Miniprep Kit (Zymo Research, D3024) as instructed by the manufacturer. 100 ng of DNA extract was subjected to quantitative PCR in a 20 µl reaction containing AccuPower PCR 2× Master Mix (Bioneer, K-2018), 10 µM of each primer, and SYBR Green I Nucleic Acid Gel Stain (Invitrogen, S7563). Fluorescence signals were monitored by the CFX Duet Real-Time PCR System (BioRad, 12016265). The amount of the beta-lactamase (bla) gene on the plasmid was compared to that of the alaA gene on the chromosome to estimate the number of plasmid copies relative to the chromosome. PCR efficiencies were measured from three twofold dilution series of each gene. The plasmid-to-chromosome ratio (P/C ratio) is calculated by

$$P/C\ \mathrm{ratio}=\frac{E_{alaA}^{Cq_{alaA}}}{E_{bla}^{Cq_{bla}}}$$
(1)

where \(E_{alaA}\) and \(E_{bla}\) are the PCR efficiencies of the alaA and bla genes, respectively, and \(Cq_{alaA}\) and \(Cq_{bla}\) are the quantification cycles of the respective genes.
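Equation (1) and the efficiency measurement can be sketched in a few lines. The example below assumes that the PCR efficiency E is expressed as the per-cycle amplification factor (2 for perfect doubling) and that it is estimated from the slope of Cq against log10 of relative template amount as E = 10^(-1/slope), which is the common standard-curve approach; the text does not spell out the fitting procedure. The function names and all numeric inputs are hypothetical illustrations, not values from the study.

```python
import math


def amplification_factor(relative_amounts, cq_values):
    """Estimate per-cycle amplification factor E from a dilution series.

    Fits Cq = slope * log10(amount) + intercept by least squares and
    returns E = 10 ** (-1 / slope); E == 2 means perfect doubling.
    """
    xs = [math.log10(a) for a in relative_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return 10 ** (-1 / slope)


def pc_ratio(e_alaA, cq_alaA, e_bla, cq_bla):
    """Plasmid-to-chromosome ratio following Eq. (1)."""
    return (e_alaA ** cq_alaA) / (e_bla ** cq_bla)


# Hypothetical twofold dilution series with ideal amplification.
amounts = [1.0, 0.5, 0.25, 0.125]
cqs = [15.0, 16.0, 17.0, 18.0]
efficiency = amplification_factor(amounts, cqs)

# Hypothetical Cq values: the plasmid gene crosses threshold 4 cycles earlier.
ratio = pc_ratio(e_alaA=2.0, cq_alaA=20.0, e_bla=2.0, cq_bla=16.0)
print(f"E = {efficiency:.3f}, P/C ratio = {ratio:.1f}")
```

With equal efficiencies of 2, the ratio reduces to 2 raised to the Cq difference, so a plasmid gene detected four cycles earlier than the chromosomal gene corresponds to 16 copies per chromosome.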

To estimate the expression level of the MdcR iModulon, the expression level of mdcA relative to 16S rRNA was measured. RNA was extracted from 14 ml of the culture using the Quick-RNA Fungal/Bacterial Miniprep Kit (Zymo Research, R2014) as instructed by the manufacturer. Residual DNA was removed by incubating 2 µg of RNA extract at 37 °C for 30 min in a 50 µl reaction containing 2 U of RNase-free DNase I (New England Biolabs, M0303), followed by purification using the RNA Clean & Concentrator Kit (Zymo Research, R1013). cDNA was synthesized from 300 ng of the DNA-depleted RNA sample using the SuperScript II First-Strand Synthesis System (Invitrogen, 11904018) as instructed by the manufacturer. About 1 µl of the cDNA synthesis reaction was subjected to quantitative PCR in a 20 µl reaction containing AccuPower PCR 2× Master Mix, 10 µM of each primer, and SYBR Green I Nucleic Acid Gel Stain. The relative amount of mdcA transcript was quantified by the ΔΔCq method using the rrsA transcript as a reference, after adjusting for PCR efficiencies that were measured from three twofold dilution series of each gene. All the primers were designed by Primer-BLAST58 to have no predicted cross-reactivity. Primer sequences are summarized in Supplementary Table 2.
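An efficiency-adjusted ΔΔCq calculation can be sketched as follows. This is a minimal illustration assuming the widely used efficiency-corrected form (often attributed to Pfaffl), in which each gene's Cq difference between a calibrator and a sample is raised to that gene's own amplification factor; the text does not specify the exact correction used, and the calibrator/sample labels and all Cq values here are hypothetical.

```python
def relative_expression(e_target, cq_target_calibrator, cq_target_sample,
                        e_ref, cq_ref_calibrator, cq_ref_sample):
    """Efficiency-corrected expression of a target gene relative to a reference.

    e_target and e_ref are per-cycle amplification factors (2 = perfect doubling).
    Returns the fold change of the sample versus the calibrator.
    """
    target_term = e_target ** (cq_target_calibrator - cq_target_sample)
    ref_term = e_ref ** (cq_ref_calibrator - cq_ref_sample)
    return target_term / ref_term


# Hypothetical Cq values: target (e.g. mdcA) appears 3 cycles earlier in the
# sample, while the reference (e.g. rrsA) is unchanged.
fold_change = relative_expression(
    e_target=2.0, cq_target_calibrator=25.0, cq_target_sample=22.0,
    e_ref=2.0, cq_ref_calibrator=10.0, cq_ref_sample=10.0,
)
print(f"Fold change: {fold_change:.1f}")
```

When both amplification factors equal 2, this expression reduces to the familiar 2^(-ΔΔCq).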

Antibiotics sensitivity assay

For the disc diffusion assay, overnight-grown E. coli culture was diluted to 5 × 10⁸ cells/ml and 500 µl of the diluted culture was spread on an LB-agar plate. Thirty microliters of ampicillin solutions with different concentrations were dropped on sterilized filter paper disks (9 mm diameter; Sigma-Aldrich, 1703932). After drying for 15 min, the disks were placed on the LB-agar plate and the plate was incubated at 37 °C overnight. For the dose-killing assay, overnight-grown E. coli culture was diluted to 5 × 10⁸ cells/ml in LB medium containing an appropriate concentration of ampicillin, and 100 µl of the diluted culture was transferred to a 96-well microplate. Cells were incubated at 37 °C with agitation in an Infinite 200 Pro microplate reader (Tecan) and A600 was monitored every 15 min for up to 10 h.
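Reading an MIC from endpoint cell densities can be sketched as below. The example assumes that the MIC is taken as the lowest tested ampicillin concentration at which the endpoint A600 stays below a growth cutoff, with all higher concentrations also below it. The cutoff of 0.1 and the density values are hypothetical; only the MIC figures used in the fold-change line (4096 and 1024 µg/ml) come from the text.

```python
def minimum_inhibitory_concentration(endpoint_density, growth_cutoff=0.1):
    """Return the lowest concentration from which all higher doses block growth.

    endpoint_density maps concentration (µg/ml) to endpoint A600.
    Returns None if even the highest tested concentration permits growth.
    """
    mic = None
    # Walk down from the highest dose until growth is first observed.
    for conc in sorted(endpoint_density, reverse=True):
        if endpoint_density[conc] < growth_cutoff:
            mic = conc
        else:
            break
    return mic


# Hypothetical endpoint A600 readings after 10 h (NOT data from the study).
example_density = {2: 0.90, 4: 0.85, 8: 0.60, 16: 0.04, 32: 0.03}
mic = minimum_inhibitory_concentration(example_density)

# Fold difference between the MICs reported for the full iModulon and ampC alone.
fold_difference = 4096 / 1024
print(mic, fold_difference)
```

Scanning from the highest dose downward avoids reporting a spuriously low MIC when a single low-dose well fails to grow.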

Adaptive laboratory evolution

E. coli K-12 MG1655 carrying pMdcR_iM plasmid was evolved via serial propagation of 150 µl into 15 ml M9 malonate (2 g/l) minimal medium containing 100 µg/ml carbenicillin in three biologically replicated cultures. Cultures were incubated at 37 °C, aerated by magnetic stirring, and reinoculated at A600 of 0.3 (Tecan Sunrise microplate reader; equivalent to an A600 of 0.750 on a conventional spectrophotometer with a path length of 10 mm) using an automated system. After 40 days of propagation, evolved cultures were subjected to a malonate utilization assay, and three clones were isolated from each culture.
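The passage scheme implies a fixed number of generations per transfer, which can be estimated from the dilution. The sketch below assumes that 150 µl is added to 15 ml of fresh medium (a 101-fold dilution) and that each culture regrows to the same density before the next transfer, so the generations per passage equal log2 of the dilution factor. The number of passages completed in 40 days is not stated in the text, so `n_passages` is a hypothetical parameter.

```python
import math


def generations_per_passage(inoculum_ul, fresh_medium_ul):
    """Doublings needed to regrow to the pre-transfer density after dilution."""
    dilution_factor = (inoculum_ul + fresh_medium_ul) / inoculum_ul
    return math.log2(dilution_factor)


def cumulative_generations(n_passages, inoculum_ul=150, fresh_medium_ul=15_000):
    """Total generations after a given number of passages (n_passages is assumed)."""
    return n_passages * generations_per_passage(inoculum_ul, fresh_medium_ul)


per_passage = generations_per_passage(150, 15_000)
print(f"Generations per passage: {per_passage:.2f}")
print(f"After 10 hypothetical passages: {cumulative_generations(10):.1f}")
```

Each passage therefore contributes roughly 6.7 generations, regardless of how long the culture takes to reach the transfer density.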

DNA sequencing and analysis

Genomic DNA was isolated using the Quick-DNA Miniprep Kit as instructed by the manufacturer. Whole-genome DNA-seq libraries were generated with a NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, E7645) and run on an Illumina NovaSeq X Plus with a 100-cycle paired-end recipe. The sequencing results were processed with the in-house pipeline59 that incorporates the BreSeq pipeline to identify mutations60. The E. coli K-12 MG1655 genome sequence (National Center for Biotechnology Information accession no. NC_000913.3) was used as the reference sequence.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.