Introduction

MicroRNAs (miRNAs) are short, highly conserved RNAs that repress the expression of their target genes [1]. They directly or indirectly regulate many cellular processes, such as the cell cycle, apoptosis, senescence, aging, and migration [2, 3], and play a role in both physiology and pathology. In cancer, miRNAs can act as tumor suppressors or oncomiRs [4,5,6], and have been linked with all hallmarks of cancer [6,7,8,9,10,11]. Epithelial-to-mesenchymal transition (EMT) is an embryonic development program that many cancers hijack to increase their migratory and invasive capacities [12]. EMT is enforced by specialized transcription factor families such as SNAIL, basic helix-loop-helix (bHLH), TWIST, and ZEB [13, 14], which can act as E-Cadherin repressors and promote the expression of EMT effector genes. MiRNAs can target EMT-inducing transcription factors [13, 15]. For example, miR-200 inhibits the transcription factors of the ZEB family in a reciprocal feedback loop [16], and miR-34 can target SNAIL [17]. The equilibrium between EMT-suppressing miRNAs and key EMT transcription factors determines epithelial cellular plasticity (see Supplementary Fig. 1A).

MiRNAs are usually investigated functionally by overexpression or knockdown techniques, and the downstream effects are measured by biochemical identification of target genes and by analysis of cellular phenotypes. However, high-throughput functional assessment indicated that the majority of miRNAs expressed in cells do not show detectable targeting activity [18], suggesting the need for robust miRNA functional reporters to better investigate their physiological role. MiRNA sensors have recently been developed [18,19,20,21] and, in limited cases, used to isolate cells with distinct biological properties [22]. However, the systems proposed so far were based on artificial 3′UTRs containing multiple repeats of perfectly complementary sequences or miRNA binding sites, rather than naturally occurring 3′UTRs, and generally lacked essential non-binding mutant controls [23].

To develop a robust and controlled plasmid-based fluorescent miRNA sensor, we cloned a 3′UTR fragment of the miR-200 target ZEB2 [14], containing three strong miR-200 seed matches, and fused it to the DsRed (R) fluorescent protein driven by a CMV promoter. As a non-targeting control, in the same plasmid we fused GFP (G) to the ZEB2 3′UTR carrying mutated seed matches, driven by an identical CMV promoter (the RwtGmut vector). Once expressed in living cells, this plasmid is designed to respond to miR-200 levels by altering the intensity of the red fluorescence, while green fluorescence serves as a transfection control. A double-mutant non-binding control vector (RmutGmut) was also generated (Fig. 1A). We previously showed that the RwtGmut sensor is highly specific for the miR-200b/c/429 cluster [24]. In this study, we report that EMT biology can be explored in living cells by transient transfection of the sensor, and that this approach could be extended to virtually all biologically relevant miRNAs in different cellular models, with several important applications for miRNA research.
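The wild-type versus mutant design described above can be illustrated with a minimal Python sketch. It is not the cloning strategy used in this study: it assumes the commonly cited 7-mer seed of miR-200b/c (AAUACUG), uses an invented toy sequence in place of the real ZEB2 3′UTR fragment, and applies an arbitrary three-base mutation scheme. All function and variable names are hypothetical.

```python
# Toy illustration of wild-type vs. mutated seed matches in a reporter 3'UTR.
# ASSUMPTIONS (not from the paper): the 7-mer seed AAUACUG for miR-200b/c,
# an invented toy UTR, and an arbitrary three-base mutation scheme.

SEED_RNA = "AAUACUG"  # miRNA seed, written 5'->3'


def seed_match_site(seed_rna):
    """Return the DNA-sense target site: the reverse complement of the seed."""
    complement = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seed_rna))


def find_sites(utr, site):
    """Return the 0-based start position of every occurrence of site in utr."""
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


def mutate_sites(utr, site):
    """Disrupt each seed match by complementing its three central bases."""
    swap = str.maketrans("ACGT", "TGCA")
    mutated = site[:2] + site[2:5].translate(swap) + site[5:]
    return utr.replace(site, mutated)


SITE = seed_match_site(SEED_RNA)

# Hypothetical "wild-type" UTR carrying three seed matches (toy sequence).
TOY_UTR = "GGAT" + SITE + "CCTTGA" + SITE + "TTAGC" + SITE + "AAG"

wt_positions = find_sites(TOY_UTR, SITE)    # sites seen by the wt reporter
mut_utr = mutate_sites(TOY_UTR, SITE)       # sequence for the mut reporter
mut_positions = find_sites(mut_utr, SITE)   # should be empty
```

In this toy setting the wild-type sequence carries three matches and the mutated one carries none, which mirrors the role of the GFP arm as a non-binding reference for the DsRed arm.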

Fig. 1: The miRNA sensor plasmid can be transiently transfected in cells to detect miR-200b/c levels and distinguish differential EMT states.

A Schematic representation of the miR-200b/c sensor plasmids RmutGmut and RwtGmut. GFP Green Fluorescent Protein, DsRed Discosoma Red Fluorescent Protein, wt wild-type, mut mutated, ZEB2 3′UTR 3′ untranslated region of the ZEB2 gene. B Schematic representation of the separation of miR-200b/c sensor-transfected cells in a FACS plot. The X axis represents green fluorescence intensity, and the Y axis represents red fluorescence intensity. Red dots indicate miR-200b/c low cells, yellow dots indicate miR-200b/c high cells, and green dots indicate cells with very high miR-200b/c levels (for instance from an exogenous source). C FACS plots showing the fluorescence intensity of HCT116 cells in the FITC-A (Green) and PE-A (Red) channels after transfection with the sensor plasmid in the presence of either pre-control or pre-miR-200c at 100 nM concentration. The percentage of gated cells over the total number of cells in the experiment, including untransfected cells, is indicated. D FACS plots showing transfection of PANC-1 cells with RmutGmut or RwtGmut plasmids and bar graphs showing the percent of inhibition. E FACS plots showing transfection of SKOV3 cells with RmutGmut or RwtGmut plasmids and bar graphs showing the percent of inhibition. F Western blot quantification of ZEB1 and E-Cadherin protein expression in HCT116 cells with miR-200c knockout (MIR200C-KO), compared to parental cells. β-Actin was used as a loading control. G FACS plots showing the transfection of sensor plasmids in MIR200C-KO HCT116 cells or in parental control cells. H Bar graphs showing the percent of inhibition from the FACS analysis in (G). In D, E, p values are from Student’s t test. In H, p values are from two-way ANOVA. Points are average ± SD. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

Results

The sensor plasmid can detect endogenous miR-200b/c activity

The dual-fluorescence sensor was overexpressed in cells by transient transfection, and its ability to report endogenous miR-200b/c levels was monitored by FACS. The scheme in Fig. 1B illustrates the different cell populations, based on their miR-200b/c levels, as visualized in a FACS plot following sensor transfection. As a cellular model, we chose the colorectal cancer (CRC) cell line HCT116, which expresses high levels of miR-200s [14] and has an epithelial-like phenotype [25]. HCT116 cells were co-transfected with the RmutGmut control plasmid and either miR-200c mimics (pre-miR-200c) or a scrambled control (pre-control), and the green and red fluorescence intensities were recorded on a flow cytometer. We found that the double-mutant plasmid allowed a robust and coordinated expression of the fluorescent transgenes (cells along a diagonal line in the FACS plot), an important prerequisite for detecting changes in signal intensity due to the binding of endogenous miRNAs. Overexpression of pre-miR-200c produced no detectable alteration of fluorescence in these cells (Supplementary Fig. 1B), indicating the lack of binding activity. By contrast, using the signal from the non-binding RmutGmut plasmid to set the FACS gates, we observed the appearance of a population with reduced red intensity in cells transfected with the RwtGmut sensor (with scrambled control), Fig. 1C. This result suggests that, when expressed in cells at detectable levels, the miR-200b/c sensor can report significant binding activity in a sub-population of cells, potentially those carrying the highest endogenous miR-200b/c levels. The percentage of cells in the gated area for high miR-200b/c further increased upon miR-200c overexpression (Fig. 1C), supporting the notion that the sensor can be used to monitor cells with high miR-200b/c from endogenous and exogenous sources. Similar results were obtained with another miR-200b/c-positive cell line, the human bladder carcinoma RT112 (Supplementary Fig. 1C), and endogenous levels were also visualized in other cancer cell lines, such as PANC-1 and SKOV3 (Fig. 1D, E).
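The gating logic described above, in which the non-binding RmutGmut control defines the gate and the fraction of RwtGmut cells with reduced red signal is then counted, can be sketched as follows. This is only one plausible reading of the "percent of inhibition" readout, which is not formally defined in this section. The simulated intensities, the 1% control quantile used as a gate, and all function names are assumptions made for illustration; they do not reproduce the FlowJo gates used in the study.

```python
import math
import random

# Sketch of a control-derived gate on the red/green ratio.
# ASSUMPTIONS: simulated data, a gate at the 1% quantile of the control,
# and a fivefold repression in a 30% sub-population. None is from the paper.


def log_ratio(green, red):
    """Per-cell log10(red/green); close to 0 along the plot diagonal."""
    return [math.log10(r / g) for g, r in zip(green, red)]


def gate_threshold(control_ratios, quantile=0.01):
    """Log-ratio below which only `quantile` of control cells fall."""
    ordered = sorted(control_ratios)
    return ordered[int(quantile * (len(ordered) - 1))]


def percent_inhibited(ratios, threshold):
    """Percentage of cells whose red signal falls below the gate."""
    return 100.0 * sum(r < threshold for r in ratios) / len(ratios)


def simulate(n, inhibited_fraction, fold_repression, rng):
    """Simulate coordinated green/red expression with optional repression."""
    green, red = [], []
    for _ in range(n):
        g = 10 ** rng.gauss(4.0, 0.4)        # transfection-level variation
        r = g * 10 ** rng.gauss(0.0, 0.1)    # red tracks green (diagonal)
        if rng.random() < inhibited_fraction:
            r /= fold_repression             # miRNA-mediated repression
        green.append(g)
        red.append(r)
    return green, red


rng = random.Random(0)
ctrl_ratios = log_ratio(*simulate(5000, 0.0, 5.0, rng))    # RmutGmut-like
sensor_ratios = log_ratio(*simulate(5000, 0.3, 5.0, rng))  # RwtGmut-like

threshold = gate_threshold(ctrl_ratios)
pct_ctrl = percent_inhibited(ctrl_ratios, threshold)
pct_sensor = percent_inhibited(sensor_ratios, threshold)
```

Normalizing red to green per cell is what makes the comparison independent of transfection efficiency, which is the stated purpose of carrying the mutant GFP arm on the same plasmid.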

The sensor reports miR-200 activity in cells with differential EMT states

The miR-200 family has a strong EMT-repressing role and is normally reduced in mesenchymal-like cells [15]. We sought to determine the activity of the sensor in cells with altered EMT status using an isogenic cellular system with high and low miR-200. For this, we tested mouse pancreatic cancer cell lines derived from the Pdx1-cre;KrasLSL.G12D/+;Tp53LSL.R172H/+ mouse model (KPC) and from its Zeb1-knockout derivative, called KPCZ, which was generated to show a role of Zeb1 in the metastatic cascade of pancreatic tumors [26]. The mature sequence of miR-200b/c is highly conserved between humans and mice [27]. Cells obtained from KPCZ tumors showed a more marked epithelial-like phenotype than KPC cells (Supplementary Fig. 2A, B), with increased expression of the miR-200 members b and c (Supplementary Fig. 2C). Upon transfection with the miRNA sensors, KPCZ cells showed a higher inhibition rate (Supplementary Fig. 2D, E), in line with the data obtained in the pre-miR transfection experiment.

To further test the specificity of the detection, we used a CRISPR/Cas9-based approach to permanently knock out miR-200c, the most prominent EMT-suppressing member of the miRNA family [28], taking advantage of a protospacer adjacent motif (PAM) sequence adjacent to the miR-200c seed match (Supplementary Fig. 3A). As a result, MIR200C-KO HCT116 cells showed a drastic reduction of miR-200c levels, as evaluated by qPCR (Supplementary Fig. 3B). In light of the high sequence homology between miR-200b and miR-200c, we also verified the effect of the KO on the MIR200B gene, and a significant >2-fold reduction was also detected (Supplementary Fig. 3C). DNA sequencing of single-cell clones obtained from MIR200C-KO cells showed alterations around the PAM in the miR-200c region, but not in the miR-200b region (Supplementary Fig. 3D), suggesting that the observed reduction in miR-200b is likely an indirect effect of EMT induction, possibly mediated by an increase in ZEB1. Indeed, phenotypic analysis of the MIR200C-KO cells showed that they had undergone EMT, as indicated by a pronounced mesenchymal-like morphology and dispersed growth pattern, and confirmed by western blotting and immunofluorescence staining (Fig. 1F and Supplementary Fig. 4A). To check that EMT was driven by the miR-200c knockout and not by an unwanted off-target effect [29], we reconstituted the cells with exogenous miR-200c by transient transfection and observed an attenuation of the phenotype (Supplementary Fig. 4B). Once transfected with the miR-200b/c sensors, the MIR200C-KO cells displayed a significant reduction in the percentage of cells with inhibition of red fluorescence (Fig. 1G, H), further supporting the high specificity of the miR-200b/c sensor system. Overall, these data indicate that the sensor can monitor changes in miR-200b/c levels in living cells with distinct EMT states.

Utility of the sensor to sort cells by endogenous miR-200b/c levels/EMT status

We then tested whether the sensor could be used for FACS-sorting cells with a differential EMT status based on endogenous high and low miR-200b/c expression. HCT116 cells were transfected with the RwtGmut and RmutGmut plasmids on a larger scale, and RwtGmut-transfected cells were separated into non-inhibited (miR-200b/c low) and red-inhibited (miR-200b/c high) populations (Fig. 2A). Gated areas were selected to collect cells with the same intensity of green fluorescence (the mutant control) and a significant difference in red. After sorting, cells were re-plated and allowed to attach and grow for 3 days, to recover, to increase in number, and to exclude dead cells from further characterization. To verify that the miR-200b/c sensor itself did not alter the EMT phenotype, a western blot was conducted to monitor the expression of the epithelial marker E-Cadherin; it identified no change in E-Cadherin abundance in RwtGmut- compared to RmutGmut-transfected cells (unsorted). Additional experiments on parental unsorted HCT116, RT112, and PANC-1 cells were performed to further rule out the possibility that the sensor was sequestering miR-200b/c and favoring EMT upon transfection (Supplementary Fig. 5A, B). Analysis of sorted cells showed that E-Cadherin levels were also unaltered in RmutGmut compared to parental cells, while a significant reduction was found in miR-200b/c low cells and a relatively marked increase was detected in miR-200b/c high cells obtained from RwtGmut transfection (Fig. 2B). In a separate experiment, qPCR analysis confirmed the differential expression of miR-200c and EMT markers in miR-200b/c high- and low-sorted cells (Fig. 2C). Similar results were obtained by sorting sensor-transfected RT112 cells (Supplementary Fig. 5C, D). In addition, we found that sorted high and low cells had a strong propensity to quickly revert their fluorescence after re-plating, as quantified by FACS analysis and video imaging (Supplementary Fig. 5E, F), indicating that miRNA activity can be monitored for a few days in vitro in individual sensor-transfected cells. Interestingly, although a differential EMT status could not be seen by microscopic morphological examination of sorted cells, miR-200b/c low cells showed a significantly lower ability to attach after sorting than high cells. Quantification confirmed that they were more rounded and smaller when re-plated, indicating a lower adhesion capacity, before normalizing in the following days (Supplementary Fig. 5G, H). Altogether, this was the first indication that the sensor-guided FACS-sorting strategy was capable of physically separating cells with distinct miR-200b/c levels and EMT properties.

Fig. 2: The sensor can be used to sort cells by endogenous miR-200b/c levels.

A FACS plots showing the gates used for sensor-sorting miR-200b/c low and high cells after transfection with the RwtGmut plasmid (right panel). The left panel shows RmutGmut-transfected control cells and their gate for the sorting control. B Western blot quantification of E-Cadherin in HCT116 cells transfected with either RmutGmut or RwtGmut plasmids (unsorted, left panel), or sorted with the gates shown in (A) and re-plated for 3 days to grow and remove dead cells (right panel). C qPCR quantification of relative mRNA levels of CDH1 (gene coding for E-Cadherin), VIM (Vimentin), ZEB1 and miR-200c in miR-200b/c low or high HCT116 cells sorted as in (A). D, E Gene set enrichment analysis of RNA-sequencing data depicting pathways upregulated (D) and downregulated (E) in sensor-sorted miR-200b/c low cells compared to miR-200b/c high cells. F qPCR validation of RNA-sequencing data. CDC25B, CDCA3, CCNF, SPC25, TNFAIP3, SERPINE1, LTBP2, ZEB1 and ZEB2 were quantified using GAPDH as a housekeeping gene. p values are from Student’s t test. Points are average ± SD. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

miR-200b/c sensor-sorted cells can be used to molecularly characterize endogenous EMT states

EMT is a complex multifactorial process. Although a few markers such as E-Cadherin, Vimentin and ZEB1 can serve as a good surrogate for determining the EMT state [14, 30], this is better concluded from broader, and possibly genome-wide, characterizations [31]. We therefore repeated the sorting experiment twice independently (with duplicates) and subjected all eight RNA samples isolated from high and low cells to sequencing. Using a twofold cut-off to define differentially expressed genes, we identified 224 upregulated and 73 downregulated genes in miR-200b/c low compared to high cells, as the overlap between the two distinct sorting experiments (see Supplementary Fig. 6A and Supplementary Table 1A, B). Gene set enrichment analysis revealed that the signatures most significantly upregulated in miR-200b/c low cells belonged to TNF-alpha signaling via NF-κB, EMT, and TGF-beta signaling (Fig. 2D). On the other hand, analysis of the genes downregulated in miR-200b/c low cells indicated a significant prevalence of cell-cycle-related targets, such as those linked with the E2F transcription factors and genes involved in G2/M progression through the cell division cycle (Fig. 2E). Of note, ZEB1 and ZEB2 were identified among the most differentially expressed genes (Supplementary Fig. 6B). This is particularly important as a further validation of the assay, since the sensor plasmid is designed as a functional readout of miR-200b/c binding to the ZEB transcription factors. Moreover, clustering analysis of miR-sensor low and high cells together with parental HCT116 cells indicated that HCT116 cells resembled and clustered with miR-200b/c high cells, considering both differentially expressed genes (Supplementary Fig. 6C) and known miR-200b/c target genes (Supplementary Fig. 6D-F), in line with the predominantly epithelial-like nature of HCT116 cells. Quantitative PCR was used to independently validate the RNA-sequencing results and confirmed the down- or upregulation of relevant genes in the identified pathways, such as CDCA3 among the E2F target genes and TNFAIP3 for TNF-alpha signaling, along with ZEB1 and ZEB2 (Fig. 2F).
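The selection rule described above (a twofold cut-off applied within each sorting experiment, followed by taking the overlap) can be sketched in a few lines of Python. The gene names are taken from the text, but the expression values are invented, and the pseudocount, the use of one averaged profile per condition, and the function names are assumptions; the actual RNA-sequencing pipeline is described in the Supplements and may differ.

```python
import math

# Sketch: twofold cut-off per experiment, then overlap across experiments.
# ASSUMPTIONS: invented expression values, a pseudocount of 1, and one
# averaged profile per condition. This is not the pipeline used in the study.


def log2_fold_changes(low, high, pseudocount=1.0):
    """log2(low/high) for every gene measured in both conditions."""
    shared = low.keys() & high.keys()
    return {
        gene: math.log2((low[gene] + pseudocount) / (high[gene] + pseudocount))
        for gene in shared
    }


def call_de(lfc, fold=2.0):
    """Return (up, down) gene sets passing the fold-change cut-off."""
    cut = math.log2(fold)
    up = {gene for gene, value in lfc.items() if value >= cut}
    down = {gene for gene, value in lfc.items() if value <= -cut}
    return up, down


# Toy profiles for miR-200b/c low and high cells in two sorting experiments.
exp1_low = {"ZEB1": 300, "CDCA3": 50, "TNFAIP3": 250, "GAPDH": 1000}
exp1_high = {"ZEB1": 100, "CDCA3": 200, "TNFAIP3": 100, "GAPDH": 1000}
exp2_low = {"ZEB1": 400, "CDCA3": 40, "TNFAIP3": 150, "GAPDH": 900}
exp2_high = {"ZEB1": 100, "CDCA3": 160, "TNFAIP3": 100, "GAPDH": 1000}

up1, down1 = call_de(log2_fold_changes(exp1_low, exp1_high))
up2, down2 = call_de(log2_fold_changes(exp2_low, exp2_high))

# Only genes passing the cut-off in BOTH experiments are retained.
up_overlap = up1 & up2
down_overlap = down1 & down2
```

In this toy example TNFAIP3 passes the cut-off in only one experiment and is therefore dropped, which shows how requiring the overlap trades sensitivity for reproducibility.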

We next sought to functionally validate the role in EMT of the pathways associated with the gene signatures obtained from sensor-sorted cells. TGF-beta is an established master EMT inducer [32]. TNF-alpha has also been reported as an EMT-inducing cytokine in different cancer cell lines, including cells from colorectal origin, with and without the NF-κB-mediated upregulation of EMT transcription factors [24, 57], here confirmed by the analysis of endogenous levels in different cellular models and by overexpression and knockdown approaches. However, the sensitivity could be further improved, allowing the assay to be extended to cells with lower endogenous miRNA expression. Sensors designed with artificial 3′UTRs containing multiple fragments perfectly complementary to the targeting miRNA can guarantee a higher level of sensitivity, as previously shown [58]. However, the sensor presented here has the advantage of carrying naturally occurring sequences properly controlled with non-binding mutants, in analogy with the dual reporter vectors used to validate bona fide miRNA targets. This method is preferable for better representing physiological conditions [58] by (1) ensuring that the sensors reliably report endogenous miRNA activity, i.e., do not overestimate its effects, and (2) minimizing or eliminating miRNA-sponge or decoy effects [18, 59], which could alter the biological properties of the cells. Another underappreciated factor is that non-natural 3′UTRs (containing either repeated seed matches or complementary sequences) carrying random spacer sequences increase the chance of introducing novel unwanted, albeit specific, miRNA binding sites. In the absence of non-binding controls, the contribution of these off-target detections can be difficult to estimate. To improve sensitivity, therefore, multiple 3′UTRs controlled with non-binding mutants in a 1:1 ratio could in future be cloned in tandem and tested. Another limiting factor is that this sensor can only be transiently introduced into cells, which limits the number of cell lines that can be investigated and the number of output cells per sorting round, an important consideration for downstream techniques requiring larger numbers of cells. MiRNA-reporter cells with integrated sensors would allow better live tracking of miRNA activity in experimental settings in vitro and in vivo. However, integration of this vector into the genome produced recombination events, possibly due to the highly repetitive sequences [24], which precluded its further use. To overcome this limitation, future studies should improve the sensor architecture, for example by using single bidirectional promoter vectors [60] to minimize the presence of repetitive sequences. Once improved, this approach could be implemented for all biologically relevant miRNAs, leading to fundamental discoveries in biomedicine.

Methods

Cell culture and chemicals

HCT116 cells were cultured in McCoy’s 5A (Lonza); RT112 and COLO205 cells were cultured in RPMI1640 (Sigma); PANC-1 and HEK 293 cells (all from ATCC) were cultured in DMEM (Sigma); and SKOV3 cells (NCI) were cultured in RPMI1640 supplemented with 1 mM sodium pyruvate (Gibco), 1% MEM NEAA (Gibco), and 1% MEM Vitamin solution (Gibco). Media were supplemented with 10% FBS (Sigma), 1% penicillin/streptomycin (Sigma) and 1% L-glutamine (Sigma). Cells were cultured at 37 °C and 5% CO2 in a humidified incubator. Cells were STR-authenticated and regularly tested for mycoplasma (Invivogen). The HCT116-Cas9 cell line was generated by lentiviral transduction with lentiCas9-Blast (Addgene) and selection with 4 μg/mL of blasticidin (Sigma) for 3 days. KPC (Pdx1-cre;KrasLSL.G12D/+;Tp53LSL.R172H/+) and KPCZ (Pdx1-cre;KrasLSL.G12D/+;Tp53LSL.R172H/+; ZEBfl/fl) are mouse pancreatic cancer cells without and with ZEB1 knockout, respectively [26], and were cultured in DMEM. TNF-alpha was from Gibco, and the Cdk inhibitor (CGP-60474) was from Tocris.

FACS analysis

HCT116 cells were seeded in six-well plates at a density of 0.5 million cells per well. The next day, cells were transfected with 1.5 µg of either the RmutGmut or the RwtGmut plasmid. After 48 h, cells were trypsinized, washed, and resuspended in FACS buffer (5 mM EDTA and 2% FBS in PBS). Samples were run on a Cytoflex FACS machine (Beckman). FACS data were analyzed using FlowJo software v10.6.

CRISPR screen

HCT116-Cas9 cells were transduced with the lentiviral Human GeCKO v2 library, part A and part B, at an MOI of 0.3 in the presence of 4 μg/mL of polybrene for 24 h. The virus-containing medium was then replaced with fresh growth medium, and the cells were cultured for a further 48 h. The cells were selected with 4 μg/mL of puromycin for 3 days. Cells from both half-libraries were then combined, and 2.5 × 10^7 cells were collected for genomic DNA isolation. Next-generation sequencing was performed on the Illumina HiSeq 2500 platform at the Deep Sequencing Facility of TU Dresden. The raw FASTQ files were analyzed with MAGeCK-VISPR.
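The two numbers given above (an MOI of 0.3 and 2.5 × 10^7 collected cells) can be put into context with a short calculation. The sketch below assumes that lentiviral integrations per cell follow a Poisson distribution, a common modeling assumption that is not stated in this section. It also assumes a combined library size of about 123,411 guides for GeCKO v2 parts A and B; that figure is not given in the text and should be checked against the library documentation. The function names are hypothetical.

```python
import math

# Back-of-the-envelope numbers for a pooled CRISPR screen.
# ASSUMPTIONS: Poisson-distributed integrations per cell, and a combined
# guide count of 123,411 for GeCKO v2 parts A + B (verify before use).

MOI = 0.3
CELLS_COLLECTED = 2.5e7
ASSUMED_GUIDES = 123_411  # assumed library size, not stated in the text


def poisson_pmf(k, rate):
    """Probability of exactly k integrations at a given MOI."""
    return rate ** k * math.exp(-rate) / math.factorial(k)


def infection_stats(moi):
    """Fraction of cells infected, and single-copy share among them."""
    uninfected = poisson_pmf(0, moi)
    single = poisson_pmf(1, moi)
    infected = 1.0 - uninfected
    return {
        "infected": infected,
        "single_among_infected": single / infected,
    }


def coverage(cells, guides):
    """Average number of cells representing each guide."""
    return cells / guides


stats = infection_stats(MOI)
cells_per_guide = coverage(CELLS_COLLECTED, ASSUMED_GUIDES)
```

Under these assumptions roughly a quarter of the cells are transduced, most of them with a single guide, and the collected cells correspond to about 200 cells per guide, provided that all collected cells carry a guide after puromycin selection.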

Additional methods are available in the Supplements.