Introduction

Breast cancer (BC) is a heterogeneous malignancy, standing as the most prevalent form of cancer among women1. Around 70-80% of early-diagnosed BC cases can be treated successfully. However, advanced-metastatic BC remains challenging, currently lacking a cure1.

Through analysis of gene expression signatures, BC has undergone systematic classification into well-defined molecular subtypes, namely Luminal A, Luminal B [both distinguished by the presence of the Estrogen receptor (ER)], HER2-enriched, and Basal-like. Each of these subtypes exhibits distinct molecular features, clinical characteristics, and therapeutic implications2.

Furthermore, an alternative classification method is based on immunohistochemistry profiles, wherein BC can be categorised according to the expression of specific biomarkers such as ER, progesterone receptor (PR), and HER2. In cases where ER, PR, and HER2 are not detected, the subtype is referred to as triple-negative BC (TNBC).

Like many other malignancies, BC is characterised by a heterogeneous population of cancer cells. Within the tumour mass, a small fraction (0.1-1%) consists of tumour-initiating cells (TICs), which are also referred to as cancer stem cells (CSCs)3. TICs possess the unique ability to self-renew, to generate non-tumourigenic, proliferating progeny, and to initiate the formation of new tumours3. CSCs are also called TICs because, when inoculated into severe combined immunodeficiency disease (SCID) mice, they represent the minority of breast cancer cells capable of forming new tumours, and they are CD44+/CD24–/low/lineage 4. TICs can be isolated or enriched through various methods, including sorting breast cancer cells based on the CD44+/CD24–/low phenotype or culturing cells under non-adherent, non-differentiating conditions to promote the formation of tumourspheres5. Interestingly, studies have demonstrated that adding B27 to the culture media in non-adherent conditions improves tumoursphere formation efficiency and enhances the enrichment of the CD44+/CD24–/low lineage, which can reach up to approximately 95% in breast cancer cell lines. This indicates that tumoursphere 3D cultures are sufficient to study this tumourigenic cancer cell population.

Given their pivotal role, TICs are considered the driving force behind cancer progression and are implicated in conferring resistance to various therapeutic approaches. Consequently, strategies focused on targeting TICs hold significant promise in combating metastatic cancers and preventing disease recurrence6. TICs can be studied by exploiting models based on three-dimensional (3D) culturing conditions and low plating densities to form clonal cultures of TIC-enriched tumourspheres7,8.

MicroRNAs (miRNAs) are short, non-coding RNA molecules, approximately 22 nucleotides in length, that play a pivotal role in the post-transcriptional control of gene expression9. Their biogenesis initiates with the transcription of a primary miRNA (pri-miRNA). In the nucleus, the Microprocessor complex, composed of DROSHA and DGCR8, processes the pri-miRNA into a precursor miRNA (pre-miRNA). Subsequently, the pre-miRNA is cleaved by DICER1 in the cytoplasm to form the mature miRNA. Some miRNAs, however, can bypass the need for the Microprocessor or DICER1 in their biogenesis mechanism10.

miRNAs have emerged as key contributors to various human diseases, with significant implications for different aspects of cancer progression11. Through base-pair complementary interactions, they suppress the post-transcriptional expression of mRNA targets. However, each miRNA is predicted to target thousands of mRNAs12. Given this extensive capacity for direct gene co-regulation, identifying which of these targets serves as the key effector of the phenotype controlled by a given miRNA is challenging.

The exploitation of miRNAs as therapeutic targets in tumours holds great promise, owing to all miRNAs being inherently targetable. Oncogenic miRNAs can be effectively and selectively suppressed using synthetic, complementary anti-miRNAs (anti-miRs), which interfere with their function, thus attenuating their detrimental effects on tumourigenesis11. Conversely, in the case of tumour suppressor miRNAs, their expression can be enhanced or re-established in tumour cells to restrain tumour growth and potentially reverse cancer-associated phenotypes11.

The role of miRNAs in cancer, including BC, has primarily been elucidated through miRNA expression profiling studies that compare cancerous with normal samples, tumour stages, or cancer cell lines13. Nevertheless, this approach relies on the averaged expression levels of miRNAs and thus may not fully uncover their functional significance. For example, a subset of miRNAs might impact cancer vulnerabilities while their expression remains unchanged between cancer and normal samples. In such cases, approaches that identify miRNAs solely from differences in expression levels, such as RT-qPCR, microarray or RNA-seq, might overlook important miRNAs.

CRISPR pooled screens have allowed for uncovering pivotal cancer gene drivers, revolutionising our comprehension of the intricate mechanisms through which genes and pathways contribute to tumourigenesis14. However, CRISPR screens have predominantly been carried out in 2D cell cultures, which often fail to fully capture the intricate complexities of tumour biology15. To help address these limitations, recent studies have conducted screens in 3D cultures15,16. These investigations have demonstrated that screens exhibiting stronger effects in 3D cultures, as compared to 2D cultures, are enriched for genes that have been found to be mutated in cancer15. This finding emphasises the potential of CRISPR screens in 3D cultures as an ideal model for effectively identifying key players in cancer.

Importantly, despite their transformative impact, to date, CRISPR screens have not been exploited for identifying the miRNAs that exert influence on breast tumourigenesis. miRNA-focused CRISPR screens may unveil novel therapeutic targets and pivotal molecular pathways11.

To discover the miRNA-dependent vulnerabilities in breast tumourigenesis, we conducted genome-wide CRISPR-CAS9 screens targeting genes and miRNAs in both ER+ and TNBC cancer cells, employing both 2D monolayer and 3D tumoursphere models (Fig. 1a). This approach allowed us to reveal the miRNAs and their critical targets controlling TIC viability in BC. By therapeutically addressing these critical miRNAs, we hold the potential to combat metastatic cancers and significantly reduce the risk of disease recurrence6.

Fig. 1: CRISPR screening strategy and performance.

a Schematic of the genome-wide screening strategy in the 2D and 3D tumoursphere culture. b Correlation of miRNA 3D-2D GeCKO screen MAGeCK beta scores between the cell lines MCF7 and HCC1395.

Results

CRISPR-CAS9 screens in 3D tumourspheres versus 2D cultures identify miRNA-dependent vulnerabilities of breast TICs

Given that 3D tumourspheres are enriched with TICs, which are able to self-renew and are known to play a pivotal role in tumour progression3, we conducted simultaneous CRISPR-Cas9 screens in two BC cell lines under both 3D tumoursphere and 2D growth conditions to identify the miRNAs and their targets that affect tumoursphere growth (Fig. 1a). By using both culture types in the screens, we could specifically identify factors that influence 3D breast tumoursphere growth. To ensure a comprehensive exploration across two BC subtypes, we utilised MCF7 cells, representing the luminal A subtype (ER+, PR+), and HCC1395 cells, which belong to the aggressive TNBC subtype (ER−, PR−, HER2−). For the screens, we used the Genome-Scale CRISPR Knockout (GeCKO) v2 library17. This library allows us to systematically target a total of 19,052 genes and 1,864 miRNAs17, enabling us to identify the miRNAs and their crucial target transcripts essential for breast tumourigenesis.

To perform this experiment, we first optimised 3D tumourspheres for the large-scale culture necessary to conduct the screens. Next, we subjected the cells to lentiviral infection with the pooled GeCKO v2 library (MOI 0.3) and applied antibiotic selection, then split the screen into 2D and 3D growth conditions and cultured the cells for 4 weeks. After this step, we conducted next-generation sequencing (NGS) to analyse the incorporation of single guide RNAs (sgRNAs) into the genome. Subsequently, we utilised Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK)18 to analyse the NGS data. The screen performance was excellent, with over 80% of the library mapped at time 0 (T0) and in the 2D and 3D samples grown for 4 weeks, in both cell lines. For ER+ MCF7 cells, MAGeCK analysis identified negatively selected sgRNAs (FDR < 0.05 in the 3D condition versus T0) targeting 235 protein-coding genes (PCGs) with a 3D-2D ß-score < −0.4 and positively selected sgRNAs targeting 17 PCGs with a 3D-2D ß-score > 0.4 (Supplementary Data 1). Furthermore, for TNBC HCC1395 cells, we identified negatively selected sgRNAs targeting 320 PCGs (FDR < 0.05 in the 3D condition versus T0) with a 3D-2D ß-score < −0.4 and positively selected sgRNAs targeting 24 PCGs with a 3D-2D ß-score > 0.4 (Supplementary Data 1). Supporting the validity of our CRISPR screens, the identified PCGs with a 3D-2D ß-score < −0.4 are known to be important for TIC expansion and stemness (Fig. S1). Accordingly that, pathway enrichment analysis using EnrichR26,27. Contrary to DICER1 processing of canonical pre-miRNAs, DICER1 cleavage of TSS-miRNA generates a single 3p-miRNA. However, measuring miR-4787-3p in DROSHA- and DICER1-knockout cells revealed its dependence on both (Fig. S5).
Additionally, although pre-miR-4787-3p maps within the 5’UTR of DOCK3, it is located 166 nucleotides from its TSS, suggesting that this distance includes enough 5’UTR sequence for the transcript to become a Microprocessor complex substrate. We then integrated RNA-seq data upon miR-4787-3p inhibition with the CRISPR screens conducted in both MCF7 and HCC1395, along with a miRNA-target identification algorithm (TargetScan v8.0), to identify its crucial target transcripts. Our reasoning was that, as miR-4787-3p is essential for tumoursphere formation (Figs. 1, 2), leading to a negative ß-score in the 3D tumoursphere screen (Table S1), and as miRNAs act by post-transcriptional repression of direct targets, genes that are upregulated in RNA-seq following its inhibition and that have positive ß-scores in the CRISPR screens would denote crucial target transcripts. Consistent with our hypothesis, the ß-score values of potential miR-4787-3p targets were significantly more positive than those of a random gene list, specifically in the 3D screens for both cell lines (Fig. 4c, g). Among these transcripts, we experimentally validated ARHGAP17, FOXO3A, and PDCD4 as significant direct targets of miR-4787-3p in BC tumourigenesis. ARHGAP17 is thought to have a tumour-suppressive role in colon cancer36 and cervical cancer32. FOXO3A is a known tumour suppressor and has recently been shown to be antagonised by miRNAs in several cancer types, including BC37. PDCD4 inhibits protein translation to suppress tumour progression and is often decreased in BC. Numerous regulators, including non-coding RNAs, control PDCD4 expression in BC. PDCD4 loss is responsible for drug resistance in BC, and modulating the microRNA/PDCD4 axis has been suggested as a strategy for overcoming chemoresistance in BC38. Interestingly, our rescue experiments suggest that, in MCF7, miR-4787-3p mainly acts through the inhibition of PDCD4, whereas in HCC1395 it mainly acts through the regulation of ARHGAP17.
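The target-prioritisation reasoning above (a predicted target that is upregulated when the miRNA is inhibited and that has a positive ß-score in the 3D screen) can be illustrated with a minimal sketch. The function name, the input layout and the gene labels below are hypothetical and chosen only for illustration; they are not the pipeline used in this study, and the adjusted p-value cutoff simply mirrors the one stated in the Methods.

```python
def prioritise_targets(predicted, rnaseq, beta_3d, padj_cutoff=0.05):
    """Return predicted targets that are upregulated upon miRNA inhibition
    and have a positive 3D-screen beta score.

    predicted : set of gene symbols predicted as miRNA targets
    rnaseq    : dict gene -> (log2 fold change, adjusted p-value)
    beta_3d   : dict gene -> beta score in the 3D screen
    (all three inputs are hypothetical illustration formats)
    """
    hits = []
    for gene in sorted(predicted):
        if gene not in rnaseq or gene not in beta_3d:
            continue  # gene missing from one of the datasets
        log2fc, padj = rnaseq[gene]
        upregulated = log2fc > 0 and padj < padj_cutoff
        if upregulated and beta_3d[gene] > 0:
            hits.append(gene)
    return hits


# Toy data with placeholder gene labels (not real results).
predicted = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
rnaseq = {
    "GENE_A": (1.2, 0.001),   # upregulated, significant
    "GENE_B": (0.8, 0.20),    # upregulated, not significant
    "GENE_C": (-0.5, 0.01),   # downregulated
    "GENE_E": (2.0, 0.0001),  # not a predicted target
}
beta_3d = {"GENE_A": 0.6, "GENE_B": 0.5, "GENE_C": 0.7, "GENE_D": 0.9, "GENE_E": 0.4}

candidates = prioritise_targets(predicted, rnaseq, beta_3d)
print(candidates)
```

In this toy example only the gene that satisfies all three criteria is retained; the other entries each fail one criterion.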

In conclusion, our study underscores the critical role of miR-4787-3p in BC tumourigenesis, particularly in the aggressive basal-like subtype enriched in tumour-initiating cells (TICs). Targeting miR-4787-3p significantly inhibited tumoursphere formation, revealing its potential as a therapeutic target in BC. Moreover, our study suggests that elevated miR-4787-3p expression could serve as a prognostic biomarker for poor outcomes in BC patients. The validated targets of miR-4787-3p, such as ARHGAP17, FOXO3A, and PDCD4, shed light on the molecular mechanisms underlying its oncogenic role in BC. Further investigations and clinical studies focusing on miR-4787-3p inhibition and its prognostic utility hold promise for improved therapeutic strategies and prognostic assessments in BC.

Materials and methods

Cell culture

BC cell lines MCF7 and HCC1395 were obtained from the Genome Damage and Stability Centre (Sussex) Cell Bank. Cell bank lines were authenticated by ECACC using short tandem repeat (STR) profiling. MCF7 cells were grown in DMEM (Sigma) and HCC1395 cells in RPMI-1640 Medium (Sigma), supplemented with 10% FCS, 2 mmol/l L-glutamine, 100 U/ml penicillin, and 100 mg/ml streptomycin. Between thawing and use in the described experiments, the cells were passaged no more than 5 times. All cell lines were tested monthly for mycoplasma (MycoAlert, Lonza) and were found negative.

Large-scale cell culture was performed in Corning Cell Culture Multi flasks, 3-layer format, providing 525 cm².

Zen and 2′OMe modified miRNA inhibitors were purchased from IDT technologies and transfected with RNAi Max (Fisher). A negative control inhibitor was used as recommended by IDT: NC1 Negative control (human); sequence: ucguuaaucggcuauaauacgc.

3D cell culture (tumourspheres)

MCF7 and HCC1395 cells were plated in single-cell suspension in ultralow attachment plates (Corning, # CLS3471). Cells were grown in serum-free DMEM/F12 medium (Gibco) supplemented with B27 (1:50, Gibco), 20 ng/mL basic fibroblast growth factor (bFGF, Biolegend), and 20 ng/mL EGF (Sigma).

For the GeCKO screen, cells were plated at a density of 5 × 10⁶ cells per ultra-low attachment flask (T75). After 7 days of sphere formation, spheres were dissociated with Accutase and 5 × 10⁶ cells were re-seeded into the ultra-low attachment flask. Tumoursphere sample pellets were collected after 4 weeks of sphere formation.

For the sphere formation assay, BC cells were plated in ultralow attachment plates (24-well) at a density of 1.5 × 10³ cells/well for HCC1395 and 1 × 10³ cells/well for MCF7, and spheres larger than 50 μm were counted under the microscope. The percentage sphere formation efficiency was calculated as the number of spheres formed divided by the number of cells seeded, multiplied by 100. For each condition, at least 6 wells of a 24-well plate were counted and 3 independent biological replicates were performed.
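As a worked illustration of the efficiency calculation described above, the following minimal sketch computes the percentage per well and averages it across wells. The function name and the sphere counts are hypothetical and are not data from this study.

```python
def sphere_formation_efficiency(spheres_counted: int, cells_seeded: int) -> float:
    """Sphere formation efficiency (%) = spheres formed / cells seeded * 100."""
    if cells_seeded <= 0:
        raise ValueError("cells_seeded must be positive")
    return 100.0 * spheres_counted / cells_seeded


# Hypothetical counts of spheres (> 50 um) from six wells,
# each seeded with 1 x 10^3 cells.
well_counts = [22, 18, 25, 20, 19, 23]
per_well = [sphere_formation_efficiency(n, 1000) for n in well_counts]
mean_sfe = sum(per_well) / len(per_well)
print(f"mean sphere formation efficiency = {mean_sfe:.2f}%")
```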

To measure the corresponding 2D growth, the CellTiter-Glo 2.0 assay (Promega) was used according to the manufacturer’s instructions. For each condition, at least 4 wells of a 24-well plate were assayed and 3 independent biological replicates were performed.

Plasmids and primers

The lentiviral construct used for the GeCKO library was lentiCRISPRv2 (catalogue no. 52961; Addgene). Lentiviruses were produced with the packaging plasmid psPAX2 (catalogue no. 12260; Addgene) and the VSV-G envelope-expressing plasmid pMD2.G (catalogue no. 12259; Addgene).

CRISPR-CAS9 library preparation

We obtained the GeCKO v2 library from Addgene, amplified it by large-scale electroporation into Endura competent cells (Lucigen), then amplified a sample by PCR to produce a library for next-generation sequencing (NGS) on the Illumina MiSeq platform. The library passed the quality control checks for library representation.

We produced the GeCKO library viruses by transfecting the library into HEK293FT cells together with two lentiviral packaging vectors and harvesting the supernatant 2 days later. The lentivirus virions were titrated in MCF7 and HCC1395 cells using the functional readout of puromycin resistance. We generated a mutant cell pool by infecting 180 × 10⁶ cells at a low MOI of 0.3 so that most infected cells received a single virus. Cells were selected under puromycin for 10 days to produce a stable mutant pool, then a T0 sample was harvested from at least 60 × 10⁶ cells. The screen target population was then split into 2D and 3D culture conditions and cultured for 4 weeks.
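The rationale for a low MOI can be illustrated with a short calculation. The sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with mean equal to the MOI, and that the pooled library contains 123,411 sgRNAs; both are assumptions made for illustration and are not stated in this section.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MOI = 0.3                 # from the text
CELLS_INFECTED = 180e6    # from the text
CELLS_AT_T0 = 60e6        # from the text (lower bound)
LIBRARY_SIZE = 123_411    # assumption: sgRNA count of the pooled library

p_uninfected = poisson_pmf(0, MOI)
p_single = poisson_pmf(1, MOI)
p_infected = 1.0 - p_uninfected

# Fraction of infected cells carrying exactly one integration.
single_among_infected = p_single / p_infected

# Approximate representation (cells per sgRNA) after infection and at T0.
coverage_at_infection = CELLS_INFECTED * p_infected / LIBRARY_SIZE
coverage_at_t0 = CELLS_AT_T0 / LIBRARY_SIZE

print(f"infected cells: {p_infected:.1%}")
print(f"single integration among infected: {single_among_infected:.1%}")
print(f"coverage after infection: ~{coverage_at_infection:.0f} cells per sgRNA")
print(f"coverage at T0: ~{coverage_at_t0:.0f} cells per sgRNA")
```

Under these assumptions roughly a quarter of the cells are infected, and most of those carry a single sgRNA, which is why low-MOI infection is commonly used in pooled screens.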

RNA isolation and RT-qPCR assays/Taqman

Total RNA from cultured cells was extracted with TRIzol Reagent (Sigma) using the Direct-zol RNA MiniPrep kit (Zymo) following the manufacturer’s instructions, including DNase I treatment. For gene expression, cDNA was synthesised from 1 μg of purified DNase-treated RNA using RevertAid M-MuLV reverse transcriptase and random hexamer primers (Thermo Scientific), according to the manufacturer’s protocols. RT-qPCR assays were performed on a StepOne Real-Time PCR System using Fast SYBR Green Master Mix (both from Applied Biosystems).

RNAseq—quantseq and analysis

Total RNA from cultured cells was extracted with TRIzol Reagent (Sigma) using the Direct-zol RNA MiniPrep kit (Zymo) following the manufacturer’s instructions, including DNase I treatment. Quality and quantity of the extracted RNA samples were assessed with a 2100 Bioanalyzer using the RNA 6000 Pico Kit (Agilent).

Single-indexed mRNA libraries were prepared from 100 ng of RNA with the QuantSeq 3′ mRNA-Seq Library Prep Kit FWD (Lexogen), according to the manufacturer’s instructions. Quality of libraries was measured using the 2100 Bioanalyzer DNA High Sensitivity Kit (Agilent). Sequencing was performed on the BGI DNBseq System (BGI) with PE100 reads. The QuantSeq 3′ mRNA-Seq Integrated Data Analysis Pipeline on Bluebee® (Lexogen) was used for preliminary quality evaluation of the RNA sequencing data, and differential expression analysis was performed on the platform using standard settings for the QuantSeq 3′ kit (www.lexogen.bluebee.com). Bluebee uses DESeq2 to test for differential gene expression, and we selected differentially expressed genes at adjusted p < 0.05.

3’ UTR reporter assay

3’UTR reporter vectors were purchased from Active Motif/Switchgear Genomics, along with the miR-4787 mimic, control mimic and negative control reporters. Reporters and mimics/inhibitors were reverse transfected with Lipofectamine 3000 (Fisher), and luminescence was read after 48 h with the LightSwitch™ Luciferase Assay Kit (Active Motif).

3’UTR sequences are available here: https://switchdb.switchgeargenomics.com.

Product codes for the reporters used were as follows: S810530 PDCD4; S810360 PNKD; S880942 FOXO3A; S801327 BVES; S806225 ARHGAP17.

Bioinformatics

To identify putative targets of miR-4787-3p, we used TargetScan v8.0 by selecting all the identified potential target transcripts without applying any ranking score.

For RNA-seq analysis, the QuantSeq 3′ mRNA-Seq Integrated Data Analysis Pipeline on Bluebee® (Lexogen) was used for preliminary quality evaluation of the RNA sequencing data, and differential expression analysis was performed on the platform using standard settings for the QuantSeq 3′ kit (www.lexogen.bluebee.com). Bluebee uses DESeq2 to test for differential gene expression, and we selected differentially expressed genes at adjusted p < 0.05.
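To make the adjusted p-value cutoff concrete, the sketch below implements the Benjamini-Hochberg adjustment, which is the default multiple-testing correction in DESeq2, and applies the 0.05 threshold. It is a simplified illustration only: it omits DESeq2's independent filtering step, and the gene labels and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five placeholder genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]

padj = benjamini_hochberg(raw_p)
significant = [g for g, p in zip(genes, padj) if p < 0.05]
print(significant)
```

Note that two genes with raw p < 0.05 fall just above the cutoff after adjustment, which is the behaviour the adjusted threshold is meant to provide.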

For miRNA analysis of patients’ samples miRNA-seq expression normalised data and clinical data were downloaded from the Xena browser (https://xenabrowser.net). Plots of miRNA expression values were generated in R version 4.1.0 using the ggplot2 package version 3.3.6. The heatmap was generated using the pheatmap package version 1.0.12. Kaplan-Meier curves to show patient overall survival based on miRNA expression data from the TCGA or the METABRIC were produced using the KMplot (https://kmplot.com/analysis/). Pathway enrichment analysis of our RNA-seq data was performed using Enrichr (https://maayanlab.cloud/Enrichr/), whereas the Gene Set Enrichment Analysis (GSEA) was run through the fgsea package version 1.18.0 in R.

Statistics and reproducibility

Unless otherwise stated, at least 3 separate biological replicates were performed and statistical comparisons between groups were assessed with the Mann-Whitney rank sum test.
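For readers unfamiliar with the rank sum test, the following sketch computes the Mann-Whitney U statistic and an exact two-sided p-value by enumerating all possible group assignments of the pooled ranks. This exact enumeration is an illustrative choice that is only practical for small samples; statistical packages may instead use tabulated exact values or a normal approximation, and the example values are hypothetical.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Return (U for x, exact two-sided p-value) using midranks for ties."""
    pooled = list(x) + list(y)
    ordered = sorted(pooled)
    # Assign midranks: tied values share the mean of their rank positions.
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    ranks = [rank_of[v] for v in pooled]
    n1, n2 = len(x), len(y)

    def u_stat(rank_sum):
        return rank_sum - n1 * (n1 + 1) / 2

    u_obs = u_stat(sum(ranks[:n1]))
    mean_u = n1 * n2 / 2
    extreme = total = 0
    # Enumerate every way of choosing which observations belong to group x.
    for idx in combinations(range(n1 + n2), n1):
        u = u_stat(sum(ranks[k] for k in idx))
        total += 1
        if abs(u - mean_u) >= abs(u_obs - mean_u) - 1e-12:
            extreme += 1
    return u_obs, extreme / total


# Hypothetical sphere counts from six wells per condition.
control = [22, 18, 25, 20, 19, 23]
treated = [9, 12, 7, 11, 10, 8]
u, p = mann_whitney_exact(control, treated)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```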

Unless otherwise stated, box plots are described as follows: centre lines show the medians; box limits indicate the 25th and 75th percentiles as determined by R software; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles; outliers are represented by dots.
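The box plot convention described above can be sketched as follows. The sketch assumes quartiles computed by linear interpolation (the "inclusive" method in Python's statistics module, which corresponds to R's default quantile type) and whiskers drawn to the most extreme data points lying within 1.5 times the interquartile range of the box; the function name and data are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Median, quartiles, whisker ends and outliers for a Tukey-style box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme point within the fence
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical measurements with one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```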

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.