Introduction

Breast cancer is one of the most common malignancies in women worldwide [1]. It displays complex diversity in molecular alterations, clinical manifestations, and pathological characteristics [2,3,4,5]. Estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor-2 (ERBB2/HER2) are the molecular markers used for diagnostic classification of breast cancer subtypes [6, 7]. Breast cancer with the signature of ER-negative, PR-negative, and ERBB2/HER2-negative is classified as triple-negative breast cancer (TNBC) [8], which represents the most aggressive clinical subtype and carries a poor prognosis. Breast cancer, and in particular TNBC, presents significant genomic defects [9,10,11,12] involving protein degradation [13,14,15], mutations or deregulation of the p53 family of tumor suppressors [16,17,18,19,20] as well as of other transcription factors [21,22,23]. Defects in metabolism or in the hypoxia response have also been reported [24,25,52,53,54,55].

As shown in Fig. 1G, the patient showed an absence of ER1, ER2, PR and HER2 mRNAs compared with the clinical cohort (580 breast cancer patients). Further characterization of the tumor revealed a basal subtype of TNBC characterized by low to absent expression of luminal differentiation markers and high expression of epithelial-to-mesenchymal transition (EMT) and cancer stem cell-like markers (e.g., low claudin). The patient underwent comprehensive genomic profiling, which indicated the presence of four concurrent heterozygous somatic mutations in the ephrin type-A receptor 3 (EphA3), TP53, BRCA1-associated protein (BAP1) and MYB genes (Table 2).

Fig. 1: Histopathological analysis and molecular characterization of the tumor.

A Hematoxylin and eosin staining shows a G3 infiltrating ductal carcinoma with inflammatory infiltrates. Scale bar represents 50 µm. B High magnification of panel A highlights the presence of peritumoral inflammatory cells (asterisk). Scale bar represents 50 µm. C Ki67 expression in more than 70% of breast cancer cells. Scale bar represents 50 µm. D c-Erb-B2 staining revealed a score of 0. Scale bar represents 20 µm. E PD-L1 (SP142) immunostaining shows positivity in more than 1% of tumor-associated lymphocytes (asterisk). Scale bar represents 100 µm. F High magnification of panel E. Scale bar represents 20 µm. G mRNA expression levels (TPM) of estrogen receptor 1 (ESR1), progesterone receptor (PR1), ERBB2 receptor tyrosine kinase 2 (HER2) and proliferation marker KI-67 (MKI67) for the patient (red triangle) and the clinical cohort (blue boxplot).

Table 1 Clinical data of the breast cancer patient enrolled in this study.
Table 2 Genetic alterations detected in the tumor.

The His214 frameshift mutation of TP53 (Table 2) lies within the DNA-binding domain (DBD) of p53, suggesting that it may impair the ability of the protein to contact DNA (Fig. 2A). This mutation has not been described previously in cancer patients. Interestingly, Haupt and co-authors reported a truncated protein (p53dl214), composed of the first 214 amino-terminal residues of p53, that lacks the transactivation function yet retains the ability to induce apoptosis [41, 56]. The relevance of this mutation in vivo and its biological significance warrant further study.

Fig. 2: Molecular and chromosomal alterations in the breast cancer patient.

A Schematic structure of the p53 protein and lollipop plot showing the incidence of mutations in the TP53 gene in the METABRIC cohort. Patient’s mutation is indicated. B Schematic structural features of the BAP1 protein and lollipop plot showing the incidence of BAP1 gene mutations in the METABRIC cohort. Patient’s mutation is indicated by an arrow. C Schematic structural features of the BRCA2 protein and lollipop plot showing the incidence of mutations in the BRCA2 gene in the METABRIC cohort. Patient’s mutation is indicated by an arrow. A–C Data were obtained from cBioPortal. D Mutational contribution of HR-related signatures. E The patient has a higher TMB than the cohort median (~80th percentile). F MSI score (MSI High: score > 0.901). The patient has MSI Low status (score = 0.09). G Chromosomal instability: CNH is higher in the patient than the disease cohort median, whereas numerical and structural CIN values are not much higher than the median. The patient (red triangle) is compared to the clinical cohort (blue boxplot).

We also identified a somatic mutation in the BAP1 gene, which encodes a ubiquitin carboxy-terminal hydrolase that regulates several important cellular responses, including HR DNA repair and cell growth [57, 58]. Germline inactivation of BAP1 confers an increased risk of developing cutaneous and uveal melanoma, mesothelioma, renal cell carcinoma, and, to a lesser extent, breast cancer [59]. Mutations of BAP1 have been included in the HR-deficiency-associated pathways in breast cancer, particularly in the TNBC subtype, characterized by a relatively high mutation frequency [85], implying that the patient presented in this study would probably be sensitive to this treatment.

Methods

Collection of samples

Tumor tissues were globally collected using a standardized protocol that minimized the ischemia time until freezing in liquid nitrogen. To ensure the quality of the samples, all tissues were hematoxylin and eosin stained and subjected to a pathological QC. To be included, samples had to be invasive, with a tumor content of ≥30% and necrosis of ≤30%. Normal tissues were processed in parallel and had to be free of tumor and representative of the corresponding tumor tissue.

Approximately 10 mg of tissue each was taken for nucleic acid extraction and for protein lysate preparation. To account for tumor heterogeneity, pathological QCs were performed on two sections, one before and one after taking the analysis material. The tissues remained frozen during the entire process.
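The acceptance criteria above can be written as a small check. This is a minimal sketch, not the authors' pipeline: the record layout and function names are hypothetical, and requiring both flanking sections to pass is an assumption, since the text only states that QC was performed on two sections.

```python
from dataclasses import dataclass


@dataclass
class TissueSection:
    """Pathology review of one H&E section (hypothetical record layout)."""
    invasive: bool
    tumor_content_pct: float
    necrosis_pct: float


def section_passes_qc(section: TissueSection) -> bool:
    # Criteria stated in the text: invasive, tumor content >= 30%, necrosis <= 30%.
    return (
        section.invasive
        and section.tumor_content_pct >= 30.0
        and section.necrosis_pct <= 30.0
    )


def sample_passes_qc(before: TissueSection, after: TissueSection) -> bool:
    # Assumption: the sections cut before and after sampling must both pass.
    return section_passes_qc(before) and section_passes_qc(after)
```

Checking both sections reflects the stated aim of accounting for heterogeneity: material taken between two acceptable sections is more likely to be acceptable itself.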

Immunohistochemical analysis

Approximately 1 × 1 × 0.5 cm of tissue was formalin-fixed and paraffin-embedded (FFPE). Serial sections were used to evaluate prognostic and predictive biomarkers including ER, PR, Ki67, and HER2 by immunohistochemistry. Briefly, sections were stained using the automated Leica Bond IHC platform (Leica Biosystems, Deer Park, IL). After antigen retrieval, 4-μm-thick sections were incubated with the following primary monoclonal antibodies: mouse monoclonal anti-Ki67 (clone MM1; Leica Biosystems), mouse monoclonal anti-HER2 (clone CB11; Leica Biosystems) and rabbit monoclonal anti-PD-L1 (clone SP142; Ventana Roche, USA). Reactions were revealed using the BOND-PRIME Polymer DAB Detection System (Leica Biosystems, Deer Park, IL). Immunohistochemistry was evaluated by two blinded pathologists.

Nucleic acid extraction and quality assessment

Frozen tissue slices were mixed with sample buffer containing beta-mercaptoethanol and homogenized using the BeadBug system. DNA and RNA were extracted in parallel from the same sample using the Qiagen AllPrep Universal Kit according to the manufacturer’s instructions.

DNA and RNA concentrations were quantified using a Qubit fluorometer with the Qubit dsDNA BR assay or the Qubit RNA BR assay, respectively.

DNA and RNA quality was assessed using the Agilent TapeStation with the Agilent Genomic DNA kit or the Agilent High-Sensitivity RNA ScreenTape kit, respectively. RNA samples required a RIN ≥ 4 or a DV200 ≥ 60 to be selected for library preparation.
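The RNA selection rule is a logical OR of two thresholds, which a short function makes explicit. This is an illustrative sketch only: the function name is hypothetical, and treating a missing measurement as failing that criterion is an assumption not stated in the text.

```python
from typing import Optional


def rna_passes_qc(rin: Optional[float], dv200: Optional[float]) -> bool:
    """Return True if an RNA sample meets RIN >= 4 or DV200 >= 60.

    A value of None means the metric was not measured; it is treated as
    not meeting that criterion (assumption).
    """
    rin_ok = rin is not None and rin >= 4.0
    dv200_ok = dv200 is not None and dv200 >= 60.0
    return rin_ok or dv200_ok
```

Under this rule, a partially degraded sample with a low RIN can still qualify if enough of its fragments exceed 200 nucleotides.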

Library preparation and NGS sequencing

Libraries for whole genome sequencing (WGS) were prepared using the PCR-free KAPA Hyper Prep Kit (Roche). For whole transcriptome sequencing, RNA samples were depleted of ribosomal RNA using the Ribo-Zero Kit (Illumina), and library preparation was performed using the TruSeq Stranded Total RNA Kit (Illumina). For small RNA sequencing, the QIAseq miRNA Kit (Qiagen) was used. All library preparation kits were used according to the manufacturers’ instructions. Sequencing was performed on a NovaSeq 6000 system (Illumina).

For WGS, average coverage was ≥60X for tumor samples and ≥30X for normal samples, with a total genomic coverage of ≥95%.

Whole transcriptome sequencing datasets have ≥100 million total reads with <20% of ribosomal origin and ≥20 million reads mapping to mRNAs according to the Ensembl reference. Ribosomal depletion was performed to remove nuclear rRNA and mt-rRNA.

NGS data processing

NGS data were aligned against the GRCh38 genome assembly. Identification and annotation of short genomic variations in the normal sample were done using HaplotypeCaller (Genome Analysis Toolkit; GATK) [86]. WGS somatic variants were called using a consensus of Mutect2 [87], Strelka [88], VarScan [89] and SomaticSniper [90]. Structural variations were called using R packages TitanCNA [91] and DellyCNV [92].
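One common way to build a consensus from several somatic callers is to keep variants reported by a minimum number of them. The sketch below illustrates that idea only. The text does not state the consensus rule that was used, so the default threshold of two callers, the function name, and the variant key (chromosome, position, reference allele, alternate allele) are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_variants(
    calls: Dict[str, Iterable[Variant]], min_callers: int = 2
) -> Set[Variant]:
    """Keep variants reported by at least `min_callers` callers.

    `min_callers=2` is a hypothetical default, not the published setting.
    """
    support: Counter = Counter()
    for variants in calls.values():
        # set() ensures a caller supports a given variant at most once.
        support.update(set(variants))
    return {variant for variant, n in support.items() if n >= min_callers}


# Toy input with made-up coordinates, purely for illustration.
toy_calls = {
    "Mutect2": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
    "Strelka": [("chr1", 100, "A", "G")],
    "VarScan": [("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")],
    "SomaticSniper": [("chr1", 100, "A", "G")],
}
shared = consensus_variants(toy_calls)
```

In practice, variants would need to be normalized (left-aligned, multi-allelic sites split) before keys from different callers can be compared reliably.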

RNA-Seq differential expression was based on normalized read count data (TPM: transcripts per million).
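For reference, TPM divides each gene's read count by its length in kilobases and then rescales these rates so that they sum to one million within a sample. The following is a minimal sketch of that standard definition. The function name and inputs are hypothetical, and it uses plain transcript length rather than an effective length, which may differ from the tool used in the study.

```python
from typing import Dict


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Convert per-gene read counts to transcripts per million (TPM)."""
    # Reads per kilobase: corrects for longer genes collecting more reads.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Rescale so that the values of one sample sum to 1e6.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}
```

Because every sample sums to the same total, TPM values can be compared between a single patient and a cohort, as in the patient-versus-cohort comparisons reported here.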

Mass spectrometry phospho-proteome profiling

For phospho-proteome profiling, 5–10 mg of fresh-frozen tissue was lysed in 2 mL Precellys® CK14 tubes containing 1.4 mm ceramic beads, using a lysis buffer containing PhosSTOP™ and bead shaking on a Precellys® Evolution Homogenizer equipped with a Cryolys® cooling module. After overnight digestion, samples were acidified and subjected to peptide desalting using Waters HLB Oasis 30 mg 96-well plates. 500 μg of peptide preparation was subjected to phospho-peptide enrichment using MagReSyn® Ti-IMAC magnetic beads (ReSyn Biosciences) as described previously [93], with modifications to enable processing on a KingFisher™ Flex robot equipped with a 96-magnetic pin head. Peptides were desalted using Waters μElution plates, dried down and resolubilized.

For DIA LC-MS/MS measurements, 5 μg of peptides per sample were injected onto a reversed-phase column (nanoEase M/Z Peptide CSH C18 Column, 1.7 μm, 300 μm × 150 mm) on a Waters ACQUITY UPLC M-Class LC connected to a Thermo Scientific™ Orbitrap Q Exactive™ HF-X mass spectrometer equipped with an EASYspray source. The nonlinear LC gradient was 1–60% solvent B in 60 min at 50 °C and a flow rate of 5 μL/min. The DIA method, consisting of one full-range MS1 scan and 50 DIA segments, was adapted from Bruderer et al. [94].

Tissue-specific spectral libraries were generated by combining high-fractionated DDA and DIA measurements on a pool of tissue material, and raw data were processed using Biognosys’ software Spectronaut 13.

Bioinformatical analyses

Mutational signatures were calculated using the R package MutationalPatterns [95]. MSI classification was done using the R package MSIseq [96]. PAM50 subtyping as well as risk scores were investigated using the R package genefu [97].

TMB was calculated as the number of non-synonymous mutations in protein-coding genes divided by the exome size in megabases.
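The TMB definition above reduces to a single division. The sketch below shows it with a hypothetical function name; the exome size is left as a parameter because the value used in the study is not given, and the numbers in the example are purely illustrative.

```python
def tumor_mutational_burden(n_nonsynonymous: int, exome_size_bp: int) -> float:
    """Return TMB in mutations per megabase.

    n_nonsynonymous: count of non-synonymous mutations in protein-coding genes.
    exome_size_bp: size of the exome in base pairs (value not given in the text).
    """
    if exome_size_bp <= 0:
        raise ValueError("exome_size_bp must be positive")
    return n_nonsynonymous / (exome_size_bp / 1e6)


# Illustrative numbers only: 90 mutations over a 30 Mb exome.
example_tmb = tumor_mutational_burden(90, 30_000_000)
```

Since the result scales inversely with the denominator, TMB values are only comparable between samples when the same exome definition is used for all of them.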