Introduction

Pediatric neoplasms in the central nervous system (CNS) are extremely heterogeneous and diagnostically challenging tumors. According to data from the Central Brain Tumor Registry of the United States (CBTRUS), CNS tumors have become the leading cause of cancer-related death in childhood1. Accurate diagnosis is crucial for the optimal management of children with these diseases. During the last several years, remarkable advances in our understanding of the molecular underpinnings of these tumors have occurred as a result of comprehensive (epi-)genetic profiling and have led to substantial progress in the classification and therapy of pediatric CNS tumors2. In addition, several novel and extremely rare tumor types have been identified using state-of-the-art molecular methods such as DNA methylation arrays and genetic profiling3. More recently, a wide range of different oncogenic gene fusions outside the mitogen-activated protein kinase (MAPK) pathway have been shown to play an important role in driving tumorigenesis of pediatric CNS tumors4,5,6, some of them with the potential to provide novel therapeutic options.

DNA methylation profiling of CNS tumors has been demonstrated to be a powerful tool for molecular tumor classification with the additional evaluation of copy number profiles being extremely useful for the identification of oncogenic gene fusions4,7,8,9. Such an approach is particularly valuable for the discovery and characterization of rare and novel tumor types that show a wide variety of clinicopathological appearances5.

Here, we describe a novel molecular CNS tumor type, primarily occurring in children, identified through unsupervised visualization of a large cohort of genome-wide DNA methylation profiling data, together with targeted next-generation DNA sequencing, and RNA transcriptome sequencing.

Results

DNA methylation profiling reveals an epigenetically distinct group of pediatric-type neuroepithelial tumors

Through unsupervised visualization of genome-wide DNA methylation data from a large cohort of approximately 90,000 pediatric and adult CNS tumor samples, we identified an epigenetically distinct group of tumors (n = 16) that did not match any known DNA methylation class. This group comprised tumors with a wide spectrum of original histological diagnoses, including predominantly high-grade glioma (such as glioblastoma, anaplastic astrocytoma, or ganglioglioma), with many tumors considered as not classifiable or with a descriptive diagnosis (Supplementary Table 1). In addition, two further CNS tumor samples harboring a CIC::LEUTX fusion (see below) that have already been published were included in subsequent molecular profiling10. A more selected visualization (t-SNE) of DNA methylation patterns of tumors in this novel cluster compared with well-characterized reference samples (tumor samples included in the current version of the Heidelberg DNA methylation brain tumor classifier with a calibrated score >0.9; Supplementary Table 2) confirmed a clearly distinct grouping (Fig. 1a). Importantly, no similarity was seen with the recently described tumor type “CIC-rearranged sarcoma” (previously CNS Ewing sarcoma family tumor with CIC alteration; Fig. 1b)11. Further analysis of differentially methylated regions between tumors within the novel group and CIC-rearranged sarcoma showed aberrant methylation patterns, including promoter region hypomethylation of, amongst others, CD44, EMP3, and VIM in these tumors (Fig. 1c). This was supported by an inverse expression profile of the respective markers by immunohistochemistry (n = 4; Fig. 1d). Analysis of copy number profiles derived from the raw intensities of the DNA methylation array probes revealed recurrent structural aberrations on chromosome 19q around the genetic loci of the capicua transcriptional repressor (CIC) and leucine twenty homeobox (LEUTX) in all samples (Fig. 2a and Supplementary Fig. 1).
Further recurrent copy number alterations included: loss of chromosome 1p, 13q, 14q, and 22q (Fig. 2b). Gain of chromosome 8, typically present in CIC-rearranged sarcoma (Fig. 2c), was seen in a high proportion of cases as well (Fig. 2b). A summary of detected structural aberrations is given in Supplementary Table 3.

Fig. 1: Molecular classification of high-grade neuroepithelial tumors CIC fusion-positive by DNA methylation profiling.

a Unsupervised, nonlinear t-distributed stochastic neighbor embedding (t-SNE) projection of DNA methylation array profiles from 1059 tumors. DNA methylation profiling reveals a molecularly distinct group of high-grade neuroepithelial tumors (HGNET, CIC fusion-positive; n = 18). b t-SNE analysis of DNA methylation array profiles of HGNET, CIC fusion-positive, and CIC-rearranged sarcoma (CNS SARC, CIC). For DNA methylation class abbreviations, see Supplementary Table 2. c Volcano plot comparing differentially methylated probes between HGNET, CIC fusion-positive and CNS SARC, CIC. d Immunohistochemical expression of vimentin (VIM), CD44, and EMP3 in HGNET, CIC fusion-positive and CNS SARC, CIC. Scale bars 300 μm.

Fig. 2: Molecular characteristics of high-grade neuroepithelial tumors CIC fusion-positive (HGNET, CIC fusion-positive).

a Copy number profile derived from DNA methylation array data of a HGNET, CIC fusion-positive showing structural alterations affecting chromosome 19q around the CIC and LEUTX loci. b, c Summary plot of copy number alterations in HGNET, CIC fusion-positive and CIC-rearranged sarcoma (CNS SARC, CIC). d Visualization of the CIC::LEUTX gene fusion detected by RNA sequencing, in which exons 1–20 of CIC, as the 5′ partner, are fused to exon 3 of LEUTX.

CIC gene rearrangements are a characteristic feature of tumors within the novel group

By targeted next-generation DNA sequencing and/or RNA sequencing, 9 out of 10 tumors analyzed (including the two previously published samples) demonstrated gene fusions between CIC and leucine twenty homeobox (LEUTX) as the 3′ partner, both located on chromosome 19q13.2 (Fig. 2d). In all of the tumors, exons 1–20 of CIC (NM_015125.5) were fused in frame to exon 3 of LEUTX (NM_001143832.2), retaining the DNA-binding high-mobility group (HMG) box of CIC and the suggested 9aaTAD domain of LEUTX12. These findings are in line with the breakpoints detected in a previously reported pediatric embryonal tumor of the CNS13 (Supplementary Table 4). In addition, a fusion between exons 1–20 of CIC and NUT midline carcinoma family member 1 (NUTM1, located on chromosome 15q14) exons 2–7 (NM_001284293.1) was observed, very similar to the CIC::LEUTX rearrangement (Supplementary Fig. 2 and Supplementary Table 4). Apart from the detected fusion events, no additional oncogenic alteration was identified in these tumors based on sequencing.

Clinical characteristics and morphological features indicate pediatric‐type high-grade neuroepithelial tumors

Analysis of available clinical data demonstrated that all tumors were located in the supratentorial compartment, mainly in the parietal and occipital lobes. The median age at presentation was 8.5 years (range 1–19) and the sex distribution was not significantly biased when considering the small number of patients. Clinical outcome data were available for only six patients. Median progression-free survival (PFS) was 13.5 months (range 6–16 months) with all of the patients experiencing a relapse during the follow-up period. Only one of the patients died of the disease during the follow-up period, at 15 months after diagnosis. Together, these initial data suggest an intermediate malignancy grade (Supplementary Fig. 3). Initial histopathologic diagnoses comprised various tumor types, mainly high-grade glioma. More detailed descriptions of the cases are given in Supplementary Table 1. A histopathological review was performed on a subset of the tumors with available material (n = 9) and revealed a morphologically heterogeneous group. Histologically, all reviewed tumors shared a high cellular density with most neoplasms showing slightly pleomorphic neoplastic cells often with remarkably condensed chromatin (Fig. 3a). A more pronounced cellular pleomorphism with multinucleated cells was seen in single cases (Fig. 3b–d). An oligodendrocyte-like phenotype with perinuclear clearing was focally found in four of the cases (Fig. 3e, f). Microcystic changes were present in half of the tumors. Tumors were highly vascularized with hypertrophic or proliferated vessels in most of the cases (Fig. 3g). In three of the tumors, perivascular anucleate zones (pseudorosettes) were observed (Fig. 3h). Necrosis was present in five tumor samples (Fig. 3d). Mitotic activity was generally high, with the exception of one case. Immunostaining for markers of glial differentiation (GFAP and OLIG2) was positive in all tumors (Fig. 3i, j).
However, GFAP expression was only weakly positive or restricted to a minor proportion of neoplastic cells in some of the cases. In 4/4 tumors, a focal immunoreactivity for MAP2 was detected (Fig. 3k). All tumors showed a weak and focal positivity for synaptophysin (n = 9; Fig. 3l). CD56 was expressed in all samples analyzed (n = 4). All evaluated tumors had absent immunostaining for NeuN (n = 5). CD34 (n = 6) expression was restricted to the vessels (Fig. 3m). A focal positivity for CD99 was observed in all evaluated samples (n = 4; Fig. 3n). Ki-67 labeling indices ranged from 10 to 70% (Fig. 3o, p).

Fig. 3: Morphological and immunohistochemical features of high-grade neuroepithelial tumors CIC fusion-positive.

a Histologically, tumors show a high cellular density of slightly pleomorphic neoplastic cells. b, c A more pronounced cellular pleomorphism with multinucleated cells is present in a subset of cases. d Tumor necrosis. e, f An oligodendroglial morphology with perinuclear halos is focally present in a minor proportion of tumors. g, h Tumors are highly vascularized with a subset of cases demonstrating perivascular anucleate zones (pseudorosettes). i, j Positive immunostaining for markers of glial differentiation (GFAP and OLIG2). k, l Tumor cells show focal immunoreactivity for MAP2 and synaptophysin. m CD34 expression is restricted to the vessels. n CD99 expression is focally present in all evaluated samples. o, p Ki-67 labeling indices range from about 10 to 70% of the neoplastic cells. Scale bars 200 μm.

Discussion

Here, we describe a previously uncharacterized group of rare, pediatric CNS tumors that was discovered through unsupervised visualization of genome-wide DNA methylation profiles. This novel group of tumors, epigenetically distinct from all known CNS neoplasms, shows recurrent gene fusions involving the transcriptional repressor CIC (most commonly with LEUTX) as an additional unifying feature.

CIC, a human homolog of capicua in Drosophila, acts as a transcriptional repressor with a DNA-binding high-mobility group (HMG) box domain that normally inhibits ETV1/4/5 expression and counteracts activation of genes downstream of receptor tyrosine kinase (RTK) signaling14. Aberrations in CIC have been identified in various types of cancer, with loss-of-function mutations frequently observed in oligodendroglioma15,16, leading to activation of downstream RTK signaling. Intriguingly, rearrangements involving CIC and the double homeobox 4 (DUX4), commonly found in high-grade round cell undifferentiated sarcoma17,18, have been shown to enhance the transcriptional activity of CIC downstream targets, including members of the ETS family of transcription factors, such as ETV1/4/519,20. Consistent with that, an upregulation of members of the ETS transcription factor family has been reported in CIC-rearranged sarcoma (“Ewing sarcoma family of tumors with CIC alterations”) harboring oncogenic fusions between CIC and NUTM111.

In contrast, LEUTX is a member of the paired (PRD)-like homeobox gene family of transcription factors and is expressed almost exclusively in early embryos where it is thought to play a role during preimplantation development21,22. Although rearrangements between CIC and LEUTX have been reported recently10, […]

[…] 9 were used to calculate the 1-variance weighted Pearson correlation between samples. The resulting distance matrix was used as input for t-SNE analysis (Rtsne package version 0.13). The following non-default parameters were applied: is_distance = T, theta = 0, pca = F, max_iter = 10,000, perplexity = 30. Estimation of differentially methylated positions (DMPs) was done in R by using the function “dmpFinder” from the minfi package (v1.43). The Illumina EPIC platform was used to annotate CpGs by their position in the genome and associated genes. The tests were carried out on the M-values of all promoter-associated genes as well as the top 100k CpGs according to mean average deviance. CpGs with an FDR q-value smaller than 0.05 were considered significantly differentially methylated. Volcano plots of the DMPs were generated with the R package ggplot2 (v3.3.6). CpGs are distributed according to their −log10 Q values and fold change (intersect). Further, significantly differentially methylated CpGs are depicted in red, while the associated genes of the top 100 DMPs are shown.
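The distance used as t-SNE input above can be illustrated with a minimal sketch. It assumes that "1-variance weighted Pearson correlation" means one minus a Pearson correlation in which each probe is weighted by its variance across samples; this reading, the function names, and the toy beta values are assumptions made for illustration and are not taken from the analysis code of the study.

```python
import math
from statistics import pvariance


def probe_variances(betas):
    """Variance of each probe across samples (assumed here to be the weight)."""
    n_probes = len(betas[0])
    return [pvariance([sample[i] for sample in betas]) for i in range(n_probes)]


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative per-probe weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(betas):
    """Pairwise distance 1 - r_w between samples (rows of `betas`)."""
    w = probe_variances(betas)
    n = len(betas)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = 1.0 - weighted_pearson(betas[i], betas[j], w)
    return dist


# Hypothetical beta values: 3 samples x 4 probes (not data from the study).
toy_betas = [
    [0.1, 0.9, 0.5, 0.2],
    [0.1, 0.9, 0.5, 0.2],
    [0.9, 0.1, 0.5, 0.8],
]
dist = distance_matrix(toy_betas)
```

In this toy input the first two samples are identical and the third is their mirror image, so the distances fall at the two extremes of the possible range (0 and 2). A matrix of this form is what a t-SNE implementation would receive when it is told that its input is already a distance matrix.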

Targeted next-generation DNA sequencing

For a subset of samples with DNA available (n = 9), DNA sequencing using a customized enrichment/hybrid-capture-based next-generation sequencing (NGS) gene panel was performed on a NextSeq 500 or NovaSeq 6000 instrument (Illumina) at the Department of Neuropathology of the University Hospital Heidelberg (Heidelberg, Germany)27. The NGS panel comprised the entire coding (all exons ±25 bp) and selected intronic and promoter regions of 170 genes of particular relevance in CNS tumors, and was designed to detect single nucleotide variants (SNV), small insertions/deletions (InDel), exonic rearrangements, and recurrent fusion events. Paired-end sequencing was applied to increase the detection sensitivity of duplicates and possible gene fusions. Sequence reads were mapped to the reference human genome build GRCh37 (hg19) using the Burrows–Wheeler aligner (BWA).
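As an illustration of what "all exons ±25 bp" implies for a capture design, the following sketch pads exon intervals by a fixed flank and merges any that then overlap. The function name and the coordinates are hypothetical; how the panel targets were actually constructed is not described in the text.

```python
def pad_and_merge(exons, pad=25):
    """Extend each (start, end) exon by `pad` bp on both sides, then merge overlaps."""
    padded = sorted((max(0, start - pad), end + pad) for start, end in exons)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous target: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exon coordinates on a single chromosome (not real gene positions).
toy_exons = [(100, 200), (230, 300), (1000, 1100)]
targets = pad_and_merge(toy_exons)
```

The first two toy exons lie 30 bp apart, so their 25 bp flanks overlap and they collapse into a single target region, while the distant third exon stays separate.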

RNA sequencing and analysis

RNA sequencing for the purpose of gene fusion detection of samples for which RNA of sufficient quality and quantity was available (n = 8) was performed as previously described28. In brief, RNA sequencing libraries were prepared using the TruSeq RNA Library Prep for Enrichment kit (Illumina) and paired-end reads were sequenced on a NextSeq 500 or NovaSeq 6000 instrument (Illumina). After adapter trimming, reads were aligned to the human genome (GRCh37) with the STAR aligner29 and counted using RSEM30. Fastq files from transcriptome sequencing were used for de novo annotation of fusion transcripts using the Arriba (v1.2.0) algorithm31 with standard parameters, which removes recurrent alignment artifacts, transcript variants also observed in normal tissue, reads with low sequence complexity, and events with short anchors or breakpoints in close proximity or a low number of supporting reads relative to the overall number of predicted events in a gene.
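A downstream screening step for fusion calls of this kind might look like the sketch below, which selects in-frame, higher-confidence events involving a gene of interest from a tab-separated fusion table. The column names (`#gene1`, `gene2`, `reading_frame`, `confidence`) are assumed to follow the Arriba output format and should be checked against the version in use; the table rows are invented toy values, and this filter is not described in the text.

```python
import csv
import io

# Toy table with an assumed subset of columns; rows are invented for illustration.
TOY_FUSIONS_TSV = (
    "#gene1\tgene2\treading_frame\tconfidence\n"
    "CIC\tLEUTX\tin-frame\thigh\n"
    "GENE_A\tGENE_B\tout-of-frame\tlow\n"
)


def candidate_fusions(tsv_text, gene="CIC", accepted=("high", "medium")):
    """Return (gene1, gene2) pairs that involve `gene`, are in-frame, and pass confidence."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        partners = (row["#gene1"], row["gene2"])
        if (
            gene in partners
            and row["reading_frame"] == "in-frame"
            and row["confidence"] in accepted
        ):
            hits.append(partners)
    return hits


hits = candidate_fusions(TOY_FUSIONS_TSV)
```

With the toy table, only the CIC and LEUTX row passes all three conditions. In practice, candidate events would still need manual review of the supporting reads.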

Survival analysis

Survival analysis was performed using GraphPad Prism 9 (GraphPad Software, La Jolla, CA, USA). Data on survival could be retrospectively retrieved for six patients. Overall survival (OS) and progression-free survival (PFS) probabilities were displayed using the Kaplan–Meier method.
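For readers unfamiliar with the Kaplan–Meier method, a minimal sketch of the product-limit estimator follows. The analysis in the study was run in GraphPad Prism; the function below and its follow-up times are hypothetical and serve only to show how censored observations enter the estimate.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] for each time with at least one event.

    durations: follow-up times; events: 1 if the event was observed, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        removed = 0   # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up in months (not patient data from the study).
toy_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Censored patients lower the number at risk for later time points without producing a step in the curve, which is why the toy estimate only changes at months 5, 8, and 12.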

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.