Background

Chromosomal dosage can be altered through loss or gain of chromosomes, which, for autosomes, is generally associated with pathologies. Differential dosage of the mammalian sex chromosomes, by contrast, drives sex determination, in the case of the Y chromosome, and contributes to sex differences, in the case of the X chromosome [1, 2]. The dosage imbalance for X-chromosomal genes between XX females and XY males is largely neutralized in somatic cells through X-chromosome inactivation (XCI), where one X chromosome is nearly completely silenced in each female cell [3]. A subset of genes escape XCI and likely contribute to sex differences, for example in the context of immunity and autoimmune diseases [4,5,6]. During early embryonic development, however, prior to the onset of XCI, the majority of X-linked genes are expressed at double the levels in female compared to male cells, resulting in substantial sex differences in cell state and developmental progression [7].

In many mammalian species, including mice, cows, and humans, female embryos develop more slowly than their male counterparts during early development [8]. Since no fetal hormones are produced at this stage, these observations have been attributed to variations in sex-chromosomal dosage, which in mice has been confirmed by the analysis of X-monosomic XO embryos [9, 10]. These sex differences have been investigated at the molecular level in female mouse embryonic stem cells (mESC), which are derived from early blastocyst embryos and thus carry two active X chromosomes. Female mESCs appear to be shifted towards a more naive ground state of pluripotency, which is associated with reduced activity of the differentiation-promoting MAP kinase (MAPK) signaling pathway, increased levels of (naive) pluripotency factors, and lower levels of global DNA methylation [11,12,13]. As a consequence, exit from the pluripotent state during differentiation is delayed in female compared to male mESCs [11]. Similar patterns have been observed in induced pluripotent stem cells (iPSCs) [12]. These X-dosage effects are likely mediated by X-encoded genes that modulate the stem cell state, the identity of which however remains mostly unknown. They might pose a biological checkpoint to ensure that only cells that have successfully inactivated one of their X chromosomes contribute to the differentiated adult organism.

In somatic cell types, MAPK signaling plays a key role in the regulation of cellular programs such as proliferation, but in mESCs, it drives the exit from the pluripotent state, while its inhibition stabilizes the self-renewing naive ground state of pluripotency [14, 15]. The main growth factors that stimulate MAPK signaling at these early developmental stages belong to the fibroblast growth factor (Fgf) family [16, 17]. Upon activation of the FGF receptor (FgfR), and the subsequent membrane recruitment of the growth factor receptor-bound protein 2 (Grb2), the small GTPase Ras is activated [18, 19]. Ras in turn triggers the kinase cascade of Raf, Mek, and Erk. Erk then translocates to the nucleus and activates MAPK target genes, including Egr1 and Spry4 (Additional file 1: Figure S1A) [20, 21]. Female mESCs express MAPK target genes at reduced levels compared to their male counterparts, suggesting an inhibition of the pathway [11]. To maintain homeostasis, the MAPK pathway is controlled by strong negative feedback loops on multiple levels [22]. MAPK inhibition therefore often leads to a counter-intuitive rise in phosphorylation levels of pathway intermediates due to reduced negative feedback activity [23, 24]. Female mESCs, where the MAPK pathway is inhibited, thus exhibit increased Mek phosphorylation compared to male cells, suggesting inhibition of the pathway downstream of Mek [11, 25].
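The counter-intuitive rise in Mek phosphorylation upon pathway inhibition can be illustrated with a toy steady-state calculation. The sketch below is not taken from the study: the model equations, the function name `steady_state`, and all parameter values are hypothetical and serve only to show that weakening signaling downstream of Mek, in the presence of negative feedback, can raise pMek while lowering pathway output.

```python
import math


def steady_state(stimulus, feedback_strength, erk_activity):
    """Toy steady state of a cascade with Erk-dependent negative feedback.

    Assumed (hypothetical) model, not from the paper:
        pMek   = stimulus / (1 + feedback_strength * output)
        output = erk_activity * pMek
    erk_activity in (0, 1] is the fraction of Erk signaling that remains
    when the pathway is inhibited downstream of Mek.
    """
    a = feedback_strength * erk_activity
    # Positive root of a * pMek**2 + pMek - stimulus = 0
    p_mek = (-1.0 + math.sqrt(1.0 + 4.0 * a * stimulus)) / (2.0 * a)
    return p_mek, erk_activity * p_mek


# Arbitrary illustrative parameters
p_mek_uninhibited, output_uninhibited = steady_state(10.0, 2.0, 1.0)
p_mek_inhibited, output_inhibited = steady_state(10.0, 2.0, 0.25)

print(f"pMek:   {p_mek_uninhibited:.2f} -> {p_mek_inhibited:.2f}")
print(f"output: {output_uninhibited:.2f} -> {output_inhibited:.2f}")
```

In this toy setting, reducing Erk activity to a quarter increases pMek and decreases target-gene output, which mirrors the qualitative pattern described for female mESCs.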

MAPK signaling and pluripotency are tightly coupled, as the inhibition of this pathway blocks differentiation and leads to an increased expression of naive pluripotency markers and DNA hypomethylation, a hallmark of the naive pluripotent state [16, 26,27,28]. Reduced MAPK signaling in female mESCs thus results in increased levels of naive pluripotency factors such as Nanog and Prdm14, and global DNA hypomethylation [11, 13, 26, 29,30,31,32,33,34].

Although X-chromosomal dosage exhibits global effects on signaling and gene expression, central X-encoded genes that mediate these phenotypes remain to be uncovered [7]. The X-linked Erk phosphatase Dusp9 has been shown to underlie sex differences in DNA methylation, since a heterozygous mutation resulted in DNA hypermethylation as observed in male cells [25, 35]. However, pluripotency factor expression and differentiation have been reported to be unaffected in such mutant cells [12]. Moreover, a series of other X-linked genes, including the transcription factors Zic3 and Tfe3, have been investigated, but their heterozygous deletion in female cells had no detectable effect [12]. Taken together, the genetic determinants that drive sex differences in mESCs remain incompletely understood.

We have performed a series of complementary CRISPR screens to identify X-linked genes that modulate MAPK signaling, pluripotency, and differentiation and found several genes that contribute to these phenotypes. We show that the E3 ubiquitin ligase adaptor protein Klhl13 promotes pluripotency factor expression, while inhibiting MAPK target gene expression and differentiation. Female mESCs carrying heterozygous mutations of Klhl13 and the known X-linked MAPK inhibitor Dusp9 qualitatively recapitulate all aspects of the male pluripotency phenotype. We have thus identified the main drivers of X-dosage-dependent sex differences in mESCs and disentangled their relative contributions. Our approach can serve as a blueprint to investigate dosage effects of other chromosomes, such as those underlying trisomy 21, and our results will be important for development of gender-sensitive iPSC-based therapies.

Results

Pooled CRISPR knockout screen identifies X-chromosomal MAPK regulators

The X chromosome encodes ~1000 genes, any of which could potentially mediate the sex differences observed in murine pluripotent stem cells with respect to pluripotency factor expression, MAPK pathway activity, and differentiation efficiency [11, 12]. Since MAPK signaling represses pluripotency factors and promotes differentiation [16, 27, 28], we hypothesized that an X-linked MAPK inhibitor might underlie the female pluripotency phenotype [7, 11]. To comprehensively identify X-encoded MAPK inhibitors, we performed a chromosome-wide pooled CRISPR knockout screen (Fig. 1a). Through transduction of Cas9-expressing mESCs with an X-chromosomal sgRNA expression library, a pool of cells was generated with at most one gene mutated per cell. Subsequent enrichment of cells with increased MAPK pathway activity and sequencing of their associated sgRNAs allowed identification of genes acting as MAPK inhibitors that, when deleted, increased MAPK signaling.

Fig. 1

Identification of X-chromosomal MAPK regulators through a pooled CRISPR knockout screen. a Schematic depiction of the screen workflow: A female mESC line carrying a stably integrated fluorescent MAPK reporter, where expression of GFP is controlled by an SRE-Elk responsive promoter, was transduced with a construct expressing the Cas9 endonuclease. Cells were further transduced with a custom sgRNA library targeting the majority of X-chromosomal genes. GFP-high cells were sorted by flow cytometry, cultured for an additional 2 days and sorted again (double-sorted). The sgRNA cassette was amplified from genomic DNA and sgRNA abundance in the unsorted and double-sorted populations was determined by deep sequencing. The screen was performed in three independent replicates. b Composition of the GeCKOx sgRNA library, targeting X-linked genes and positive control genes known to regulate the MAPK pathway, with 6 sgRNAs per gene. As negative controls, non-targeting sgRNAs were included in the library. c Volcano plot of the screen results, where screen hits (FDR < 0.05, MAGeCK) are labeled in red (positive controls) or blue (X-linked genes)

To be able to enrich cells with high MAPK activity through fluorescence-activated cell sorting (FACS), we generated a female mESC line (1.8-SRE-Elk), where expression of GFP was driven by a synthetic MAPK-sensitive SRE-Elk promoter (containing binding sites for the transcription factors Elk1 and Srf, which are activated downstream of the MAPK pathway) (Fig. 1a). Reporter functionality was confirmed by treatment with an inhibitor of Mek, which resulted in the expected decrease in GFP fluorescence (Additional file 1: Figure S1B). To focus the screen on X-linked genes, we generated a custom sgRNA library (GeCKOx) containing a subset of sequences of the genome-wide GeCKO library [36], targeting 961 X-chromosomal genes with 6 sgRNAs per gene, where possible (Additional file 2: Table S1). As controls, 237 genes implicated in MAPK pathway regulation according to gene ontology (GO) annotation and 100 non-targeting controls (NTC) were included in the library (Fig. 1b). Sequencing of the sgRNA library confirmed an even representation (Additional file 1: Figure S1C).
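As a quick plausibility check on library complexity, the upper bound on the number of sgRNAs follows directly from the numbers above. The sketch below assumes that every targeted gene received the full 6 sgRNAs, which the text states was only the case "where possible", so the actual library (Additional file 2: Table S1) may be smaller. The function name `max_library_size` is hypothetical.

```python
def max_library_size(genes_by_group, guides_per_gene, non_targeting):
    """Upper bound on library size if every gene gets the full set of sgRNAs."""
    return sum(genes_by_group.values()) * guides_per_gene + non_targeting


# Gene numbers as stated in the text; 6 sgRNAs per gene is the stated maximum
upper_bound = max_library_size(
    {"x_linked": 961, "mapk_controls": 237},
    guides_per_gene=6,
    non_targeting=100,
)
print(upper_bound)
```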

To investigate the female pluripotency phenotype, cells were generally grown in classical ESC culture conditions, containing Serum and LIF, if not stated otherwise. For the screen, 1.8-SRE-Elk mESCs were first transduced with a lentiviral vector expressing the Cas9 endonuclease, followed by blasticidin selection, sgRNA library transduction, and puromycin selection. After expansion for 7 days (5 days under selection), cells with high reporter activity were FACS-sorted, replated, and cultured for two additional days, before being sorted once again (Fig. 1a). We reasoned that such a double-sorting strategy would increase sensitivity of the screen. The sgRNA cassette was amplified from genomic DNA of all double-sorted (day 9) and unsorted (day 7) cell populations and sgRNA abundance in each sample was quantified by Illumina sequencing. SgRNA counts in all libraries were highly correlated and NTCs were neither enriched nor depleted in the sorted fractions, suggesting that sufficient coverage was maintained at all steps of the screen (Additional file 1: Figure S1D-F).
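The comparison of sgRNA abundance between sorted and unsorted populations can be sketched as follows. This is a deliberately simplified illustration and not the MAGeCK procedure used for hit calling in this study: counts are scaled to reads per million, a per-sgRNA log2 fold change is computed, and sgRNAs are summarized per gene by the median. All sgRNA names, gene names, and counts are hypothetical.

```python
import math
from statistics import median


def log2_fold_changes(sorted_counts, unsorted_counts, pseudocount=1.0):
    """Per-sgRNA log2 fold change (sorted vs unsorted) on reads-per-million scale."""
    sorted_total = sum(sorted_counts.values())
    unsorted_total = sum(unsorted_counts.values())
    lfc = {}
    for guide, unsorted in unsorted_counts.items():
        sorted_rpm = sorted_counts.get(guide, 0) / sorted_total * 1e6
        unsorted_rpm = unsorted / unsorted_total * 1e6
        lfc[guide] = math.log2(
            (sorted_rpm + pseudocount) / (unsorted_rpm + pseudocount)
        )
    return lfc


def gene_scores(lfc, guide_to_gene):
    """Summarize sgRNA-level log2 fold changes per gene by their median."""
    per_gene = {}
    for guide, value in lfc.items():
        per_gene.setdefault(guide_to_gene[guide], []).append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical toy counts: two sgRNAs against "GeneA", two non-targeting controls
unsorted = {"gA_1": 100, "gA_2": 100, "ntc_1": 100, "ntc_2": 100}
double_sorted = {"gA_1": 300, "gA_2": 300, "ntc_1": 100, "ntc_2": 100}
guide_to_gene = {"gA_1": "GeneA", "gA_2": "GeneA", "ntc_1": "NTC", "ntc_2": "NTC"}

scores = gene_scores(log2_fold_changes(double_sorted, unsorted), guide_to_gene)
print(scores)
```

In a library of realistic size, non-targeting controls would be expected to centre near zero. In this four-sgRNA toy example they appear depleted only because the enriched sgRNAs take up a larger share of the reads.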

Several core MAPK pathway components were significantly depleted in the GFP-high population (Erk2, Grb2, Frs2, Mek1, and Ptpn11), while Csk, a MAPK inhibitor [37], was enriched, showing that our screening setup could recover positive controls (Fig. 1c, red, Additional file 3: Table S2). Among the X-linked genes, 9 were significantly enriched and 18 were depleted in the sorted population (FDR < 0.05, MAGeCK, Fig. 1c, blue). Dusp9, Klhl13, and Zic3 were the top-scoring MAPK inhibitors, and Klf8, Nr0b1, and Eras were the strongest activators (Fig. 1c).

In principle, enrichment in the double-sorted fraction at day 9 compared to the unsorted cells at day 7 could also be due to faster proliferation between the two sampling points. To identify genes that affect proliferation or viability, we compared sgRNA frequency in the cloned library and the unsorted cells at day 7 (Additional file 1: Figure S1G; Additional file 3: Table S2). Among the identified X-linked MAPK inhibitors, only H2al1m seemed to affect mESC proliferation positively, which would, however, lead to a decrease, not an enrichment, in sgRNA abundance between day 7 and day 9. In summary, we found a series of X-encoded inhibitors of the MAPK pathway, which might drive the X-dosage-dependent pluripotency phenotype.

Secondary screens identify X-linked regulators of pluripotency factors, differentiation kinetics, and Mek phosphorylation

Having identified a set of putative X-linked MAPK pathway regulators, we further investigated their function in a series of complementary CRISPR screens. Specifically, we tested whether the identified candidate genes affected pluripotency factor expression, differentiation dynamics, and phosphorylation of Mek in a manner that would phenocopy the male pluripotency phenotype. For this purpose, a sub-library of the GeCKOx sgRNA library (GeCKOxs) was generated, targeting the 50 most enriched and depleted X-linked genes, together with the 10 most enriched and depleted MAPK controls from the primary screen (Fig. 2a; Additional file 1: Figure S2A; Additional file 4: Table S3). For each gene, the three most effective sgRNAs were selected. In addition, sgRNAs targeting 10 pluripotency regulators were included as further controls (Sox2, Tbx3, Tcf3, Fgf2, Stat3, Esrrb, Tfcp2l1, Klf2, Nanog, and Oct4).

Fig. 2

Secondary CRISPR screens profiling pluripotency factor expression, differentiation kinetics and Mek phosphorylation. a Composition of the GeCKOxs sgRNA library, targeting hits from the primary MAPK screen and positive control genes with 3 sgRNAs per gene. b Schematic representation of the C-terminal tagging of the Nanog and Esrrb genes with the mCherry fluorescent protein through Cas9-mediated homologous recombination and subsequent Cre-mediated excision of the puromycin resistance cassette. Nanog/Esrrb and mCherry are linked through a P2A self-cleaving peptide. c–e Schematic depiction of the three secondary screens to profile effects on pluripotency factor expression (c), differentiation (d), and Mek phosphorylation (e). Female mESCs, carrying mCherry-tagged Esrrb/Nanog loci, as indicated, expressing the Cas9 endonuclease, were transduced with the sgRNA library in a. c In the Nanog screen, the 25% of cells with the weakest mCherry fluorescence were enriched in two consecutive sorts (day 7 and day 9 after transduction). d For the Esrrb screen, cells were differentiated via LIF withdrawal for 3 days and the 10% of cells with the lowest mCherry fluorescence were FACS sorted. e In the pMek screen, cells were stained intracellularly with a pMek-specific antibody and the 25% of cells with the lowest signal were sorted. Three replicates were generated for the Esrrb and pMek reporter screens and two for the Nanog screen. f Volcano plots of the most enriched and depleted genes in the Nanog, Esrrb, and pMek screens. Genes with an FDR < 0.05 are highlighted as indicated. g Heatmap summarizing the results from all 4 screens. Enrichment of all X-linked (left) and control genes (right) that were significantly enriched or depleted in at least 2 screens is shown. *FDR < 0.05 (MAGeCK), n.d., not determined. h Expression levels for a subset of X-linked genes shown in g in 1.8XX and 1.8XO mESCs assessed by RNA sequencing

To assess effects on pluripotency factor expression, we decided to assay for Nanog levels, which are consistently higher in female compared to male mESCs [11, 12, 25]. As a readout for differentiation efficiency, we monitored Esrrb, a naive pluripotency marker, which is downregulated with faster dynamics in cells with only one X chromosome [11]. We generated two transgenic mESC lines, where the endogenous Nanog and Esrrb genes, respectively, were tagged C-terminally with the fluorescent protein mCherry (Fig. 2b; Additional file 1: Figure S2B-C). Both reporters were downregulated upon differentiation, suggesting that they indeed mirrored expression of Nanog and Esrrb, respectively (Additional file 1: Figure S2D). In the pluripotency screen, we aimed at identifying Nanog activators, which when knocked out would reduce Nanog levels, and therefore sorted cells with low Nanog expression (Fig. 2c). Similarly, the differentiation screen aimed at identifying genes that would, when deleted, induce a more rapid downregulation of Esrrb. We therefore sorted Esrrb-low cells after 3 days of differentiation (Fig. 2d). For the Nanog screen, a double-sorting strategy similar to the primary MAPK screen was used, while only a single sorting step was performed for the differentiation screen, where a transient phenotype was analyzed (Fig. 2c, d).

In a third secondary screen, we aimed to test whether deletion of the candidate genes would result in decreased phosphorylation of Mek as observed in cells with one X chromosome [11, 25]. To this end, we performed an intracellular staining with a pMek-specific antibody and sorted cells with a low pMek signal (Fig. 2e). Staining specificity was confirmed by a higher pMek signal in XX compared to XO cells, together with an increase in pMek levels upon Meki treatment in the latter (Additional file 1: Figure S2E). Since the staining required cell fixation, only a single sorting step was possible. Sufficient sgRNA library representation was maintained throughout all steps of the screens (Additional file 1: Figure S2F). NTCs were neither enriched nor depleted in the pluripotency and differentiation screens, but seemed slightly but significantly depleted in the pMek screen (Additional file 1: Figure S2G). SgRNAs targeting the screen hits however exhibited a much stronger effect (Additional file 1: Figure S2G). Among the known MAPK regulators, the pathway components Erk2, Grb2, and Frs2 were identified as anti-pluripotency and pro-differentiation factors and the negative MAPK regulator Csk showed the opposite behavior (Fig. 2f, red, Additional file 3: Table S2). Erk2 also scored as the strongest negative regulator of Mek phosphorylation due to strong Erk-mediated negative feedback regulation [23, 24]. Ptpn2, a known negative regulator of MAPK signaling [38, 39], was surprisingly identified as an anti-pluripotency and pro-differentiation factor, potentially due to its previously reported inhibitory effect on Jak/STAT signaling, a pro-pluripotency pathway [40]. Moreover, Folliculin (Flcn) was identified as a strong pro-differentiation factor in agreement with its previously reported central role in early differentiation [41]. Finally, also the pluripotency factors Tfcp2l1 and Klf4 were identified as Nanog activators as expected [28, 42, 43]. 
Nanog and Esrrb themselves were enriched 3.1-fold (FDR = 0.67) and 1.6-fold (FDR = 0.2), respectively. The low statistical power to detect Nanog enrichment can be attributed to the fact that sgRNAs targeting Nanog become depleted over time, because they affect proliferation (Additional file 1: Figure S2H; Additional file 3: Table S2). Interestingly, the pluripotency regulators Stat3, Esrrb, and Tfcp2l1 scored as positive regulators of Mek phosphorylation, potentially in part due to crosstalk from the Jak/Stat to the MAPK signaling pathway [44]. In summary, all three secondary screens recovered known regulators, supporting the validity of the approaches.

In all three secondary screens, 5–6 X-linked genes were enriched in the (double-)sorted populations, while at most 2 were depleted (Fig. 2f, blue, Fig. 2g, Additional file 3: Table S2). The only gene that significantly affected all 4 phenotypes (including the SRE-Elk screen, Fig. 1) was Dusp9, a known MAPK inhibitor that dephosphorylates Erk and has previously been implicated in sex differences in ES cells [12, 25, 35]. In addition, Klhl13, two members of the Fthl17 cluster (Fthl17e and Fthl17f), Zic3, and Stag2 significantly affected 2–3 phenotypes and generally showed the expected trend in all screens (Fig. 2g; Additional file 1: Figure S2I). Taken together, we have identified 6 genes that might contribute to the sex differences observed in mESCs, none of which, apart from Dusp9, has previously been implicated in mediating sex differences. Klhl13 encodes a substrate adaptor protein for the Cullin3 E3 ubiquitin-protein ligase complex with no known role in pluripotency or MAPK signaling regulation [45]. The Fthl17 gene cluster encodes ferritin-like proteins with unknown functions that are partially nuclear and lack ferroxidase activity [46]. Zic3 is a transcription factor implicated in pluripotency and early differentiation, whereas Stag2 regulates chromatin conformation and has also been shown to be involved in the maintenance of the pluripotent state in mESCs [47,48,49,50]. Among these candidates, the strongest effects were observed for Dusp9 and Klhl13.

Klhl13 and Dusp9 exhibit higher levels in females in vitro and in vivo

To further characterize the six identified putative mediators of the female pluripotency phenotype, we compared their expression pattern between cells with one and two X chromosomes, both in vitro and in vivo. We generated RNA sequencing data of the female mESC line used in all screens (1.8XX) and a subclone of that line with only one X chromosome (1.8XO, Additional file 5: Table S4). Although X-linked genes showed in general the expected 2-fold higher expression in XX compared to XO cells (Additional file 1: Figure S3A), two genes, Zic3 and Stag2, were expressed at similar levels in the two cell lines (mean fold-change 0.8 and 1.1 respectively), potentially due to gene-specific dosage-compensation mechanisms (Fig. 2h). Dusp9 and Klhl13 were expressed at 4.2- and 3.2-fold higher levels in XX compared to XO cells, respectively, and the two members of the Fthl17 cluster were essentially not expressed in the XO line (Fig. 2h). The strong expression difference for Fthl17e and Fthl17f can be explained by the fact that the cluster is maternally imprinted, such that it is only expressed from the paternal X chromosome, which is present only in female embryos and was probably also lost in the XO clone [51].

To assess expression patterns in mouse embryos in vivo, we analyzed epiblast cells in published single-cell RNA sequencing data collected between embryonic days E4.5 and E6.5 (Additional file 1: Figure S3B-G) [52]. Reactivation of the paternal X chromosome, which is silenced early in development in an imprinted form of XCI, is initiated around E4.5, completed at E5.5 and followed by random XCI around E6.5 [53, 54]. X-chromosomal expression was thus 1.6- and 1.4-fold higher in female compared to male cells at E4.5 and E5.5, respectively, with the difference being largely neutralized by E6.5 (Additional file 1: Figure S3B). In contrast to mESCs, where both X chromosomes are active in the naive pluripotent state, in vivo naive pluripotency factors are primarily expressed prior to X reactivation around E3.5 and are mostly downregulated at E4.5 [53]. As a consequence, most naive markers were not well detected in the data set we analyzed and a combined analysis of 9 naive factors revealed only a slight trend towards higher expression in female cells at E4.5 (Additional file 1: Figure S3C-D). Analysis of a group of 9 markers of the primed pluripotent state, by contrast, showed a clear trend towards higher expression in male cells at all three time points (Additional file 1: Figure S3E-F), suggesting that differentiation of female cells with a double X-dosage is also delayed in embryos in vivo. Analysis of the six identified putative candidate genes revealed a trend towards higher expression in female cells at E5.5 for all factors, which was statistically significant (p < 0.05, Wilcoxon rank sum test) for Dusp9, Fthl17e, Fthl17f, and Stag2 (Additional file 1: Figure S3G). In summary, all six factors were expressed at higher levels in female compared to male cells in vivo, but only four of them (Dusp9, Klhl13, Fthl17e/f) showed the same trend in the 1.8XX/XO cell lines in vitro. 
Since the 1.8XX/XO lines show a strong X-dosage-dependent phenotype [11], we concluded that the four differentially expressed factors would be the best candidates for mediating X-dosage effects on pluripotency and differentiation and decided to further validate Dusp9 and Klhl13, which appeared to induce the strongest phenotypes.

Over-expression of Klhl13 and Dusp9 leads to an enhanced pluripotency state and slower differentiation kinetics in male mESCs

If Dusp9 and Klhl13 indeed mediate the sex differences observed in mESCs, their over-expression in male cells should lead to a female-like pluripotency phenotype, while their heterozygous deletion should shift female cells towards a male-like phenotype. In order to over-express Klhl13 and Dusp9 from their endogenous loci in male mESCs, we implemented the CRISPR activation (CRISPRa) system. We made use of an E14 mESC line carrying the components of the CRISPRa SunTag system under control of a doxycycline-inducible promoter, which allows recruitment of multiple VP64 activation domains through a single sgRNA (Fig. 3a) [55, 56].

Fig. 3

Over-expression of Klhl13 and Dusp9 in male mESCs leads to an enhanced pluripotency state and slower differentiation kinetics. a Schematic representation of the dCas9-SunTag system used for gene activation. b–e To over-express Dusp9 (yellow) and Klhl13 (blue), male E14 mESCs, stably expressing the doxycycline-inducible SunTag system, were either transduced with one of two different sgRNAs targeting the respective promoter regions or with non-targeting control (NT) sgRNAs and were treated for 3 days with 1 μg/ml doxycycline as indicated. Protein levels of Dusp9 (left) and Klhl13 (right) were quantified via immunoblotting (b), expression levels of MAPK target genes Spry4 and Egr1 (c) and of naive pluripotency factors Nanog and Prdm14 (e) were assessed by qPCR and phosphorylation of Mek and Erk was quantified by immunoblotting (d). The immunoblot signals were normalized to Tubulin (b) or to total Mek/Erk (d) and to the mean of two doxycycline-treated non-targeting control sgRNAs. qPCR measurements were normalized to two housekeeping genes and to the respective untreated control (−Dox). Dots and triangles depict individual measurements of the two different sgRNAs, and thick bars show the mean of three biological replicates. f Dusp9- and Klhl13-over-expressing mESCs were treated with 1 μg/ml doxycycline 24 h before differentiation via LIF withdrawal for 4 days, and expression levels of pluripotency factors were measured by qPCR at different time points as indicated. Mean and standard deviation across 3 biological replicates are shown. g Global CpG methylation levels in cell lines over-expressing Dusp9 and Klhl13 via doxycycline treatment for 3 passages were assessed via pyrosequencing-based luminometric DNA methylation assay (LUMA). *p < 0.05 in a two-tailed paired Student’s t test comparing the Dusp9/Klhl13 over-expressing samples and the non-targeting controls (mean of sgRNA1 and sgRNA2)

The SunTag system was recruited to either the Dusp9 or Klhl13 promoters using two different sgRNAs per gene and one sgRNA per cell line, leading to a 4- and 3.3-fold over-expression of Dusp9 protein and to a 2.9- and 2.1-fold induction of Klhl13 protein, respectively (Fig. 3b; Additional file 1: Figure S4A). We then characterized these cell lines with respect to pluripotency factor expression, differentiation dynamics, MAPK pathway activity, and global DNA methylation levels, all of which are affected by X-chromosomal dosage in mESCs. To assess MAPK pathway activity, we measured expression levels of Spry4 and Egr1 [20, 21], two well-known Erk target genes, by qPCR (Fig. 3c). Both MAPK target genes were strongly downregulated upon Dusp9 over-expression (2.7/5.5-fold), as expected for an Erk phosphatase, while their expression was only slightly, and mostly not significantly, reduced upon Klhl13 over-expression. When assessing phosphorylation levels of MAPK pathway intermediates, we found that Dusp9 over-expression reduced pErk levels 12-fold, again as expected for an Erk phosphatase, but increased pMek 22-fold, most likely due to reduced negative feedback inhibition (Fig. 3d). Over-expression of Klhl13 by contrast had no significant effect on either Erk or Mek phosphorylation. A previous study had reported the opposite effect of Dusp9 over-expression on Erk phosphorylation, potentially due to the requirement for trypsinization when analyzing feeder-dependent mESC lines, which were used in that study [25] (Additional file 1: Figure S4B). Taken together, these results confirm that Dusp9 is a strong inhibitor of MAPK pathway activity, while Klhl13 might slightly inhibit MAPK target gene expression, but does not affect pathway intermediates, which is in accordance with our screening results (Fig. 2g).
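The qPCR normalization described in the legend of Fig. 3 (two housekeeping genes and the untreated −Dox control) can be sketched with the widely used 2^-ΔΔCt calculation. This is an assumption for illustration: the text does not specify the exact formula, and the sketch presumes perfect amplification efficiency and averages the Ct values of the two housekeeping genes. The function name and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_housekeeping,
                        ct_target_ref, ct_housekeeping_ref):
    """Fold change of a target gene relative to a reference sample (2^-ddCt).

    Assumes 100% amplification efficiency; housekeeping Ct values are
    averaged, which corresponds to a geometric mean of their quantities.
    """
    delta_ct = ct_target - mean(ct_housekeeping)
    delta_ct_ref = ct_target_ref - mean(ct_housekeeping_ref)
    return 2.0 ** -(delta_ct - delta_ct_ref)


# Hypothetical Ct values: +Dox sample versus its -Dox control
fold = relative_expression(
    ct_target=26.0, ct_housekeeping=[18.0, 20.0],
    ct_target_ref=24.0, ct_housekeeping_ref=[18.0, 20.0],
)
print(fold)  # 0.25, i.e. a 4-fold reduction in this made-up example
```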

We next assessed how over-expression of Klhl13 and Dusp9 would affect pluripotency factor expression and differentiation dynamics. To this end, we quantified the pluripotency factors Nanog and Prdm14, which have been reported to be expressed at 2–4-fold higher levels in PSCs with two X chromosomes compared to those with one [11, 12, 25]. Over-expression of Dusp9 in male mESCs resulted in a nearly comparable increase of Nanog (1.5–1.7-fold) and Prdm14 (2.3–3-fold) levels. Upon Klhl13 over-expression by contrast, only Prdm14 was increased (1.6-fold) and only by the stronger sgRNA (Fig. 3e). A very similar trend was observed with regard to differentiation dynamics, where Dusp9 over-expression essentially blocked downregulation of naive pluripotency markers, Nanog, Prdm14, and Esrrb, while for Klhl13 only the stronger sgRNA had a mild effect on differentiation dynamics (Fig. 3f). Over-expression of Dusp9 in male cells thus seemed to induce a strong shift towards the naive pluripotent state similar to female cells, while Klhl13 over-expression resulted in only a minor shift.

Since Dusp9 has been suggested to be responsible for the reduction of global CpG methylation levels typically observed in female mESCs (20–30% compared to 60–80% in male mESCs) [13, 25, 31], we analyzed how over-expression of Dusp9 and Klhl13 affected global DNA methylation through the pyrosequencing-based luminometric DNA methylation assay (LUMA; Fig. 3g). Upon Dusp9 over-expression, global DNA methylation levels were reduced from ~ 60% in NTC-transduced control cells to 53% and 42%, but were unaffected by Klhl13 over-expression. Our results confirm a previously described effect of Dusp9 on global DNA methylation [25].

Overall, we observe a stronger induction of a naive-like state in Dusp9- compared to Klhl13-over-expressing cells. It is important to note, however, that over-expression is less efficient for Klhl13 than for Dusp9 and that the observed effects seem to be strongly dose dependent. The fact that small, but significant effects are observed also for Klhl13 with the stronger sgRNA (which increases Klhl13 expression to levels similar albeit slightly lower compared to those in females) suggests that also Klhl13 might contribute to sex differences with respect to pluripotency factor expression, differentiation, and MAPK target gene expression. To test this, we further investigated the role of both genes in female mESCs.

Mutation of one copy of Klhl13 and Dusp9 in female mESCs induces the male pluripotency state

If increased expression of Klhl13 and Dusp9 in female compared to male cells is indeed what drives sex differences in mESCs, their deletion on one X chromosome in female ESCs should induce the male phenotype. We therefore generated both heterozygous (HET) and homozygous (HOM) mutant mESC lines for Klhl13 (K13) and Dusp9 (D9) and a heterozygous double-mutant line (D9K13). For Klhl13, a 5-kb region spanning the promoter was deleted using Cas9, whereas for Dusp9, where attempts to create a promoter deletion were unsuccessful, frameshift mutations were introduced through an sgRNA targeting the start of the coding sequence (CDS) (Fig. 4a; Additional file 1: Figure S4C-D). Two clones were analyzed for each genotype throughout all experiments except for differentiation dynamics. Loss of Klhl13 transcription in the respective mutants was confirmed by nascent RNA FISH (Additional file 1: Figure S4E) and all generated clones were karyotyped via double digest genotyping-by-sequencing (Additional file 1: Figure S4F) [57]. Dusp9 protein levels were reduced ~1.8-fold in the respective HET mutants, which is less than the 3.5-fold reduction observed when comparing XX to XO cells, suggesting that Dusp9 levels are modulated by other X-linked genes (Fig. 4b). In HET lines with a Klhl13 mutation, the Klhl13 protein was reduced ~2.7-fold (Fig. 4b). In all cell lines, we then analyzed MAPK signaling, pluripotency factor expression, differentiation, and global DNA methylation.

Fig. 4

Heterozygous mutations of Klhl13 and Dusp9 in female mESCs partially phenocopy the male pluripotency state. a Schematic depiction of the strategies used to generate Klhl13 (K13) and Dusp9 (D9) mutant cell lines. b–f Comparison of female 1.8XX mESCs with a heterozygous (HET) or homozygous (HOM) deletion of Dusp9 (yellow), Klhl13 (blue) or both (red) with the parental XX line and XO controls (2 clones per genotype). Individual measurements are shown as gray dots (clone 1) and triangles (clone 2), and the mean across two clones and three biological replicates is indicated by a thick bar. b Immunoblot quantification of Dusp9 (top) and Klhl13 (bottom) protein levels, normalized to Tubulin and to the mean of the XX controls. c Quantification of MAPK target genes by qPCR. d Boxplots showing expression of Mek (left), Gsk3 (middle), and Akt (right) target genes in cell lines with the indicated genotypes as assessed by RNA-seq. Boxes indicate the 25th to 75th percentiles, and the central line represents the median. e Quantification of pMek, normalized to total Mek and to the XX control cells by immunoblotting. f Pluripotency factor expression (Nanog and Prdm14) assessed by qPCR. g qPCR quantification of pluripotency factors during differentiation by 2i/LIF withdrawal in one clone for each genotype from the cell lines used in b–f. Mean and SD of three biological replicates is shown. *p < 0.05 Wilcoxon rank sum test (d), otherwise two-tailed paired Student’s t test comparing each mutant/XO cell line and XX wildtype controls

To assess whether MAPK pathway activity was affected in the mutant cell lines, we again quantified expression of the MAPK target genes Egr1 and Spry4 (Fig. 4c; Additional file 1: Figure S5A). Both were expressed at higher levels in all mutant lines compared to the XX control clones, suggesting that the MAPK pathway inhibition was at least partially lifted. Among the HET mutant lines, D9 showed the weakest effect, followed by K13 and D9K13, with the double mutant reaching similar expression levels as found in the XO control cells (3.4/6.5-fold for Spry4/Egr1 in D9K13-HET vs 2.7/10.2 in XO). To get a more global picture of signaling activity, we analyzed a larger set of MAPK target genes using RNA sequencing (Fig. 4d; Additional file 1: Figure S5B; Additional file 6: Table S5). In agreement with the qPCR results, we found MAPK target genes were significantly increased in K13-HET cells and further elevated in D9K13 double mutants. We also assessed signatures of two other signaling pathways, Akt and Gsk3, implicated in pluripotency and differentiation, for which differential activity has been found in male and female mESCs [11, 58, 59]. Again, the heterozygous D9K13 mutant cells showed the strongest effects on Akt and Gsk3 target genes (Fig. 4d, Additional file 1: Figure S5B). It is important to note, however, that for none of the pathways we analyzed, target gene expression reached the levels found in XO control clones, suggesting that additional genes, other than Dusp9 and Klhl13, are involved in their regulation.

We next investigated phosphorylation levels of Mek and observed a completely different pattern. Neither HOM nor HET mutations of Klhl13 had any effect on pMek, but levels were reduced in the Dusp9 mutants in a dose-dependent manner (Fig. 4e; Additional file 1: Figure S5C). The D9K13-HET double mutants resembled the D9-HET single mutants and exhibited a 2.6-fold pMek reduction compared to the wildtype XX control, thus approaching, but not reaching the 4.3-fold reduction observed in XO cells (Fig. 4e). Taken together, these results confirm the screening results that Dusp9 and Klhl13 both affect expression of MAPK target genes, but only Dusp9 has a detectable effect on Mek phosphorylation (Fig. 2g), which is in accordance with their over-expression phenotypes in male mESCs (Fig. 3). These findings are in agreement with the role of Dusp9 as an Erk phosphatase, which reduces Erk phosphorylation and consequently the Erk-mediated negative feedback upstream of Mek. Although Dusp9 acts directly on the MAPK pathway, its deletion affects MAPK target gene expression less than the deletion of Klhl13 (Fig. 4d), which is in contrast to results obtained from over-expression, where Dusp9 shows stronger MAPK activation than Klhl13 (Fig. 3c).

In the next step, we investigated pluripotency factor expression and differentiation kinetics. Nanog and Prdm14 expression were significantly reduced in K13-HET, but not in D9-HET lines (Fig. 4f; Additional file 1: Figure S5D). D9K13 double mutant cells expressed similar levels as the K13 single mutant. With a ~ 1.5-fold reduction, the two genes could account for ~ 50% of the 2–2.4-fold decrease in Nanog and Prdm14 levels observed in XO cells (Fig. 4f). For the assessment of differentiation dynamics, cells were first adapted to 2i conditions (with serum and LIF) for at least five passages. Since these conditions neutralize the expression differences of pluripotency factors between the cell lines in undifferentiated cells, they allow easier comparison of differentiation dynamics upon 2i/LIF withdrawal. Also here, Klhl13 had a stronger effect than Dusp9 (Fig. 4g). D9-HET mutants showed only a minimal reduction of Esrrb, Nanog, and Prdm14 levels during differentiation compared to wildtype cells, while all three marker genes were reduced more in K13-HET cells (Fig. 4g). In the double D9K13 mutant, the effects of the single mutants added up to nearly the levels observed in XO cells (Fig. 4g). We can conclude that Klhl13 has a stronger effect on pluripotency factor expression and differentiation than Dusp9 and that the double mutant can qualitatively, but not quantitatively reproduce the sex differences in mESC, suggesting that additional X-linked factors also contribute.

Finally, we also assessed global CpG methylation with the LUMA assay (Additional file 1: Figure S5E). In XX control cells, 31% of all CpG dinucleotides were methylated and levels were increased by ~ 10% in the single HET mutants and by ~ 15% in the HET double mutant and the HOM mutants. Given that D9K13-HET double mutants exhibited 44% methylation compared to 59% in XO control cells, Klhl13 and Dusp9 together could account for half of the differences seen in the XX/XO comparison.
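
The estimate that the two genes account for about half of the XX/XO difference follows from simple arithmetic on the methylation levels quoted above. The snippet below is only an illustrative sketch: the function name `fraction_explained` is hypothetical and not part of the original analysis, and the input values are the percentages reported in the text.

```python
def fraction_explained(xx: float, mutant: float, xo: float) -> float:
    """Share of the XX-to-XO difference that is reproduced by a mutant line."""
    if xo == xx:
        raise ValueError("XX and XO values are identical; no difference to explain")
    return (mutant - xx) / (xo - xx)


# Global CpG methylation (%) as quoted in the text for the LUMA assay
share = fraction_explained(xx=31.0, mutant=44.0, xo=59.0)
print(f"{share:.2f}")  # 13 / 28, i.e. roughly one half
```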

To get a more global picture of how well the mutant lines recapitulated the XO phenotype, we performed a transcriptome comparison. For each genotype, we identified autosomal genes that were differentially expressed when compared to the parental XX line. We found that 201 out of 956 differentially expressed genes (DEGs) on autosomes in XO cells were also differentially expressed in K13-HET cells, but only 148 in D9-HET lines (Fig. 5a). For the D9K13 double mutant, the overlap was 265 genes. A similar pattern was observed when performing principal component analysis (PCA). Here too, the double mutant was located closest to the XO controls, followed by the K13-HET and D9-HET single mutants (Fig. 5b). These findings suggest that Klhl13 contributes more to X-dosage-induced transcriptome changes than Dusp9 and that a combined effect of both explains the observed sex differences best, but not completely.

Fig. 5

Global transcriptome profiling of Klhl13 and Dusp9 heterozygous mutant lines. a Differentially expressed autosomal genes (DEGs) in XO (gray), D9-HET (yellow), K13-HET (blue), and D9K13-HET cells (red) compared to the parental XX line were identified by RNA-seq (log2(fold-change) > 0.5 or log2(fold-change) < − 0.5, p value < 0.05). The overlap between these gene sets is shown in Venn diagrams. b Principal component analysis (PCA) of the 100 most variable autosomal genes across XX (black), XO (gray), and heterozygous mutant cell lines (D9 yellow, K13 blue, and D9K13 red), averaged across three replicates
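
The DEG definition given in the legend (absolute log2 fold-change above 0.5 and p value below 0.05) and the overlap counts shown in the Venn diagrams can be sketched as follows. This is a minimal illustration with made-up gene names and values, not data from the study; the helper `call_degs` is hypothetical, and whether the p values were adjusted for multiple testing is not stated in the legend, so the sketch takes them as given.

```python
from typing import Dict, Set, Tuple

# gene -> (log2 fold-change versus XX, p value); toy values for illustration only
Result = Dict[str, Tuple[float, float]]


def call_degs(res: Result, lfc_cutoff: float = 0.5, p_cutoff: float = 0.05) -> Set[str]:
    """Return genes passing both the fold-change and the p value threshold."""
    return {g for g, (lfc, p) in res.items() if abs(lfc) > lfc_cutoff and p < p_cutoff}


xo = {"GeneA": (1.2, 0.001), "GeneB": (-0.8, 0.01), "GeneC": (0.3, 0.001), "GeneD": (0.9, 0.20)}
k13_het = {"GeneA": (0.7, 0.02), "GeneB": (-0.4, 0.01), "GeneC": (0.6, 0.01), "GeneD": (1.0, 0.03)}

xo_degs = call_degs(xo)        # GeneA and GeneB pass both thresholds
k13_degs = call_degs(k13_het)  # GeneA, GeneC and GeneD pass both thresholds
shared = xo_degs & k13_degs    # genes called in both comparisons
print(sorted(shared))
```

Note that a plain set intersection does not check whether a shared gene changes in the same direction in both genotypes; that would need an additional comparison of the signs.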

When comparing the results of the mutant cell lines (Fig. 4) with the over-expression experiments in male cells (Fig. 3), it becomes apparent that the relative importance of the two genes differed between the two approaches. Dusp9 had a much stronger effect than Klhl13 on MAPK target genes and pluripotency factors in the over-expression experiment, while in the mutants, both genes affected MAPK target genes, but only Klhl13 altered pluripotency factor expression. To distinguish whether this discrepancy was due to the direction of the perturbation or to the different perturbation strategies used, we implemented a third validation strategy, where Dusp9 and Klhl13 were downregulated through CRISPR interference (CRISPRi) in female mESCs expressing an ABA-inducible split KRAB-dCas9 system (Fig. 6a). For both genes, 3 different sgRNAs targeting the gene’s TSS were coexpressed from a single vector, resulting in a ~ 20-fold reduction of mRNA expression of each gene compared to non-targeting control sgRNAs (Fig. 6b, c). Out of 5 quantified MAPK target genes, the majority was increased upon Dusp9 and Klhl13 repression, with somewhat stronger effects for Dusp9 (Fig. 6d). The opposite pattern was observed when profiling 5 naive pluripotency factors, where cells that downregulated Klhl13 seemed to express consistently lower levels of these genes compared to cells with Dusp9 downregulation (Fig. 6e), thus confirming the important role of Klhl13 observed with the knockout approach (Fig. 4f).

Fig. 6

Knock-down of Dusp9 and Klhl13 in female mESCs leads to a shift towards the male pluripotency phenotype. a Catalytically dead Cas9 (blue) and the KRAB repressor domain (red) are each fused to one component of the PYL1-ABI system (gray), which dimerizes upon treatment with abscisic acid (ABA), resulting in gene repression. b CRISPR multiguide plasmid used for expression of three different sgRNAs targeting a specific gene. Each sgRNA is expressed under a different Pol III promoter, as indicated. c–e 1.8 female wildtype mESCs stably expressing the CRISPRi system shown in a were transduced with vectors expressing sgRNAs targeting Dusp9 (yellow), Klhl13 (blue), or a non-targeting control construct (NTC; black). Expression of each target gene (c), five MAPK target genes (d), and five pluripotency factors (e) was quantified by qPCR in cells expressing the respective sgRNAs or NTCs, as indicated. Bars represent the mean of 3 biological replicates, gray dots the individual measurements. Cells were treated with abscisic acid (ABA) for 5 days prior to cell harvesting for phenotypic assessment. *p < 0.05, two-tailed paired Student’s t test comparing gene-specific sgRNAs and NTCs

In conclusion, multiple genes underlie the female pluripotency phenotype of which we have identified and validated a novel key player, Klhl13. Dusp9 is responsible for the reduced levels of Mek phosphorylation in XX cells, but a combined effect of both genes together (partially) accounts for the global reduction of MAPK target genes in female ES cells. The pluripotency and differentiation phenotypes by contrast can primarily be attributed to reduced Klhl13 dosage in female cells. Since so far no mechanistic link between Klhl13 and pluripotency or differentiation had been reported, we set out to investigate putative Klhl13 interaction partners that might mediate the observed effects.

Identification of Klhl13 interaction partners

Klhl13 is a member of the Cullin3 E3 ubiquitin ligase complex, where it acts as a substrate adaptor mediating protein ubiquitinylation, which might target proteins for proteasomal degradation (Fig. 7a) [60, 61]. We reasoned that the Klhl13-mediated sex differences we have identified might be due to reduced protein abundance of Klhl13 substrate proteins in female compared to male cells, which affect pluripotency factors, differentiation, and MAPK target gene expression. To identify Klhl13 substrates in mESCs, we profiled Klhl13 interaction partners and then selected those with increased protein levels in K13-HOM mutant cells (Fig. 7b). To identify interaction partners, we ectopically expressed either full-length Klhl13 or the substrate-binding Kelch domain, tagged with a green fluorescent protein (GFP), and identified binding partners by immunoprecipitation-mass spectrometry (IP-MS) using a GFP-specific antibody (Fig. 7b–d; Additional file 1: Figure S6A; Additional file 7: Table S6). Since E3 ubiquitin ligases usually interact with their substrates only transiently because they are rapidly degraded, the cells were treated with a proteasomal inhibitor for their stabilization. We identified a total of 197 interaction partners for the GFP-Kelch domain and 218 for full-length Klhl13 that were enriched relative to the GFP-only controls (Fig. 7c, d; Additional file 7: Table S6). As expected, the interaction partners identified for full-length Klhl13 and for the Kelch domain showed a large overlap, with 110 proteins being identified in both pull-downs. Two known interaction partners (Nudcd3 and Hsp90aa1) were identified with both constructs and several members of the Cullin 3 complex (Cul3, Klhl22, Klhl21, Klhl9) were found to interact with full-length Klhl13 only as expected (Fig. 7c, d, triangles) [62]. 
To identify putative Klhl13 substrates among the 110 proteins found to interact with full-length Klhl13 and with the Kelch domain only, we quantified the total proteome of K13-HOM cells and the parental XX control line through MS with label-free quantification (Fig. 7e; Additional file 8: Table S7). Among the 299 proteins that were significantly upregulated in the mutant cells (p value < 0.05, Benjamini-Hochberg FDR), 5 proteins (Scml2, Peg10, Alg13, Larp1, Cct3) had been identified as putative substrates in our IP-MS experiment (Fig. 7f).
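
For readers unfamiliar with the Benjamini-Hochberg correction referred to above, the standard step-up procedure can be sketched as below. This is a generic illustration of the published procedure, not the authors' analysis code; the function name and the example p values are assumptions made for this sketch.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value and enforce monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p values for five hypothetical proteins
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
```

In this toy example only the first two entries remain below 0.05 after adjustment, although four of the five raw p values are below 0.05.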

Fig. 7

Identification of Klhl13 target proteins that mediate its effect on pluripotency and differentiation. a Schematic representation of the putative mechanism underlying Klhl13’s (blue) pluripotency-promoting effects, where differentiation-promoting substrate proteins (red) are targeted for proteasomal degradation through recruitment to the Cul3 E3 ubiquitin ligase complex via Klhl13’s Kelch domain. b Experimental strategy for the identification of Klhl13 target proteins: Putative substrates should be more abundant at the protein level in Klhl13-deficient cells and should interact with Klhl13 and with the Kelch domain only. To identify substrates, the proteomes in wildtype and K13-HOM cells were compared and Klhl13/Kelch interaction partners were identified through GFP-mediated IP-MS. All three datasets were integrated to identify candidate proteins. c, d Volcano plots of the IP-MS results for the GFP-Kelch (c) and GFP-Klhl13 (d) constructs. The mean fold-change across 3 biological replicates relative to the GFP-only control against the p value calculated via a two-sample Student’s t test with Benjamini-Hochberg correction for multiple testing is shown. Black lines indicate the significance threshold that was chosen such that FDR < 0.1, assuming that all depleted proteins (left-sided outliers) were false-positive. Triangles show known Klhl13 interaction partners extracted from the BioGRID. e Volcano plot showing the proteome comparison of 1.8XX wildtype cells and K13-HOM mESCs. The mean fold-change across 3 biological replicates is shown. Proteins that are upregulated upon Klhl13 depletion are highlighted in red (p < 0.05 of Student’s t test, Benjamini-Hochberg FDR). Circles in c–e depict putative Klhl13 substrate proteins that were found to interact with Klhl13 and the Kelch domain and were upregulated upon Klhl13 deletion. f Venn diagram summarizing results in c–e. 
g Model of how X-dosage modulates the MAPK signaling pathway, pluripotency factor expression, and differentiation

We next investigated whether any of the five identified putative Klhl13 target proteins would represent pro-differentiation factors by assessing MAPK target gene and pluripotency factor expression. To this end, we over-expressed them in female wildtype mESCs and tested whether their knock-down would rescue the phenotype of K13-HOM mutant cells. For the knock-down, we used the ABA-inducible split KRAB-dCas9 (CRISPRi) system described above (Fig. 6) and for gene over-expression an analogous system for gene activation, which recruits the VPR effector domain (CRISPRa) (Additional file 1: Figure S7A-B). Three different sgRNAs targeting the gene’s TSS were coexpressed from a single vector. Perturbation strength, as assessed by qPCR, was variable between genes, but reached at least 2-fold over-expression for all genes except Larp1 in the CRISPRa experiment and a more than 2-fold reduction for all except Cct3 upon CRISPRi (Additional file 1: Figure S7A-B).

We then assessed five MAPK targets (Spry4, Egr1, Etv4, Dnmt3b, and Grhl2) and five naive pluripotency markers (Nanog, Prdm14, Tfcp2l1, Tbx3, and Tcl1) by qPCR (Additional file 1: Figure S7C-D). For factors that mediate the Klhl13 phenotype, we would expect an increase of MAPK target genes and a decrease in pluripotency markers upon over-expression, and the opposite trends upon knock-down. Generally, we only observed weak and mostly inconsistent effects upon perturbation (Additional file 1: Figure S7C-D). However, a subset of factors exhibited some of the expected trends. Peg10 led to a small but significant increase in MAPK target genes and downregulation of naive pluripotency factors when over-expressed in female mESCs, a trend that was confirmed in an independent experiment with another sgRNA plasmid (Additional file 1: Figure S7E-G). This trend was, however, not observed in the CRISPRi experiment. Instead, knock-down of Larp1 seemed to partially rescue the reduced pluripotency factor expression in K13-HOM cells. The reason why no effect was observed for Larp1 in the CRISPRa experiment might be its inefficient over-expression in female wildtype mESCs (Additional file 1: Figure S7A).

In summary, we could not identify a single gene that might mediate the effects of Klhl13 on the female pluripotency phenotype through the chosen approach. Instead, the phenotype might be mediated by several factors, potentially including Peg10 and Larp1. Alternatively, ubiquitinylation of Klhl13 substrates might lead to consequences other than proteasomal degradation, in which case also Klhl13 interaction partners that were not upregulated in Klhl13 knockout cells might be involved in Klhl13 functions. We have thus narrowed down the list of candidate genes that warrant further investigation in the future.

Discussion

We present what, to our knowledge, is the first comprehensive functional identification of genes that drive phenotypic consequences of the loss or gain of an entire chromosome. We developed a hierarchical CRISPR screening approach, which allowed us to profile a large number of genes with respect to multiple phenotypes linked to sex differences in mESCs in an unbiased manner. In an initial X chromosome-wide screen, we identified a set of candidate genes, which were then further characterized for a role in modulating three additional molecular phenotypes. In this way, we identified several genes that potentially mediate X-chromosomal dosage effects and characterized the two strongest candidates in more detail, namely Dusp9 and Klhl13. Through CRISPR-mediated over-expression in male and knockout or knock-down in female cells, we show that these two genes contribute to partially overlapping, yet distinct aspects of the X-dosage induced phenotype and that they appear to act in concert with additional factors. The X-dosage-dependent effects in pluripotent cells can thus not be attributed to a single X-linked gene, but arise from a complex interplay of multiple factors.

Dusp9 is a phosphatase that dephosphorylates the MAPK pathway intermediate Erk and is thus a known negative regulator of the pathway [35]. In agreement with previous reports, we found that Dusp9 gain-of-function perturbations in male cells alter feedback strength and target gene expression [25, 63], while deletion of one copy of Dusp9 in female cells results in the opposite phenotype. In addition, we confirmed the previously reported alterations in global DNA methylation levels, but the magnitude of the effects was less pronounced in our study, possibly due to differences in the cell lines, culture conditions, and methylation assay used [25]. Dusp9 also strongly affected pluripotency factor expression and differentiation in the gain-of-function experiments, which is consistent with a previous report [25]. In the female heterozygous Dusp9 mutant cells, however, we observed only marginal effects, again in agreement with another study [12]; these effects were considerably stronger when both copies of the gene were mutated or knocked down.

For Dusp9, we thus observed a strong phenotype in the gain-of-function experiment in male cells and the opposite, albeit much weaker phenotype upon loss-of-function in female cells. Intriguingly, we found the opposite pattern for Klhl13, the second factor we investigated in detail. Here, the gain-of-function perturbation had only small effects, while loss-of-function led to an increase in MAPK target gene expression and a decrease for pluripotency factors, which was even more pronounced than the effects observed for Dusp9. This asymmetry between the gain- and loss-of-function perturbations remains puzzling and might point towards more complex interactions between multiple X-chromosomal factors.

Klhl13 is a substrate adaptor protein of the Cul3 E3 ubiquitin ligase complex [45] and has, to our knowledge, not yet been implicated in pluripotency, signaling, or X-dosage effects. Instead, it has been reported to be involved in mitotic progression through targeting Aurora kinase B in Hela cells [45]. We could however generate mutant ES cells with a normal karyotype without difficulty, suggesting a different function for Klhl13 in ES cells. While Klhl13 did not affect phosphorylation of the MAPK pathway intermediate Mek, we found that knockout of only one copy of the gene resulted in a substantial increase in MAPK target gene expression, a reduction in pluripotency factors, and more efficient differentiation.

We hypothesized that a protein, which is targeted for proteasomal degradation through Klhl13-dependent ubiquitinylation might mediate the Klhl13 phenotype. We therefore identified Klhl13-interacting proteins that were upregulated upon Klhl13 deletion. While none of the five identified proteins could fully recapitulate the Klhl13 phenotype, two of them, Peg10 and Larp1, might contribute. Peg10 is a known oncogene and has been shown to interact with Nanog and Oct4 in human cancer cells [64, 1: Figure S4B. For the amplification of the wildtype band, the HotStart Taq Polymerase (Qiagen) was used with an annealing temperature of 51 °C and 35 cycles. For the deletion, the Phusion HiFi Polymerase (NEB) was used with an annealing temperature of 63 °C and 35 cycles (Primer sequences are listed in Additional file 9: Table S8).

For the generation of Dusp9 mutant mESCs, 2 × 10⁶ WT and K13-HET (Clone 1) cells were nucleofected with 5 μg of the PX458-Dusp9_sgRNA1 plasmid (Additional file 9: Table S8) and subsequently plated on gelatin-coated 6-cm plates. The next day, high GFP+ cells were single-cell sorted into a 96-well plate and expanded. Clones were screened for homozygous or heterozygous frameshift deletions via Sanger sequencing and immunoblotting. Heterozygous deletion of several selected clones was further confirmed via NGS. Briefly, a region surrounding the Dusp9 deletion was amplified using the Phusion HiFi Polymerase (NEB) with a total of 30 cycles and an annealing temperature of 65 °C (Primer sequences in Additional file 9: Table S8, OG197/OG198). A second PCR, again using the Phusion HiFi Polymerase (NEB) with a total of 14 cycles and an annealing temperature of 65 °C, was performed in order to attach the Illumina adaptors and barcodes (Additional file 9: Table S8, OG202/OG210). A dual barcoding strategy was employed, where Illumina barcodes were included in the reverse and custom sample barcodes in the forward primers. Samples containing the same Illumina barcode but different custom sample barcodes were pooled in an equimolar fashion and sequenced on the Illumina MiSeq platform PE150. Samples were aligned using Bowtie2 [95] and an index containing sample barcodes and possible deletion sequences based on previously generated Sanger sequencing data, yielding approximately 4000 reads per sample.

Generation of cell lines expressing the KRAB/VPR-dCas9 systems using piggybac transposition

The 1.8 XX SP107 (Clone A2) and 1.8 XX K13-HOM SP107 mESC lines stably express PYL1-KRAB-IRES-Blast and ABI-tagBFP-SpdCas9, constituting a two-component CRISPRi system, where dCas9 and the KRAB repressor domain are fused to ABI and PYL1 proteins, respectively, which dimerize upon treatment with abscisic acid (ABA) [57]. Briefly, the forward and reverse strands of a barcode adapter and common adapter were diluted and annealed, after which they were pipetted into each well of a 96-well PCR plate together with 1 μg of each sample and dried overnight (Oligo sequences are listed in Additional file 9: Table S8). The following day, the samples were digested with 20 μl of a NlaIII and PstI enzyme mix (NEB) in NEB Cutsmart Buffer at 37 °C for 2 h. After the digest, a 30 μl mix with 1.6 μl of T4 DNA ligase (NEB) was added to each well and placed on a thermocycler (16 °C 60 min followed by 80 °C 30 min for enzyme inactivation). By doing this, barcode and common adapters with ends complementary to those generated by the two restriction enzymes were ligated to the genomic DNA. Samples were cleaned with CleanNGS beads (CleanNA) using 90 μl of beads for each well and following the manufacturer’s instructions. Samples were eluted in 25 μl ddH2O and DNA was quantified using a dsDNA HS Qubit assay (Thermo Fisher). Samples were pooled in an equimolar fashion, size-selected (300-450 bp) by loading 400 ng of each pooled sample on an agarose gel followed by a cleaning step using the Nucleospin Gel and PCR clean-up kit (Macherey-Nagel). Samples were PCR amplified using the Phusion High-Fidelity DNA Polymerase (NEB) and an annealing temperature of 68 °C over 15 amplification cycles (OG218/OG219). Resulting amplicons were cleaned with CleanNGS beads in a 1:1.2 ratio (sample:beads) and sequenced with 2 × 75 bp on the MiSeq platform (12 pM loading concentration), yielding from 0.2 × 10⁶ to 1 × 10⁶ fragments per sample.

Data processing and statistical analysis were performed on the public Galaxy server usegalaxy.eu. Briefly, fastq files were uploaded and demultiplexed using the “Je-demultiplex” tool [106]. Reads were mapped to the mm10 mouse reference genome (GRCm38) using “Map with BWA” [107, 108]. Read counts for each chromosome were calculated with “multiBamSummary” [109] and normalized to a previously karyotyped XX control cell line (using DAPI-stained metaphase spreads and chromosome painting).
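
The normalization of per-chromosome read counts to a karyotyped control can be illustrated with the sketch below. It is not the Galaxy workflow itself. It assumes, for illustration only, that counts are first scaled by the total autosomal reads of each sample and that the control carries two copies of every chromosome; the function names and read counts are hypothetical.

```python
from typing import Dict

SEX_CHROMS = {"chrX", "chrY"}


def autosomal_total(counts: Dict[str, int]) -> int:
    """Sum of reads on all chromosomes except X and Y (used as library size)."""
    return sum(n for chrom, n in counts.items() if chrom not in SEX_CHROMS)


def copy_estimates(sample: Dict[str, int], control: Dict[str, int],
                   control_copies: float = 2.0) -> Dict[str, float]:
    """Estimate chromosome copy numbers relative to a control of known karyotype."""
    s_total, c_total = autosomal_total(sample), autosomal_total(control)
    estimates = {}
    for chrom, c_count in control.items():
        sample_frac = sample.get(chrom, 0) / s_total
        control_frac = c_count / c_total
        estimates[chrom] = control_copies * sample_frac / control_frac
    return estimates


# Toy read counts: an XX control and a sample sequenced at half the depth
xx_control = {"chr1": 1000, "chr2": 900, "chrX": 800}
sample = {"chr1": 500, "chr2": 450, "chrX": 200}
est = copy_estimates(sample, xx_control)
print({chrom: round(v, 2) for chrom, v in est.items()})
```

With these toy counts the autosomes come out at two copies and the X chromosome at one copy, the pattern expected for an XO clone.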

RNA-seq

For the RNA sequencing of 1.8 XX and 1.8 XO cell lines, libraries were generated using the TruSeq Stranded Total RNA library preparation kit (Illumina) with 1 μg starting material and amplified with 15 cycles of PCR. Libraries were sequenced 2 × 50 bp on one HiSeq 2500 lane (22 pM loading concentration), which generated ~ 40 million fragments per sample. The reads were mapped with the STAR aligner, allowing for a maximum of 2 mismatches, to the mm10 mouse reference genome (GRCm38) and quantified using the ENSEMBL gene annotation [110], resulting in ~ 80% uniquely mapped reads. Gene expression values (RPKM) were obtained using the edgeR package in RStudio [111]. RPKM values are provided in Additional file 5: Table S4.
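
As a reminder of what these expression units mean, the textbook definitions of RPKM and of counts per million (CPM, the unit used for the mutant cell lines) are sketched below. This is not a reimplementation of edgeR, which can additionally apply normalization factors to the library size; the function names and the numbers in the example are made up for illustration.

```python
def rpkm(count: float, gene_length_bp: float, library_size: float) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (gene_length_bp * library_size)


def cpm(count: float, library_size: float) -> float:
    """Counts per million mapped reads (no length correction)."""
    return count * 1e6 / library_size


# Toy example: 400 reads on a 2-kb gene in a library of 40 million reads
example_rpkm = rpkm(count=400, gene_length_bp=2000, library_size=40_000_000)
example_cpm = cpm(count=400, library_size=40_000_000)
print(example_rpkm, example_cpm)
```

A length-independent unit such as CPM is the natural choice for 3′ mRNA-seq libraries, where reads derive from the transcript end rather than from the full transcript length.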

For RNA sequencing of the mutant cell lines, the QuantSeq 3′ mRNA-Seq Library Prep Kit (FWD) for Illumina (Lexogen) was used with 800 ng starting material. Samples were sequenced with 1 × 75 bp on the NextSeq 500 Platform (2 pM loading concentration), which generated 1.5–3 million reads per sample. Read mapping, UMI collapsing, and read counting were performed with the FWD-UMI Mouse (GRCm38) Lexogen QuantSeq 2.6.1 pipeline from the BlueBee NGS data analysis platform (https://www.bluebee.com/). 62–68% of reads were mapped uniquely. Differential expression analysis was carried out, excluding X-chromosomal genes, using the edgeR package in RStudio, together with normalization of gene expression values (CPM) [111]. CPM values are provided in Additional file 6: Table S5.

Single-cell RNA-seq data analysis

For reanalysis of previously published scRNA-seq data, the normalized counts and the cell type annotation were downloaded from https://github.com/rargelaguet/scnmt_gastrulation. Sex annotation was provided by Ricard Argelaguet. For comparison of individual genes between male and female cells, a Wilcoxon rank sum test was performed using the wilcox.test function in R. For comparing chromosome-wide expression, counts for all genes located on a specific chromosome were summed up for each cell and then compared with a Wilcoxon rank sum test as described above. For the analysis of gene groups (naive and primed pluripotency markers), the log2-transformed counts for all genes in the group were averaged for each cell and then analyzed as above.
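
The chromosome-wide comparison described above can be sketched in two steps: summing the counts of all genes on a chromosome for each cell, and comparing the resulting per-cell values between sexes. The sketch below computes only the rank sum statistic W as it is defined for R's wilcox.test (the number of female/male pairs in which the female value is larger, with ties counted as 0.5) and leaves the p value to the R function. All names and numbers are made up for illustration.

```python
from typing import Dict, Sequence


def chromosome_sum(cell_counts: Dict[str, float], gene_chrom: Dict[str, str],
                   chrom: str) -> float:
    """Sum the counts of all genes annotated to one chromosome in a single cell."""
    return sum(v for gene, v in cell_counts.items() if gene_chrom.get(gene) == chrom)


def rank_sum_w(x: Sequence[float], y: Sequence[float]) -> float:
    """W statistic: number of (x, y) pairs with x > y, ties counting 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Toy annotation and one toy cell (values are not from the published data set)
gene_chrom = {"Dusp9": "chrX", "Klhl13": "chrX", "Nanog": "chr6"}
cell = {"Dusp9": 4.0, "Klhl13": 6.0, "Nanog": 5.0}
x_sum = chromosome_sum(cell, gene_chrom, "chrX")

# Toy per-cell X-chromosomal sums for three female and three male cells
female_x = [10.0, 9.0, 12.0]
male_x = [5.0, 9.0, 6.0]
w = rank_sum_w(female_x, male_x)
print(x_sum, w)
```

With three cells per group, W ranges from 0 to 9, so a value of 8.5 indicates that the female values are larger in nearly all pairwise comparisons.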

Mass spectrometry

GFP immunoprecipitation

The GFP immunoprecipitation protocol was performed as described previously with minor modifications [112]. Briefly, cells were treated with 15 μM of MG132 for 3 h prior to harvesting. Cells were pelleted and resuspended in 1 ml of lysis buffer containing 150 mM NaCl, 50 mM Tris, pH 7.5, 5% glycerol, 1% IGEPAL-CA-630, 1 mM MgCl2, 200 U benzonase (Merck), and EDTA-free complete protease inhibitor cocktail (Roche). Cells were incubated on ice for 30 min to allow cell lysis. Lysates were centrifuged at 4000×g and 4 °C for 15 min, and the supernatant was incubated with 50 μl magnetic beads coupled to monoclonal mouse anti-GFP antibody (Miltenyi Biotec) for 20 min on ice. Magnetic columns were equilibrated by washing first with 250 μl of 100% EtOH followed by two washes with the same volume of lysis buffer. After the 20 min incubation, the lysates were applied to the column followed by three washes with 800 μl of ice-cold wash buffer I (150 mM NaCl, 50 mM Tris, pH 7.5, 5% glycerol, and 0.05% IGEPAL-CA-630) and two washes with 500 μl of wash buffer II (150 mM NaCl, 50 mM Tris, pH 7.5, and 5% glycerol). Column-bound proteins were subsequently pre-digested with 25 μl 2 M urea in 50 mM Tris, pH 7.5, 1 mM DTT, and 150 ng trypsin (Roche) for 30 min at room temperature. Proteins were eluted by adding two times 50 μl elution buffer (2 M urea in 50 mM Tris, pH 7.5, and 5 mM chloroacetamide). Proteins were further digested overnight at room temperature. The tryptic digest was stopped by adding formic acid to a final concentration of 2%.

Sample preparation for proteomics with label-free quantification (LFQ)

Proteomics sample preparation was done according to a published protocol with minor modifications [113]. Approximately 2 × 10⁷ cells were lysed under denaturing conditions in a buffer containing 3 M guanidinium chloride (GdmCl), 5 mM tris(2-carboxyethyl)phosphine, 20 mM chloroacetamide, and 50 mM Tris-HCl pH 8.5. Lysates were denatured at 95 °C for 10 min shaking at 1000 rpm in a thermal shaker and sonicated in a water bath for 10 min. A small aliquot of cell lysate was used for the bicinchoninic acid (BCA) assay to quantify the protein concentration. In total, 50 μg protein of each lysate was diluted with a dilution buffer containing 10% acetonitrile and 25 mM Tris-HCl, pH 8.0, to reach a 1 M GdmCl concentration. Then, proteins were digested with LysC (Roche, Basel, Switzerland; enzyme to protein ratio 1:50, MS-grade) shaking at 700 rpm at 37 °C for 2 h. The digestion mixture was diluted again with the same dilution buffer to reach 0.5 M GdmCl, followed by a tryptic digestion (Roche, enzyme to protein ratio 1:50, MS-grade) and incubation at 37 °C overnight in a thermal shaker at 700 rpm.

LC-MS/MS instrument settings for shotgun proteome profiling

Peptide desalting was performed according to the manufacturer’s instructions (Pierce C18 Tips, Thermo Scientific, Waltham, MA). Desalted peptides were reconstituted in 0.1% formic acid in water and further separated into four fractions by strong cation exchange chromatography (SCX, 3M Purification, Meriden, CT). Eluates were first dried in a SpeedVac, then dissolved in 5% acetonitrile and 2% formic acid in water, briefly vortexed, and sonicated in a water bath for 30 s prior to injection into the nano-LC-MS system. LC-MS/MS was carried out by nanoflow reverse phase liquid chromatography (Dionex Ultimate 3000, Thermo Scientific) coupled online to a Q-Exactive HF Orbitrap mass spectrometer (Thermo Scientific), as reported previously [114]. Briefly, the LC separation was performed using a PicoFrit analytical column (75 μm ID × 50 cm long, 15 μm Tip ID; New Objective, Woburn, MA) packed in-house with 3-μm C18 resin (Reprosil-AQ Pur, Dr. Maisch, Ammerbuch, Germany). Peptides were eluted using a gradient from 3.8 to 38% solvent B in solvent A over 120 min at a flow rate of 266 nL per minute. Solvent A was 0.1% formic acid and solvent B was 79.9% acetonitrile, 20% H2O, 0.1% formic acid. For the IP samples, a 1 h gradient was used. Nanoelectrospray was generated by applying 3.5 kV. A cycle of one full Fourier transformation scan mass spectrum (300–1750 m/z, resolution of 60,000 at m/z 200, automatic gain control (AGC) target 1 × 10^6) was followed by 12 data-dependent MS/MS scans (resolution of 30,000, AGC target 5 × 10^5) with a normalized collision energy of 25 eV. To avoid repeated sequencing of the same peptides, a dynamic exclusion window of 30 s was used. In addition, only peptides with charge states from two to eight were sequenced.

Data analysis

Raw MS data were processed with MaxQuant software (v1.6.0.1) and searched against the mouse proteome database UniProtKB with 22,286 entries, released in December 2018. The MaxQuant database search parameters were a false discovery rate (FDR) of 0.01 for proteins and peptides, a minimum peptide length of seven amino acids, a first-search mass tolerance for peptides of 20 ppm, a main-search tolerance of 4.5 ppm, and the “match between runs” function enabled. A maximum of two missed cleavages was allowed for the tryptic digest. Cysteine carbamidomethylation was set as a fixed modification, while N-terminal acetylation and methionine oxidation were set as variable modifications. Contaminants, proteins identified only by a modification site, and proteins derived from the reversed part of the decoy database were excluded from further analysis.
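
To give a sense of scale for the ppm tolerances, the sketch below converts a relative tolerance into an absolute mass window. It illustrates the unit conversion only and is not a description of MaxQuant's internal matching; the peptide mass of 1000 Da is a hypothetical example value and the function name is illustrative.

```python
def ppm_window(mass_da, tolerance_ppm):
    """Return the (lower, upper) mass bounds for a tolerance given in ppm."""
    delta = mass_da * tolerance_ppm / 1e6
    return mass_da - delta, mass_da + delta


example_mass = 1000.0  # hypothetical peptide mass in Da (assumption)

first_search = ppm_window(example_mass, 20.0)  # first-search tolerance
main_search = ppm_window(example_mass, 4.5)    # main-search tolerance

for label, (low, high) in [("first", first_search), ("main", main_search)]:
    half_width = (high - low) / 2
    print(f"{label} search: +/- {half_width:.4f} Da around {example_mass} Da")
```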

Comparison of protein abundance for both the proteomics (K13-HOM vs XX wildtype) and the IP datasets (GFP-Kelch vs GFP, D-GFP-Klhl13 vs D-GFP) was performed with Perseus (v1.6.1.3). LFQ intensities, originating from at least two different peptides per protein group, were log2-transformed. Only protein groups with valid values in at least one experimental group were retained, and missing values were replaced by values drawn from a normal distribution. Statistical analysis for differential expression was done by a two-sample t-test with Benjamini-Hochberg (BH, FDR of 0.05) correction for multiple testing. The processed output files can be found in Additional file 7: Table S6 (IP-MS) and Additional file 8: Table S7 (Proteome).
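
The multiple-testing correction can be illustrated with a short sketch of the Benjamini-Hochberg adjustment in plain Python. This is not the Perseus implementation; the p-values are made up for illustration and the function name is an assumption. Each sorted p-value is scaled by m/rank, and a running minimum taken from the largest p-value downward keeps the adjusted values monotone.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up t-test p-values for five protein groups (illustration only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.27]
adjusted = benjamini_hochberg(p_values)

fdr_threshold = 0.05  # FDR level stated in the text
significant = [q <= fdr_threshold for q in adjusted]

for p, q, s in zip(p_values, adjusted, significant):
    print(f"p = {p:.3f}  adjusted = {q:.5f}  significant = {s}")
```

In this toy example the third and fourth p-values fall below 0.05 before correction but not after it.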

For the identification of Klhl13 interaction partners, cut-offs were derived from the data displayed in the volcano plots using a previously published method [115]. Briefly, the cut-off was defined as a smooth curve combining the following parameters:

$$ -{\log}_{10}(p)\ge \frac{c}{\mid x\mid -{x}_{\mathrm{o}}} $$

x: enrichment factor of a protein

p: p value of the t-test, calculated from replicates

xo: fixed minimum enrichment

c: curvature parameter

We optimized the parameters c and xo so as to obtain a 10% FDR (left-sided outliers) while maximizing the number of right-sided outliers. In the case of the GFP-Kelch IP, c = 0.32 and xo = 0.02. For the D-GFP-Klhl13 IP, c = 0.28 and xo = 0.04. Proteins without an associated gene name were excluded from further analyses. Known Klhl13 interaction partners were extracted from the BioGRID database (Arih1, Aurkb, C1qbp, Cd2ap, Cops2, Cops4, Cops5, Cops6, Cops7a, Cul3, Dcun1d1, Hsp90aa1, Kiaa1429, Klhl21, Klhl22, Klhl9, Mad2l1, Nhlrc2, Nudcd3, Tfg, Ube2m, Ubxn7, Usp11, Zmym4).
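
The cut-off rule can be sketched as a small classifier. This is a minimal illustration of the inequality given above, not the published implementation [115]; the function names and the example (x, p) pairs are hypothetical, and only the values of c and xo are taken from the text (GFP-Kelch IP).

```python
import math


def passes_cutoff(x, p, c, x0):
    """True if the point lies on or above -log10(p) = c / (|x| - x0)."""
    if p <= 0:
        raise ValueError("p must be positive")
    excess = abs(x) - x0
    if excess <= 0:
        # Enrichment does not exceed the fixed minimum x0.
        return False
    return -math.log10(p) >= c / excess


def classify(x, p, c, x0):
    """Label a protein as a right-sided or left-sided outlier, or background."""
    if not passes_cutoff(x, p, c, x0):
        return "background"
    return "right" if x > 0 else "left"


# Parameters reported for the GFP-Kelch IP.
C, X0 = 0.32, 0.02

# Hypothetical (enrichment, p value) pairs for illustration only.
examples = [(2.0, 0.001), (-0.5, 0.01), (0.1, 0.5), (0.01, 0.0001)]
labels = [classify(x, p, C, X0) for x, p in examples]

for (x, p), label in zip(examples, labels):
    print(f"x = {x:+.2f}, p = {p:g}: {label}")
```

Counting the "left" and "right" labels over a whole dataset for different values of c and xo is one way the parameter optimization described above could be carried out.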