Abstract
Background
The packaging of DNA into chromatin regulates transcription from initiation through 3' end processing. One aspect of transcription in which chromatin plays a poorly understood role is the co-transcriptional splicing of pre-mRNA.
Results
Here we provide evidence that H2B monoubiquitylation (H2BK123ub1) marks introns in Saccharomyces cerevisiae. A genome-wide map of H2BK123ub1 in this organism reveals that this modification is enriched in coding regions and that its levels peak at the transcribed regions of two characteristic subgroups of genes. First, long genes are more likely to have higher levels of H2BK123ub1, correlating with the postulated role of this modification in preventing cryptic transcription initiation in ORFs. Second, genes that are highly transcribed also have high levels of H2BK123ub1, including the ribosomal protein genes, which comprise the majority of intron-containing genes in yeast. H2BK123ub1 is also a feature of introns in the yeast genome, and the disruption of this modification alters the intragenic distribution of H3 trimethylation on lysine 36 (H3K36me3), which functionally correlates with alternative RNA splicing in humans. In addition, the deletion of genes encoding the U2 snRNP subunits, Lea1 or Msl1, in combination with an htb-K123R mutation, leads to synthetic lethality.
Conclusion
These data suggest that H2BK123ub1 facilitates cross talk between chromatin and pre-mRNA splicing by modulating the distribution of intronic and exonic histone modifications.
Background
Genome-wide histone modification maps have now been generated for a number of eukaryotic organisms. These maps have revealed the preferential localization of specific marks to active or silent chromatin and the association of marks of active transcription with different regions of genes. For example, H3 trimethylation on lysine 4 (H3K4me3) is enriched at the 5' ends of actively transcribed genes while H3 trimethylation on lysine 36 (H3K36me3) is localized towards the 3' ends of coding regions. These localization patterns are related to the roles that the marks play in transcription: H3K4me3 regulates the efficiency of transcription initiation and early steps in transcription elongation, and H3K36me3 prevents the utilization of cryptic initiation sites in coding regions and controls aspects of transcription termination and processing [1–4]. Most eukaryotic genes are modular, containing multiple exons interrupted by introns. Genome-wide histone modification maps from C. elegans and human revealed that intron-exon chromatin is also preferentially marked, with transcriptionally active modifications generally excluded from introns and concentrated in exons [5–11]. These studies concluded that this pattern was primarily the consequence of different levels of nucleosome occupancy in these regions because nucleosomes were depleted in introns relative to exons. However, a recent analysis of published human epigenomic data found that 10 histone modifications were enriched in the 5' introns of human genes independently of the level of nucleosome occupancy [12]. It was suggested that the presence of these marks reflects an aspect of the splicing process such as exon definition and could play a direct role in regulating splicing. Thus, the location of intragenic histone modifications and the functional roles associated with different localization patterns remain areas of intense investigation.
One important intragenic histone modification is the monoubiquitylation of H2B (H2BK123ub1). H2B is ubiquitylated co-transcriptionally and in turn regulates the presence of other active chromatin marks during the transcription process, including H3K4, H3K36, and H3K79 methylation [13–19]. The presence of H2BK123ub1 in chromatin has been associated with both nucleosome stabilization and destabilization. H2BK123ub1 and the histone chaperone Spt16 have been shown to function interdependently during transcription elongation to regulate nucleosome reassembly and preserve chromatin integrity [20–22]. Biochemical evidence and genomic nucleosome occupancy data also indicate that the presence of H2BK123ub1 generally promotes nucleosome stability [23, 24]. However, it was recently shown that synthetic nucleosome arrays containing H2BK123ub1 are less compact and exhibit an increase in inter-nucleosomal distance compared to arrays containing unmodified H2B [25]. In addition, two recent reports described a putative role for this modification in mediating chromatin decondensation at DNA damage sites [26, 27]. Thus, H2BK123ub1 may differentially affect chromatin structure in a context-dependent manner.
In this report, we generated a genome-wide map of H2BK123ub1 occupancy in budding yeast to determine if the distribution of this modification could be related to additional biological processes. We found that H2BK123ub1 was enriched across gene coding regions and marked both introns and exons of ribosomal protein (RP) genes, and that the level of this mark was further increased at 3' intron-exon boundaries. The presence of H2BK123ub1 in introns of RP genes was separable from nucleosome occupancy, which was generally lower in introns compared to exons. In addition, we noted that disruption of H2B ubiquitylation tended to alter the distribution of H3K36 trimethylation in intragenic regions. H3K36me3 has been functionally linked to pre-mRNA splicing in worms and humans [6, 7, 28]. Furthermore, when an htb-K123R mutation was combined with deletions of LEA1 and MSL1, whose products facilitate U2 snRNA association with pre-mRNA [29, 30], we found a synthetic lethal phenotype. These data suggest that by modulating the distribution of intronic and exonic histone modifications, H2BK123ub1 facilitates cross talk between chromatin and pre-mRNA splicing.
Results
H2B ubiquitylation is enriched in transcribed regions
Previous gene-specific studies in yeast showed that H2BK123ub1 is present at genomic regions that are actively transcribed and absent from transcriptionally silent chromatin [16,
To further address whether H2B ubiquitylation was responsible for the observed bre1Δ interactions with mutations in genes encoding RNA processing factors, we combined an htb-K123R mutation, which abolished H2B ubiquitylation, with the following mutations: a deletion of MUD2, which encodes a component of the pre-mRNA-U1 snRNP [48–50]; a deletion of SAC3 or EDC2, two genes encoding factors with roles in mRNA export and mRNA decapping [51, 52]; and a deletion of LSM1, which encodes a protein involved in degradation of cytoplasmic mRNAs [48–50]. We found that the absence of H2BK123ub1 had no effect on the growth of mud2Δ, edc2Δ, sac3Δ and lsm1Δ mutants (Additional file 1, Figure S5). However, when htb-K123R was combined with deletions of LEA1 and MSL1, whose products facilitate U2 snRNA association with pre-mRNA [29, 30], we found a synthetic lethal phenotype (Figure 4B). A likely interpretation of these genetic interactions is that Bre1-mediated H2B ubiquitylation is functionally linked to U2 snRNP assembly.
Discussion
The positioning of histone modifications at genes has been associated with numerous co-transcriptional processes ranging from initiation to elongation to 3' end processing. In this report, we investigated the genome-wide distribution of H2B monoubiquitylation in budding yeast. We found that H2BK123ub1, a mark of active transcription, is predominantly localized across gene coding regions, consistent with its postulated roles in transcription elongation [20, 22, 53]. The mark was also proportional to transcription rate, a likely consequence of the association of the H2B ubiquitylation machinery with elongating RNA polymerase II [57–59]. Thus, the presence of H2B ubiquitylation in the 5' introns of human genes may reflect a similarity in the chromatin architecture of promoter-proximal introns between yeast and human intron-containing genes. We also found that H2BK123ub1 peaked at the 3' intron-exon boundary, particularly in the RP genes. We speculate that the chromatin structure of 3' intron-exon boundaries in these genes could carry a signal that enhances the accessibility of the enzymatic machinery that mediates H2B ubiquitylation (Rad6-Bre1) [27, 28]. Alternatively, enzymes that target H2B for de-ubiquitylation (Ubp8 and Ubp10) [15, 29] could be preferentially prevented from associating with these regions. It is currently unclear what feature of chromatin architecture at these boundaries regulates either the deposition or removal of H2B ubiquitylation. Likewise, it is not known if this distinct chromatin structure plays a role in transcriptional regulation, including pre-mRNA splicing.
H2B ubiquitylation controls the methylation of three lysine residues in histone H3 in trans. A comparison of the genome-wide distributions of H2BK123ub1, H3K4me3, H3K79me2/me3, and H3K36me3 showed that the four modifications occupy distinct regions in genes. As previously reported, H2BK123ub1 and H3K79me3 co-localize across coding regions, while H2BK123ub1 and H3K79me2 show an anti-correlation at intergenic regions [8]. H3K4me3 is localized predominantly at 5' gene regions, while H3K36me3, like H2BK123ub1, spreads across coding regions, but with a more pronounced enrichment at the 3' end of genes. How H2BK123ub1 controls these particular distribution patterns remains an area of intense investigation [16, 18, 24, 60–63]. H3K4me3 and H3K79me3, like H2BK123ub1, were also present in the introns of five ribosomal protein genes that were analyzed, and all of these marks were separable from nucleosome occupancy. Together, the results are similar to the reported presence of H2B ubiquitylation and H3K79me3 in the 5' introns of human genes [12].
Unlike the situation in humans and worms [6, 7, 12], H3K36me3 is present in both the introns and exons of yeast genes. The restricted presence of H3K36me3 in exons in higher eukaryotic genomes has been correlated with the regulation of alternative mRNA splicing [6, 7, 28]. These observations suggest two possible roles for H3K36me3 in the regulation of pre-mRNA splicing. First, H3K36me3 could mark exons as a part of gene structure and, along with cis-splicing elements, facilitate the decision of whether to include a specific exon [6, 12]. Alternatively, H3K36me3 could act as an anchor site for recruiting splicing factors that regulate alternative mRNA splicing [7, 28]. However, in the yeast genome, the majority of intron-containing genes contain a single intron, and the regulation of splicing efficiency is thus more important than alternative mRNA splicing. It has been suggested that splicing efficiency is a function of the rate of RNA polymerase II (RNAP II) elongation. This scenario is supported by a recent finding that RNAP II pauses transiently around the 3' end of introns and that this pause coincides with splicing factor recruitment [64]. Thus, the presence of histone marks in both introns and exons might promote splicing efficiency by controlling RNAP II elongation. The finding that H2BK123ub1 levels are enhanced at 3' intron-exon boundaries could provide a mechanism to couple RNAP II elongation to the recruitment of splicing factors. Moreover, the dynamic relationship between the levels of H2BK123ub1 and H3K36me3 in introns and exons supports a redundant mechanism to ensure optimal RNAP II elongation, in turn promoting efficient pre-mRNA splicing.
Because the loss of H2B ubiquitylation, H3K4/K79 methylation, or H3K36 methylation does not compromise cell viability, the histone modifications cannot play an essential role in splicing. We suggest that the presence of these marks in introns, together with reduced nucleosome occupancy in these regions, is part of a chromatin architecture that facilitates the recognition of exons and introns by splicing regulators. Further support for this mechanism comes from the observation that a synthetic lethal phenotype resulted from combining an htb-K123R mutation with deletions of genes with roles in pre-mRNA splicing, specifically in U2 spliceosome assembly. For example, the histone modifications might serve as binding sites for proteins that in turn interact with splicing factors. Such a scenario has been proposed for H3K4me3 and the Chd1 protein, which contains a chromodomain that recognizes the methyl mark and interacts with the U2 snRNP complex in both humans and yeast to promote efficient splicing [65–67].
Conclusion
In summary, the co-transcriptional formation of H2BK123ub1 leads to its spread across gene coding regions. This mark in turn defines the distribution of H3K4 tri-methylation at the 5' end of coding regions and H3K79 tri-methylation toward the center of coding regions during the processes of transcription initiation and elongation. H2BK123ub1 also marks introns, and its presence in these regions leads to the presence of H3K4me3 and H3K79me3 at the introns of several ribosomal protein genes examined. The presence of H2BK123ub1 also influences the distribution of H3K36me3 across coding regions, particularly in introns. As a consequence, we suggest that this coordinated pattern of histone marks along transcribed regions facilitates the recognition of exons and introns and allows for efficient co-transcriptional pre-mRNA processing.
Methods
Yeast strains and growth conditions
Strains used in this study are listed in Additional file 1, Table S3. Yeast cultures were grown to mid-log phase in YPD medium (2% yeast extract; 2% peptone; 2% glucose) at 30°C for all ChIP-chip and growth studies.
Chromatin Immunoprecipitation
ChIP was performed as described previously (Kao et al., 2004) with minor modifications. Chromatin pellets from formaldehyde-fixed cells were lysed by glass bead vortexing for 30 min at 4°C. For all ChIP experiments analyzing individual genes, cell lysates were digested with 160 U of micrococcal nuclease (MNase) per 100 ml of cells for 15 min (Nuclease S7, Roche Applied Science, Taiwan). The following antibodies were used with the equivalent of 10 OD units of cell lysate: FLAG, 20 μl (M2-F3165; Sigma-Aldrich, St. Louis, MO); HA, 5 μl (3F10; Roche Applied Science, Taiwan); H3, 4 μl (ab1791; Abcam, Cambridge, England); H3K4me3, 2 μl (ab8580; Abcam); H3K79me3, 10 μl (ab2621; Abcam); and H3K36me3, 4 μl (ab9050; Abcam). Antibodies were prebound to protein A or protein G Sepharose or Dynabeads®. Purified DNAs were analyzed by probe-based real-time quantitative PCR on a Roche LightCycler® 480 Real-Time PCR System. Probe numbers are listed in Additional file 1, Table S4, and probes were ordered from the Roche Universal Probe Library (Roche Applied Science, Taiwan). The IP/Input ratios of H2BK123ub1, H3K4me3, H3K36me3, and H3K79me3 were normalized to the IP/Input ratio of INT-V sequences as described previously [32].
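The normalization in the last sentence can be read as a ratio of ratios: the IP/Input value at the locus of interest divided by the IP/Input value at the INT-V reference. The following is a minimal sketch under that reading; the function name, argument names, and example quantities are hypothetical, and the exact procedure of reference [32] is not reproduced here.

```python
def normalized_enrichment(ip, input_, ip_ref, input_ref):
    """Return (IP/Input at the target locus) / (IP/Input at the reference locus).

    All four arguments are qPCR-derived DNA quantities in the same arbitrary
    units. Names are illustrative only; they do not come from the study.
    """
    if input_ <= 0 or input_ref <= 0 or ip_ref <= 0:
        raise ValueError("input and reference signals must be positive")
    return (ip / input_) / (ip_ref / input_ref)


# Hypothetical quantities, not data from the study.
example = normalized_enrichment(ip=8.0, input_=4.0, ip_ref=1.0, input_ref=2.0)
print(example)  # 4.0
```

A value above 1 would indicate more of the mark at the target than at the reference sequence, under the assumption that both amplicons behave comparably in the PCR.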
ChDIP-chip hybridization to oligonucleotide arrays
Chromatin pellets from formaldehyde-fixed cells were lysed by glass bead vortexing for 30 min at 4°C, and the cell lysates were sonicated on a Branson Sonifier. ChDIP to measure H2BK123ub1 was performed as described previously [20, 31] using a two-step sequential ChIP. Briefly, the first ChIP was performed using anti-FLAG antibody, and the immune complexes were eluted with FA-lysis buffer containing 200 μg/ml of 3× FLAG peptide. One-tenth of the eluate from the first ChIP was reserved as "input", and anti-HA antibody (12CA5; Roche) was added to the remaining eluate to a final concentration of 15 μg/ml. The immune complexes were then eluted with 1% SDS/50 mM Tris (pH 8.0) and represent "IP". After reversal of the crosslinks and purification, the IP and input DNA were amplified according to the Affymetrix protocol. IP and input samples from two biological replicates were hybridized to an Affymetrix 1.0R S. cerevisiae microarray, which comprised over 3.2 million probes covering the entire genome at 5 bp resolution.
ChDIP-chip data analysis
Preprocessing and normalization
Two biological replicates for each paired ChIP (HA-ubiquitin-modified FLAG-H2B; denoted by H2Bub1) and control (the genome-wide level of H2B) were included in the experiments. Signals for ChIP-chip were extracted by the TileMap algorithm [68], in which only perfect match (PM) intensities were used. For each probe, the log ratio of the ChIP and control intensities was normalized using the quantile normalization method [69].
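For readers unfamiliar with quantile normalization, the generic idea is to force every array to share the same empirical distribution: values are ranked within each array, and each value is replaced by the mean across arrays of the values holding that rank. The sketch below illustrates this textbook procedure in plain Python. It is not the implementation used in the study (the text does not describe one beyond citing [69]), and ties are broken by input order, which is a simplifying assumption.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length numeric lists (one per array).

    Generic textbook version: rank k in every array is replaced by the mean
    of the k-th smallest values across all arrays.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    # Mean of the k-th smallest value across arrays, for each rank k.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns)
                  for k in range(n)]

    normalized = []
    for col in columns:
        # Probe indices ordered from smallest to largest value in this array.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy values, not data from the study.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization both toy arrays contain the same set of values (1.5, 3.5, 5.5), while each probe keeps its within-array rank.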
Averaged fold enrichments of all genes, intergenic regions and other regions of interest (in log2 scale)
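Let $IP_{ir}$ and $C_{ir}$ denote the PM values of probe $i$ on the ChIP and control arrays, respectively, where $r$ denotes the $r$th replicate ($r = 1, 2$). For each probe $i$, the replicated and normalized values were used to compute the averaged log ratio by the formula:

$$\text{averaged log ratio}_i = \frac{1}{2}\sum_{r=1}^{2}\log_2\!\left(\frac{IP_{ir}}{C_{ir}}\right)$$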
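One way to read the per-probe averaging described here is as the mean, over the two replicates, of log2(IP/C). The sketch below computes that quantity for a single probe under this assumption; the function name and the example intensities are hypothetical.

```python
import math


def averaged_log_ratio(ip_values, control_values):
    """Mean over replicates of log2(IP / control) for one probe.

    ip_values and control_values hold one PM intensity per replicate,
    in matching order. Intensities must be positive.
    """
    if len(ip_values) != len(control_values) or not ip_values:
        raise ValueError("need one IP and one control value per replicate")
    log_ratios = [math.log2(ip / c) for ip, c in zip(ip_values, control_values)]
    return sum(log_ratios) / len(log_ratios)


# Hypothetical intensities for two replicates, not data from the study.
print(averaged_log_ratio([8.0, 2.0], [2.0, 2.0]))  # 1.0
```

In the example, replicate 1 gives log2(8/2) = 2 and replicate 2 gives log2(2/2) = 0, so the averaged log ratio is 1, a two-fold enrichment on the linear scale.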
These averaged log ratios for all probes (including 4802 genes with assigned transcription rates) were allocated, using custom code, to a genomic locus (ORF) or to a 5' or 3' intergenic region, which were mapped evenly to the 20th-60th, 1st-20th, and 60th-80th bins on the x-axis, respectively. In each bin, we averaged all probes of each gene and then averaged over all genes to obtain the averaged intensity, which was plotted across the 5' intergenic, 'Average Gene', and 3' intergenic regions to depict their fold enrichments. The averaged fold enrichments of the whole genome, coding regions (ORF), intergenic regions, and silent chromatin regions (HM mating type cassettes, telomeres, and rDNA) were also computed. Similarly, the fold enrichments of all genes grouped by transcription rate, by gene length, and for certain subgroups of interest were calculated, smoothed by a moving window of size 5 (the averaged intensity of every five adjacent bins) [70], and plotted.
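The two operations in this paragraph, scaling each region onto a fixed range of bins and smoothing with a five-bin moving window, can be sketched as follows. Several details are assumptions made for illustration: the region boundaries are treated as non-overlapping (5' intergenic = bins 1-20, ORF = bins 21-60, 3' intergenic = bins 61-80), strand orientation is ignored, and the window is truncated at the ends of the profile, which the text does not specify. Function names are hypothetical.

```python
def bin_index(position, start, end, first_bin, last_bin):
    """Map a probe position in [start, end) to a 1-based bin number.

    The region is divided evenly into (last_bin - first_bin + 1) bins,
    so regions of any length share the same x-axis.
    """
    if not start <= position < end:
        raise ValueError("position lies outside the region")
    n_bins = last_bin - first_bin + 1
    fraction = (position - start) / (end - start)
    return first_bin + min(int(fraction * n_bins), n_bins - 1)


def moving_average(values, window=5):
    """Centered moving average; the window is truncated at the profile ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# A probe halfway through a hypothetical 400 bp ORF lands in bin 41 of 21-60.
print(bin_index(200, 0, 400, first_bin=21, last_bin=60))  # 41
print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0]))  # [2.0, 2.5, 3.0, 3.5, 4.0]
```

Scaling every gene to the same number of bins is what allows genes of different lengths to be averaged into a single 'Average Gene' profile.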
Analysis of intron-containing genes
Averaged fold enrichment of 5'exon-intron-3'exon (in log2 scale)
For intron-containing genes with defined transcription rates, the 5' exon (denoted as Exon 1), intron (Intron), and 3' exon (Exon 2) sequences were allocated to the 1st-10th, 10th-40th, and 40th-80th bins, respectively, and averaged fold enrichments were calculated over genes. For the few genes containing multiple introns, the multiple copies of Exon 1-Intron-Exon 2 sequences were aligned by each Intron and allocated accordingly. Averaged fold enrichments (allocated to the middle point of each bin, such as 2.5) were then calculated over genes and smoothed by a moving window of size 5, except for the first two and the last two bins of Exon 1, Intron, and Exon 2.
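Leaving the first two and last two bins of each segment unsmoothed means that a five-bin window never spans an exon-intron boundary. A minimal sketch of that behavior is given below. The segment ranges are expressed as 0-based half-open intervals and are assumed not to overlap (Exon 1 = bins 1-10, Intron = bins 11-40, Exon 2 = bins 41-80); this is an interpretation of the bin ranges given in the text, and all names are hypothetical.

```python
# 0-based, half-open bin ranges for an 80-bin profile (assumed, see lead-in).
SEGMENTS = {"Exon 1": (0, 10), "Intron": (10, 40), "Exon 2": (40, 80)}


def smooth_within_segments(profile, segments=SEGMENTS, window=5):
    """Smooth each segment with a centered moving window.

    The first and last (window // 2) bins of every segment keep their
    original values, so no window crosses a segment boundary.
    """
    half = window // 2
    smoothed = list(profile)
    for start, end in segments.values():
        for i in range(start + half, end - half):
            smoothed[i] = sum(profile[i - half:i + half + 1]) / window
    return smoothed


# Hypothetical step profile: signal only in the intron bins.
step = [0.0] * 10 + [10.0] * 30 + [0.0] * 40
assert smooth_within_segments(step) == step  # the step edges stay sharp
```

With a window that ignored segment boundaries, the same step profile would be blurred at bins 9-12 and 38-41, which could make a boundary-specific peak harder to see.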
Averaged nucleosome occupancy
Nucleosome occupancy was defined as the percentage of DNA bound to a nucleosome, following the analysis in [42]. Scaled nucleosome occupancy levels in Exon 1 (or Intron, or Exon 2), ranging from 0 to 100%, were averaged over 211 intron-containing genes, with each gene weighted in proportion to the length of its Exon 1 (or Intron, or Exon 2), to yield the averaged nucleosome occupancy.
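The length-weighted average described above can be sketched as follows, computed separately for each segment type (Exon 1, Intron, or Exon 2). The function name and the example numbers are hypothetical.

```python
def weighted_occupancy(occupancies, lengths):
    """Length-weighted mean of per-gene occupancy values (in percent).

    occupancies[k] is the scaled occupancy (0-100) of one segment type in
    gene k, and lengths[k] is the length of that segment in base pairs.
    """
    if len(occupancies) != len(lengths) or not lengths:
        raise ValueError("need one length per occupancy value")
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total segment length must be positive")
    weighted_sum = sum(o * l for o, l in zip(occupancies, lengths))
    return weighted_sum / total_length


# Two hypothetical introns: 100 bp at 80% and 300 bp at 40% occupancy.
print(weighted_occupancy([80.0, 40.0], [100, 300]))  # 50.0
```

Weighting by length keeps very short segments, whose occupancy estimates rest on few base pairs, from contributing as much as long ones; an unweighted mean of the same two values would be 60.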
Accession number
Raw sequencing data are available at the NCBI Gene Expression Omnibus (GEO) under accession number GSE34325.