Background

Understanding the design logic of living systems is now mainly based on genomics and DNA sequence comparisons. Typically, protein comparisons are evaluated by sequence alignments. However, living systems run programs that are written both as passive information (the genome) and as dynamic, molecular ecologies (the proteome). This dichotomy drives proteomic research because no living system can be solely described by its DNA sequence. Accurate proteomic maps are logically the next dataset required to complement complete genome sequences. However, the generation of reliable proteomic data remains challenging [14].

The budding yeast, Saccharomyces cerevisiae, has led eukaryotic research in several fields, particularly genomics, reverse genetics, cell biology and proteomics. For proteomic map**, S. cerevisiae has been the main venue for the evaluation of various methodologies, which led to the clear conclusion that biochemical methods based on physiological expression levels deliver the most accurate results. In contrast, bioinformatic, yeast two hybrid and overexpression approaches generate less accurate data that require validation by a different means [14].

In contrast to a genome sequence, it is unlikely that a proteomic map can ever be complete because proteomes change in response to alterations of cellular condition. Proteomes include a very large number of post-translational modifications that are inherently variable, as well as protein-protein interactions that vary over a wide range of stabilities. Nevertheless, a proteome is based on a stable core of protein complexes, which can be accurately mapped by biochemical approaches [2]. Hence, an accurate proteomic map will be based on the constellation of stable protein complexes for a given cellular condition. The map then provides a scaffold onto which transient interactions and post-translational modifications can be organized. Thereby, proteomes can be rationalized [5, 6].

The quest to understand proteomes has led to the definition of new perspectives and terms, such as a proteomic 'environment', which describes the local relationships within a group of interacting proteins; 'hubs', which is applied to proteins that interact with many other proteins [2]; and 'hyperlinks', which is a term we applied to proteins that are present in more than one stable protein complex [7]. Similarly, insight into proteomes can be gleaned from comparative proteomics [8]. However, without accurate proteomic maps, these new terms and perspectives, particularly those derived from comparative proteomics, have limited meaning.

To map the budding yeast proteome accurately, methodologies for physiological expression and purification of tagged proteins were developed based on gene targeting with the tandem affinity purification (TAP) tag [9, 10]. The high throughput application of these methods by two different groups led to the best proteomic map datasets for any cell, whether prokaryotic or eukaryotic [11, 12]. Collins et al. consolidated both datasets into one of even higher quality; nevertheless, they recommended more intensely focused data gathering to evaluate accuracy [13].

Here we address the issue of proteomic accuracy by intense exploration of a section of the budding yeast proteome that is related to chromatin regulation. Chromatin is regulated by multiprotein complexes, which dynamically target nucleosomes with a multitude of reversible modifications, such as acetylation, methylation, phosphorylation and ubiquitination (reviewed in [14]). Also, in budding yeast, many of these complexes have been individually isolated and functionally characterized, which provides a rich and detailed source of reference information. Previously, we concluded that greater accuracy can be attained by sequential tagging to reciprocally validate interactions [10, 15, 16]. Sequential tagging of candidate interactors to map a proteomic environment has also been termed proteomic navigation or SEAM (short for Sequential rounds of Epitope tagging, Affinity isolation and Mass spectrometry). For a low throughput approach, which also permits a more intense focus on individual experiments, sequential tagging will deliver improvements in accuracy.

Several other factors may reduce map** accuracy. In the S. cerevisiae proteome every fourth protein is apparently a proteomic hyperlink [5]. That is, a member of more than one distinct protein complex. Hence, many pull-downs are mixtures of completely or partially co-purified complexes, together with other sub-stoichiometric and pair-wise interactors. Also, sorting out background proteins from genuine interactors remains challenging [5, 1719], especially when proteins are identified by mass spectrometric techniques with enhanced dynamic range, such as liquid chromatography tandem mass spectrometry (LC-MS/MS) or LC matrix-assisted laser desorption/ionization mass spectrometry (MALDI) MS/MS, which produce a large number of confident protein identifications in each pull-down. Furthermore, until recently, mass spectrometric identifications have mostly neglected the quantitative aspect. It was (and, largely, still is) difficult to determine which proteins are bona fide members of a tagged complex and, therefore, stoichiometric, and which interactors are sub-stoichiometric. Here we address these issues to develop refinements for improved accuracy of map**, including working criteria to identify common background proteins and stoichiometric interactors.

Using the sequential strategy and these refinements, we mapped a large proteomic environment that we term 'Chromatin Central' because it includes eight protein complexes interconnected by hyperlinks encompassing the major histone aceytyltransferases and deacetylases in budding yeast. As evidence for map** accuracy, we made several discoveries, including the identification of new subunits of known complexes and new complexes.

To exploit the quality of the map for comparative proteomics, we then explored the same proteomic environment in the distantly related yeast Schizosaccharomyces pombe. This enabled a detailed comparison of two highly accurate proteomic environments to shed light on the evolution of proteomic architecture.

Results

Establishing a proteomic environment

Our approach to charting proteomic environments relies upon the sequential use of TAP and mass spectrometry to identify stable protein assemblies. In a typical TAP pull-down experiment, LC-MS/MS analysis identified over 500 proteins containing stoichiometric and transient bona fide protein interactors, along with a large number of background proteins of diverse origin and abundance. To dissect the composition of complexes, we employed a layered data mining approach. First, we sorted out common background proteins and then distinguished proteins specifically enriched in the TAP isolation using semi-quantitative estimates of their abundance (Figure 1).

Figure 1
figure 1

Data processing workflow. The primary dataset is a complete list of proteins identified in IP experiments that were used to map the Chromatin Central proteomic environment in any of the two yeasts. After removal of ribosomal proteins, all hits together with their A-indices were compiled into a non-redundant master table and grouped according to IP rounds. To accurately determine the scaffold protein complexes, we further removed from the master table proteins having A-index = 1 that were identified only in one IP experiment and common background proteins. Using the average A-index of background proteins as a selection threshold, the remaining proteins were sorted into two large groups: proteins enriched in corresponding IP experiments and proteins whose abundance remained at the background level. Proteins in the first group were considered as genuine interactors and were assigned to complexes, assuming IP experiments in which they were identified. From the second group, only proteins that were validated by a reciprocal IP experiment were assigned to the corresponding complexes.

Common background proteins

A list was established based on background proteins from proteins repetitively found in 20 diverse immunoaffinity purifications (IPs) that were selected from three unrelated projects, this project being one of those three. The other two were based on mitotic cell cycle regulation and vesicle transport. The tagged proteins and their known interactors, as well as ribosomal proteins, were first removed from the 20 primary IP lists. Then, of more than 2,000 proteins identified in these 20 IPs, 119 (Table S1 in Additional data file 1) were defined as common background because they were found at least once in each of the three independent projects. This list of 119 includes proteins with molecular weights ranging from 11 to 250 kDa and expression levels of 100 to 106 molecules per cell [20, 21]. Most of these common background proteins were cytoplasmic [2123], including heat shock, translation factors and abundant housekee** enzymes. Once these common background proteins were removed from a particular IP list, it was further refined using abundance index (A-index) filtering.

Index of relative abundance

The absolute amounts of immunoprecipitated protein varies between TAP purifications. However, within a purification, members of a stable protein complex should be isolated in approximately stoichiometric amounts and relatively enriched compared to the other detected proteins. Abundant background proteins are an exception; however, we always removed them from the list at the very beginning of the data processing routine as described above.

To estimate the relative abundance of individual proteins and hence obtain an additional means to distinguish genuine interactors from background, we used an arbitrary A-index. It was calculated as a ratio of the total number of MS/MS spectra acquired for a given protein (reported as 'matched queries' for each MASCOT hit) to the number of unique peptide sequences they matched. Essentially, the A-index is a relative measure of the amounts of co-isolated proteins from the gel. We applied it as a convenient way to distinguish bona fide subunits of the tagged complex from background proteins because they should be relatively enriched, compared to background. In a series of preliminary experiments, we observed that the A-index monotonously increased with increasing amount of loaded proteins from 50 to 800 fmols. When determined for six standard proteins of various molecular weights and properties, the A-index varied within a 50% margin at any given protein loading (Figure S1 in Additional data file 2).

Selecting genuine interactions to determine protein complexes

Each protein complex was isolated several times within a round of IP experiments that used different baits [10, 15, 16]. Hence, several independent IPs established the protein complex composition or identified a hyperlink to another protein assembly (Figure S2 in Additional data file 2). In turn, proteins co-purified with a hyperlink and that did not belong to the complex characterized in the current round were selected as baits for the next sequential round. For S. cerevisiae, within five IP rounds, 21 out of 26 pull downs from unique baits were successful (for the full list of identified proteins, see Table S2 in Additional data file 1). After the ribosomal proteins were removed, a non-redundant list of proteins identified in all IPs, together with their A-indices, was assembled into a master table containing 1,301 proteins in total (Table S3 in Additional data file 1). Then we removed common background proteins and low abundant proteins whose A-indices were equal to 1 and were identified only once in the total of 21 IPs.

The common background proteins listed in the master table had an average A-index value of 1.4. We noticed that A-indices of more than 90% of background proteins were within 25% of the average, so we employed this empirical threshold to further sort out experiment-specific background. Since genuine interactors were supposed to be enriched in the IPs compared to background proteins, we introduced an arbitrary cut-off of 1.75 for A-indices of genuine protein interactions (Table S3 in Additional data file 1).

Proteins were recognized as stoichiometric core members of complexes if they did not belong to common background, were specifically enriched in corresponding IPs, and, most importantly, were co-isolated with baits within the corresponding round of sequential IPs (Figure 1). Potentially, these criteria might have eliminated some transient (yet genuine) interactors; however, we placed our priorities upon accuracy. Although the chosen 25% margin might look arbitrary, the entire approach was validated by a good concordance of the composition of protein complexes in S. cerevisiae Chromatin Central with the published evidence, as described below.

Chromatin Central in S. cerevisiae

From 1,301 unique open reading frames (ORFs) in the master table, only 63 proteins (less than 5% of all identified proteins) matched the above selection criteria, comprising 9 stable protein complexes connected by 12 proteomic hyperlinks. Three out of these nine (ASTRA (for ASsembly of Tel, Rvb and Atm-like kinase), Snt2C and Sc_Rpd-LE (for Rpd3L expanded with Set3C core); Figure 2) are reported here for the first time, whereas the other six (complexes I-VI) have been characterized previously (note that the prefixes Sc_ and Sp_ refer to proteins from S. cerevisiae and S. pombe, respectively; the suffix 'C' always refers to the protein complex).

Figure 2
figure 2

Chromatin Central proteomic environment in S. cerevisiae. Individual protein complexes are boxed; TAP-tagged subunits are indicated with asterisks. The proteomic hyperlinks (proteins shared between the individual complexes) are shown between the complexes in grey diamonds. The hyperlink from Tra1 to the SAGA/SLIK complex is designated with a dashed line/filled arrow because it was not identified in this work, but inferred from published evidence. Gene names designated with a minus (-) symbol indicate that their TAP tagging/immunoaffinity purification failed. Several relatively abundant (A-index > 1.75) pair-wise interactors, also identified in proteome-wide screens [101, 102], are mapped onto the scheme (dashed line/unfilled arrow). Set3C complex was previously characterized by TAP-tagging method in [10].

Chromatin Central comprised four distinct protein assemblies, including: the histone deacetylase Rpd3p (Sc_Rpd3S, Sc_Rpd3L [24, 25], Sc_Rpd-LE and Sc_Snt2C); at least two histone acetyltransferase complexes, Sc_NuA4 [26] and SAGA/SLIK [27]; and two ATP-dependent chromatin remodeling complexes, Sc_Swr1C and Sc_Ino80C [28, 29]. The compositions of the individual protein complexes (Tables 1, 2, 3, 4, 5) were compared with previous reports. Surprisingly, we found some discrepancies with data from the best proteome maps even though they were also obtained by TAP tagging [11, 12]. In contrast, our results agree with several publications describing the biochemical and functional characterization of the individual complexes. In particular, complexes I, V and VI are identical to the previously reported Sc_Rpd3S, Sc_Swr1C and Sc_INO80C, respectively [24, 25, 28, 29].

Table 1 Members of NuA4 histone acetylase complexes in the Chromatin Central proteomic environment
Table 2 Members of histone deacetylase complexes of the Chromatin Central proteomic environments
Table 3 Members of chromatin remodeling complexes of the Chromatin Central proteomic environment
Table 4 Members of ASTRA complexes of the Chromatin Central proteomic environment
Table 5 Other members of the Chromatin Central proteomic environment

In addition to the 12 known members of Sc_Rpd3L (complex II) [24, 25], we identified 2 novel subunits, including the 72 kDa protein Sc_Dot6p (ORF name YER088C) and its 59 kDa homolog Sc_Tod6p (Twin of the Dot6; ORF name YBL054W). Their sequences share 31% identity; 46% similarity and both possess the chromatin specific SANT domain [30]. Furthermore, the involvement of Sc_Dot6 in the regulation of telomere silencing has been indicated [31].

In addition to the 14 known members of Sc_NuA4 (complex IV) [26, 32], another new protein, the 72 kDa Sc_Yap1p (ORF name YML007W), which is a member of a family of fungal specific transcriptional activators, was identified as a subunit. Within Sc_Set3C (complex III) [10] we also identified a new member, the 55 kDa protein Sc_Tos4p (ORF name YLR183C). It is a putative transcription factor of the forkhead family. Tagging Sc_Tos4p pulled down the entire Sc_Set3C, except for the hyperlink Sc_Hst1p [5] (also, see Figure S2 in Additional data file 2 and Table S2 in Additional data file 1).

We identified 12 proteomic hyperlinks in Chromatin Central (Figure 2). One of these proteins, the 422 kDa Sc_Tra1p (ORF name YHR099W) is a core member of Sc_NuA4 and SAGA/SLIK [27], effectively also hyperlinking these two acetyltransferase complexes into Chromatin Central. Our attempts to TAP-tag Sc_Tra1p failed. However, Sc_Tra1p was co-purified when other Sc_NuA4 and also ASTRA subunits were sequentially tagged (Figure 2; also see Figure S2 in Additional data file 2 and Table S2 in Additional data file 1).

Notably, core-subunits of the histone deacetylase complex Sc_Set3C [10] were co-purified in sub-stoichiometric amounts with subunits of the Sc_Rpd3L complex (Table S2 in Additional data file 1). Sc_Set3C and Sc_Rpd3L complexes regulate overlap** target genes [3335] and synthetic lethal screens have revealed genetic links between components of these complexes [36].

Altogether, the composition of individual complexes in Chromatin Central accords well with the published biochemical evidence. Furthermore, the sequential tagging approach revealed four novel subunits in three previously characterized complexes as well as three novel protein assemblies.

Chromatin Central in S. pombe

We next asked if the Chromatin Central environment is conserved between the distantly related fungi S. cerevisiae and S. pombe. In contrast to S. cerevisiae, no systematic biochemical isolation of protein complexes has yet been performed in S. pombe; however, complexes can be isolated with essentially the same TAP methodology with a similar success rate [7, 37]. We exploited the architecture of Chromatin Central in S. cerevisiae to choose strategic baits for the work in S. pombe. The closest homologues of six S. cerevisiae hyperlinks (products of CLR6, ALP13, YAF9, SWC4, RVB1, TRA1 and TRA2 genes) were subjected to TAP tagging and immunoaffinity isolation, followed by mass spectrometric identification of corresponding interactors (Figure 3). For accuracy, we also isolated complexes associated with three more conserved subunits, encoded by PNG2, SWC2 and IES6. Thus, the characterization of each complex relied upon at least two independent TAP purifications targeting different baits.

Figure 3
figure 3

Representative Coomassie stained polyacrylamide gels of immunoaffinity isolations used for deciphering the Chromatin Central environment in S. pombe. These baits were selected for IP experiments as plausible proteomic hyperlinks. Bands, in which corresponding proteins were identified by mass spectrometry, are indicated with arrows for each lane. The full list of identified proteins is presented in Table S4 in Additional data file 1.

As in the S. cerevisiae experiments, the identified proteins, together with their A-indices, were combined into a master table (Tables S2 and S4 in Additional data file 1). We also compiled a list of 250 common background proteins for S. pombe in the same way as we did for S. cerevisiae (Table S1 in Additional data file 1). Interestingly, the average A-index of common background proteins was almost identical in both yeasts (1.3 and 1.4 in the fission and budding yeasts, respectively), and, therefore, we used the same conservative threshold of 1.75 to define stoichiometric interactors.

Chromatin Central shows a very similar architecture in both yeasts (Figures 2 and 4). To assess the similarities more closely, we focused on orthologues, recognized by overall sequence similarity (best hits in forward and reciprocal BLAST searches) and similar composition of structural domains (Tables 1, 2, 3, 4, 5). Altogether, in both Chromatin Central environments we identified 47 pairs of confident orthologues and six pairs with marginal confidence (Figure S3 in Additional data file 2) out of a total of 139 proteins. For other S. cerevisiae and S. pombe proteins, BLAST searches identified no clear orthologous pairs (Tables 1, 2, 3, 4), although some of them might be functional orthologues (such as Sc_Ume1p and Sp_Prw1p).

Figure 4
figure 4

Chromatin Central proteomic environment in S. pombe. Individual protein complexes are boxed; color coding designates the protein complexes that are orthologous to the corresponding complexes in S. cerevisiae as shown in Figure 2. TAP-tagged subunits are designated by asterisks; minus (-) signs indicate that the IP experiment failed; proteomic hyperlinks are shown between the complexes in grey diamonds. Certain confident pair-wise interactors, discussed in the text, are designated with dashed arrows.

More than half the subunits of Sc_Rpd3S and Sc_Rpd3L (complexes I and II) are orthologous to the members of corresponding S. pombe complexes Sp_Clr6S and Sp_Clr6L; however, we reveal (Figure 4 and Table 2) further similarities than previously documented [38]. In addition to the previously reported subunits, we identified Sp_Cti6p, Sp_Rxt2p, Sp_Rxt3p, Sp_Dep1p and Sp_Pst3p. Our study also revealed that Sp_Clr6L, like Rpd3L in the budding yeast, is hyperlinked to the NuA4 histone acetyltransferase complex via an MRG-family protein, Sp_Alp13p.

Complex IV (Sp_NuA4) comprises orthologues of the 12 core members of the Sc_NuA4 complex, including its catalytic subunit Sp_Mst1p (ORF name SPAC637.12c) [3941] (Table 1). Complexes V and VI include the closest homologues of the S. cerevisiae ATP-dependent helicases Sc_Swr1p and Sc_Ino80p (ORF names SPAC11E3.01c and SPAC29B12.01, respectively), together with 20 subunits orthologous to members of Sc_Swr1C and Sc_Ino80C (Table 3). The corresponding chromatin remodeling complexes in S. cerevisiae catalyze replacement of histone H2A with its variant Htz1p [29, 42, 43]. Complexes V and VI in the fission yeast both associate with Sp_Pht1p, which is the S. pombe orthologue of Sc_Htz1p (Table S2 in Additional data file 1). Therefore, it is likely that these S. pombe complexes (now called Sp_Swr1C and Sp_Ino80C) are also H2A.z chaperones.

Interestingly, while characterizing the composition of Sp_Ino80C, we identified a 17 kDa core subunit, whose gene has not been annotated as an ORF in the S. pombe genome (Figure S4 in Additional data file 2). The protein has no homologues within the Saccharomyces genus, yet possesses some remote similarity to a non-annotated genomic region in Schizosaccharomyces japonicus. We call this newly discovered S. pombe gene, IEC5 short for (Ino Eighty Complex subunit 5 [GenBank:FJ493251]).

Complex VI, ASTRA, is the same as the orthologous complex in S. cerevisiae except that the S. pombe genome encodes for two Tra1 homologues and only one, Tra1, is present in ASTRA (Table 4). The other, Tra2, is a subunit of Sp_NuA4 and presumably the S. pombe SAGA/SLIK complexes. In S. cerevisiae, the single Tra1 was found in all three complexes.

As we observed in S. cerevisiae for Sc_Rpd3L, some Sp_Set3C subunits co-purified in sub-stoichiometric amounts with Sp_Clr6L and vice versa, when Sp_Set3p was used as bait (Table S2 in Additional data file 1). Notably, the three subunits (Sp_Snt1p, Sp_Hif2p, and Sp_Set3p) isolated together with Clr6L are the orthologues of the three (Sc_Snt2, Sc_Sif2, and Sc_Set3) isolated with Rpd3L. However, in contrast to the Sc_Set3C complex, which consists of eight subunits, the Sp_Set3C complex contains only four proteins (Table 2).

In a few instances we identified proteins with domains that are not present in the corresponding orthologous complexes in the other yeast, including Sp_Msc1p (ORF name SPAC343.11c), which is a member of the Sp_Swr1C complex. The function of this protein is not known, although Ahmed et al. [44] suggested that Msc1 is involved in chromatin regulation and DNA damage response. Msc1 contains a remarkable composition of domains, including three PHD fingers [45], PLU-1 [46], zf-C5HC2, JmjC and JmjN [47]. It was recently shown that the Msc1 PHD fingers act as an E3 ubiquitin ligase [48], while in other proteins the JmjC domain mediates histone demethylation [49]. None of the Sc_Swr1C subunits possess these domains or appears to be remotely similar to Sp_Msc1 (Table S5 in Additional data file 2).

We identified nine hyperlinks within Chromatin Central in S. pombe, all of which are orthologues to corresponding proteins in the budding yeast. As our attempts to purify TRA2 failed (as they did in S. cerevisiae), it remains unclear if, similar to the budding yeast, this protein is also shared between Sp_NuA4 and an assembly orthologous to SAGA/SLIK [50].

Independent validation of functional relationships within Set3C and Swr1C complexes

We independently validated some of the novel proteomics observations, namely the insights regarding Set3C and Swr1C, using quantitative genetic interaction data from S. cerevisiae [51] and S. pombe [52, 53].

Our proteomic data suggest that Set3C contains a conserved core complex of four proteins (Set3, Hos2, Snt1, and Sif2) and physically interacts with Rpd3L in both S. cerevisiae and S. pombe. To validate these findings, we compared the correlation coefficients of genetic profiles of the two Set3C core components in both yeasts (Sc_Set3 and Sc_Hos2; Sp_Set3 and Sp_Hda1) against genetic profiles of a set of 239 direct homologs in the two species. As expected, the correlation between the two core members (Set3 and Hos2) and Sif2, is well beyond the 90th percentile (Figure 5a), suggesting they act in concert. In contrast Hst1, also a subunit of Sc_Set3C, shows a much lower correlation, which is consistent with its absence from the core complex (as was previously shown for Sc_Hst1 [10]) or not being a part of the complex at all (Sp_Hst1). Furthermore, correlation of the Set3C core with the Sc_Rpd3L subunit Sp_Pho23 (Sp_Png2) is also high in both yeasts and higher than the correlation with one of the Rpd3S subunits (Rco1, Sp_Cph1). The functional division within Sc_Set3C becomes even more obvious when examining individual interactions of Set3C core and extension subunits. Members of the Set3C core display stronger positive genetic interactions with each other, compared to the Set3C extension subunits, and their genetic interaction patterns differ from patterns of Swr1C, SAGA and Prefoldin members (Figure 5b). Taken together, these data provide genetic evidence that Sc_Set3C encompasses two functional modules, one of which (Set3C core) interacts closely with Rpd3L. This functional evidence corroborates our proteomic map** data, suggesting that the Set3C complex in S. pombe is only represented by core subunits, while the orthologous complex in S. cerevisiae has an extension of four extra subunits. In both yeasts, the core Set3C interacts with Rpd3L to form a distinct module referred to as Rpd3LE.

Figure 5
figure 5

Genetic interactions support the proteomic observations. (a) Scatter plots of correlation coefficients between genetic profiles for the two Set3C subunits, Set3 and Hos2, in S. cerevisiae and S. pombe across 239 direct homologs between the two species. In both species Sif2 (Hif2) is highly correlated with both Set3 and Hos2, consistent with its being a subunit of the core Set3C, whilst Hst1 (part of the Sc_Set3C extension) is uncorrelated with either Set3 or Hos2. Pho23 (Png2), a subunit of Rpd3C(L), correlates better with Set3 and Hos2 than does Rco1 (Cph1), a subunit of Rpd3C(S). The 90th percentile of the data is indicated. (b) Subunits of the core Set3C and Set3C extension show different genetic interaction patterns. Shown are genetic interactions between Set3C core and extension subunits and Swr1C, SAGA and Prefoldin subunits. Color-coding of the interaction magnitude (shown in the key) is as follows: shades of cyan indicate synthetic sick/lethal (negative) interactions typically observed between genes acting on parallel pathways; shades of yellow represent suppressive (positive) interactions seen primarily between genes acting on the same pathway and within stable protein complexes. (c) Msc1 in S. pombe is a member of Swr1C, while its S. cerevisiae ortholog (Ecm5) is not. Genetic profiles of members of the complex in the two species are shown with Msc1 and Ecm5 profiles aligned at the bottom. Genetic pattern of Msc1 is very similar to the rest of the complex and positive genetic interactions with the other members indicate that it is a bona fide member of Swr1C in S. pombe. Color-coding is as for (b). (d) A scatter plot of pair-wise correlation coefficients of genetic profiles of members of Swr1C in S. cerevisiae and S. pombe. Consistent with (c), data-points corresponding to pairs containing Msc1 or Ecm5 form an outlier group and are strongly correlated in S. pombe, but not in S. cerevisiae.

In S. pombe another complex, Swr1C, contains a newly identified subunit, Msc1. Its closest homolog in Sc_Ecm5 is not a part of Swr1C in budding yeast. To independently validate this finding, we examined and compared individual genetic interactions of seven of the Swr1C subunits in both yeasts with the genetic patterns of Sp_Msc1 (Sc_Ecm5). Consistent with our proteomic data, Sp_Msc1, unlike Sc_Ecm5, shows strong positive genetic interactions and a very similar pattern to the other members of the complex (Figure 5c). Hence, pairs of genetic profiles containing Sc_Ecm5/Sp_Msc1 and other members of Swr1C show weak correlation in S. cerevisiae, but strong correlation in S. pombe (Figure 5d). Taken together, these genetic data confirm our proteomic map** observations.

Discussion

By navigating a complex proteomic environment in two divergent yeasts with high accuracy, we obtained a new level of precise insight into the comparative proteome and also extracted several new and subtle discoveries.

Comparative profile of a proteomic environment

The overall architecture of Chromatin Central is the same in the two yeasts; however, there is a surprising amount of variation in their subunit composition. For both yeasts, Chromatin Central is based on the same eight complexes, encompassing 53 orthologous protein pairs plus a further 33 proteins that appear to be species specific (23 in S. cerevisiae, 10 in S. pombe). Of these 33, three pairs have a very similar composition of functional domains (Table S5 in Additional data file 2). Hence, a very similar architecture is sustained by a scaffold based on about two-thirds of all the proteins involved, whereas the remaining one-third appears to be less constrained.

Analyses of domain composition (Table S5 in Additional data file 2) revealed that many subunits (19 in S. cerevisiae and 21 in S. pombe) possess bromo-, chromo-, SANT or PHD finger domains, which can bind either methylated or acetylated histones or other chromatin determinants [30, 54

Figure 6
figure 6

Several divergent proteomic hyperlinks between S. cerevisiae and S. pombe are due to gene duplications. Hence, hyperlinks provide a mechanism for the fixation of gene duplications.

New complexes in Chromatin Central

Despite the fact that this work was based on one of the most intensely studied proteomic environments to date, we documented four new subunits in the six known complexes (described above in Results). We also found three new complexes, including two more containing the histone deacetylase Rpd3, to add to the recently described Rpd3S and Rpd3L complexes [24, 25].

We found the three-member Sc_Snt2 complex in an Rpd3-TAP pull down. Subsequently, we validated the complex by tagging Snt2 (Figure 7a). Sc_Snt2p and the other subunit, Sc_Ecm5p, contain several domains involved in chromatin regulation (Figure 7b). Sc_Snt2p (YGL131C) is one of 18 SANT domain-containing proteins in budding yeast and the sixth identified in Chromatin Central (Table S5 in Additional data file 2). Due to its JmjC domain, Sc_Ecm5p is a putative histone demethylase. Its PHD fingers have been reported to bind methylated H3K36 [58].

Figure 7
figure 7

New protein complexes in Chromatin Central. (a) Immuno-isolation of the Sc_Snt2C complex and (b) domain structure of its subunits. (c) Immuno-isolation of the Sp_ASTRA complex and (d) domain structure of its subunits. In (a, c), only the relevant protein bands are annotated. The full list of identified proteins is provided in Table S1 in Additional data file 1.

We did not find a complex similar to Sc_Snt2C in S. pombe. Previously, we showed that the fission yeast orthologue of Sc_Snt2p is also found in a complex with the JmjC domain proteins Sp_Lid2p and Sp_Jmj3p [7, 17]. However, these two complexes, Sc_Snt2C and Sp_Lid2C, are very different. From our IPs and Coomassie gels, we note that the Sc_Snt2C is much less abundant than either Rpd3S or Rpd3L.

The second new Rpd3-containing complex, Rpd3-LE, also contains a minor fraction of cellular Rpd3. Rpd3-LE is an extension of the Rpd3L complex, which includes the core of the Set3 complex. We have four reasons to be confident of this assignment. First, we reciprocally observed the relationship between Rpd3L and core Set3C in both yeasts. Second, in S. cerevisiae we observed only the core proteins of Set3C, but not the full eight-membered Set3C, associated with Rpd3L. This accords with our previous dissection of Sc_Set3C, which showed that the complex has a core of these four proteins [10]. Third, Sp_Set3C consists of only the four core proteins found in Sc_Set3C and does not include homologues of Sc_Hst1p or Sc_Hos4p. It also does not include a homologue of the cyclophilin Sc_Cph1p, which is required to fine-tune the sporulation-specific activity of Sc_Set3C [10, 63]. Fourth, genetic interaction data clearly support the division of Sc_Set3C into core and extension components, as well as the interaction of the core with Rpd3L.

Taken together, these data indicate that Set3C has two versions, one that is small, common to both yeasts and interacts with Rpd3L, and the another that is larger and specific to S. cerevisiae. Sc_Set3C functions during vegetative growth to regulate several inducible genes and to co-operate with Rpd3 [3335]. It has also been shown to be a negative regulator during meiosis [10]. Partly because transcriptional control of meiosis appears to be poorly conserved [64], we suggest that the vegetative functions in S. cerevisiae entirely relate to the smaller complex.

We also discovered the ASTRA complex, which is composed of orthologous subunits in both yeasts and contains a Tra1 member of the ATM-like kinase family (Figures 2, 4 and 7C,D; Table 4). ASTRA includes the ATP-dependent DNA helicase Rvb1/2 heterodimer, the telomere binding protein Tel2p together with two uncharacterized Tel2-interacting proteins (Tti1p and Tti2p), and a WD-repeat containing protein (Asa1p) of unknown function in both yeasts.

Protein assemblies similar to ASTRA have not been identified in other organisms. However, five subunits have clear human homologues so the complex may be widespread. Furthermore, Sc_Tel2p, as well as the C. elegans (CLK-2 [65]) and human (TELO [66]) orthologues are required for telomere length regulation. Sc_Tel2p also influences telomere position effect and interacts directly with double-stranded telomeric DNA [67, 68]. Recently, Hayashi et al. [69] suggested that a Tel2-Tti1 heterodimer recognizes a common domain of phosphatidyl inositol 3-kinase related kinases (PIKK) in the fission yeast and is a component of multiple PIKK assemblies, which corroborates our findings in S. cerevisiae (Table S2 in Additional data file 1). Hence, we suggest that ASTRA is the Tra-specific Tel2/CLK-2/TELO complex involved in telomeric chromatin regulation and will be found in many eukaryotes.

The stringency of our protein selection criteria may have excluded some lowly abundant interactors, as well as proteins only interacting with a particular subunit of the complex, rather than with the entire complex cores. This might include several interesting possible further connections to Chromatin Central (Tables S2, S3 and S4 in Additional data file 1). For example, we observed several subunits of the SWI-SNF global transcription activator complex [7072], RSC chromatin remodeling complex [72, 73], Regulator of Nucleolar silencing and Telophase exit (RENT) [74, 75], Anaphase Promoting Complex (APC) [37], cytoplasm to nucleus signaling complexes TORC1 and TORC2 [69, 76], the proteasome, DNA replication licensing complex [77], the CCR4/NOT transcription complex [78] and various transcription factors.

The histone deacetylases Rpd3p and Clr6p pulled down the complete TRiC chaperonin complex [79] in S. cerevisiae and S. pombe, respectively (Figures 2 and 4; Tables S3 and S4 in Additional data file 1). Similarly, TRiC was co-purified with the Hos2 deacetylase complex Set3C in S. cerevisiae [5]. Hence, we suggest that TriC chaperonin activity is specifically related to type I histone deacetylases.