Introduction

Protein glycosylation is one of the major post-translational modifications (PTMs) with significant effects on protein folding, stability and activity1,2. It plays structural, protective and stabilizing roles in living cells and is involved in multiple biological mechanisms3,4,5. It is well recognized that aberrant glycosylation is closely correlated with oncogenesis and tumor progression, and most of the cancer biomarkers in clinical use to date are glycoproteins6,7,8. Comprehensive characterization of the glycoproteome in a biological sample is therefore in high demand but remains extremely challenging because of the vast dynamic range of protein concentrations and the heterogeneity of protein glycosylation. To achieve in-depth glycoproteome analysis, glycoproteins/glycopeptides are often selectively enriched to reduce the sample complexity9,10,11. Four glycoprotein/glycopeptide enrichment approaches have been most widely applied so far, including: 1) lectin affinity chromatography based methods12,13; 2) titanium dioxide chromatography14,15; 3) hydrophilic interaction chromatography (HILIC)16,42.

Figure 8

Performance of the PNP strategy.

(a) Comparison of the number of de-glycopeptide identifications, the ratio of N-terminal Ser/Thr de-glycopeptides and the glycopeptide enrichment specificity for the conventional HC method, the PNP strategy and the HILIC method. (b) Comparison of the peptides released by hydroxylamine from the hydrazide beads after enzymatic release of glycopeptides by PNGase F. (c) The number of identified de-glycopeptides at different NaIO4 concentrations for the conventional HC method and the PNP strategy. Tryptic peptides from mouse liver were used as the samples. The data were averaged from 3 replicate experiments (error bars represent the standard deviation).

In addition to the improvement in de-glycopeptide identification, it is also of interest to know whether many oxidized N-terminal Ser/Thr peptides are captured by the HC beads in the conventional and the improved HC methods. The enrichment of glycopeptides by HC beads was performed using the mouse liver sample as above. After PNGase F treatment and removal of the de-glycopeptides, the hydrazide beads from both the conventional HC method and the PNP strategy were subjected to hydroxylamine treatment. Because a hydrazone bond is formed when hydrazide reacts with the aldehyde on oxidized N-terminal Ser/Thr peptides, hydroxylamine can effectively cleave the hydrazone bond between these peptides and the beads to generate N-terminal oxime peptides. Following the hydroxylamine treatment, the supernatants were collected and analyzed by 1D LC-MS/MS. As shown in Fig. 8b, about 2000 unique peptides were identified from the supernatant of the conventional HC method beads, and about 80% of them were N-terminal Ser/Thr peptides. With the PNP strategy, in contrast, only about 100 unique peptides were identified and less than 1% of them were N-terminal Ser/Thr peptides. The hydroxylamine treatment experiment thus confirmed that peptides with N-terminal Ser/Thr were covalently captured by the HC beads in the conventional method.

In the conventional HC method, because an enormous number of N-terminal Ser and Thr peptides coexist with the glycopeptides, the total amount of groups to be oxidized (cis-diols on glycans and vicinal amino alcohols on N-terminal Ser/Thr peptides) is much higher than in the PNP strategy. The NaIO4 concentration used in the conventional HC method may therefore not be optimal for the PNP strategy. We performed a series of experiments to capture glycopeptides from aliquots of mouse liver protein digests oxidized with NaIO4 at 0.5, 1, 2, 5 and 10 mM for the PNP strategy and at 1, 2, 5, 10 and 20 mM for the conventional HC method. As shown in Fig. 8c, 1 mM NaIO4 is sufficient for carbohydrate oxidation of glycopeptides in the PNP strategy, while a higher concentration of NaIO4 is needed in the conventional HC method, which fits well with our expectation. It should be noted that a higher NaIO4 concentration could impair the identification of glycosites in both methods, partly because of potentially more severe side reactions on peptides containing Cys and Trp.

Comprehensive glycoproteomics analysis by the PNP strategy

We then performed a comprehensive glycoproteome analysis of mouse liver tissues using the improved HC method with the PNP strategy. For comparison, the conventional HC approach was also applied to the same sample. Briefly, each aliquot of 500 μg protein digest of mouse liver tissues was oxidized in 400 μL oxidation buffer containing 1 mM NaIO4 for the PNP strategy and 2 mM NaIO4 for the conventional HC strategy. Two replicate runs of SCX-RPLC MS/MS were then applied to analyze the enriched de-glycopeptides. The numbers of identified de-glycopeptides and glycosites are summarized in Fig. 9a. Compared with the conventional HC approach, the PNP strategy identified about 30% more de-glycopeptides. The ratio of identified N-terminal Ser/Thr de-glycopeptides reached 13.9%, which is close to its proteome level, whereas this ratio was only 1.0% when the conventional HC method was applied (Fig. 9a). By combining the results of the two replicates, a total of 1837 unique glycosites corresponding to 864 unique glycoproteins were identified by the PNP strategy, whereas only 1371 unique glycosites corresponding to 711 unique glycoproteins were identified by the conventional HC approach. The identification of 34.0% more glycosites by the PNP strategy clearly demonstrates that the new method surpasses the conventional method for comprehensive glycoproteomics analysis (Fig. 9b, Fig. S7).

Figure 9

Comparison of the glycoproteome datasets obtained by the PNP strategy and the conventional HC approach.

(a) Summary of the identification results. (b) Overlap of glycosites. (c) Overlap of glycoproteins.

Interestingly, the PNP strategy was found to be complementary to the conventional HC approach in glycoproteomics analysis. The overlap between the two technical replicates of each method was about 60 to 70%, while the overlap between the two methods was only about 50% (Fig. 9b, Fig. S7). In addition to the recovery of the N-terminal Ser/Thr de-glycopeptides, the change in fragmentation behavior caused by the labeling may also contribute to the high complementarity39. The glycoproteome coverage could thus be improved by combining these two methods. In total, 2108 unique glycosites corresponding to 976 unique glycoproteins were identified by combining the datasets of the two methods (Fig. 9b,c). The distribution of the number of glycosites on glycoproteins was investigated, and approximately 50% of the glycoproteins were found to carry a single N-linked sugar chain while 8.6% contained 5 or more N-glycosylation sites (Fig. S9). The average number of glycosites per glycoprotein increased from 1.9 with the conventional HC method to 2.1 with the PNP strategy, which indicates that the identification of glycosites on peptides with N-terminal Ser/Thr could significantly improve the glycoproteome coverage. The HC methods were found to be highly complementary to the HILIC method, as mentioned in the above section (Fig. S6). The HC methods and lectin methods were also highly complementary. As shown in Fig. S8, the overlap between the large-scale mouse liver glycoproteome obtained in this work and the dataset acquired by the lectin affinity method was low, indicating high complementarity of the identifications43. This is consistent with results reported in the literature30. Therefore, to achieve a more comprehensive glycoproteome analysis, the combined use of all the above methods is required.
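The counts quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes that the number of glycosites shared by the two methods follows from inclusion-exclusion on the reported totals, and that "overlap" is expressed relative to the combined dataset (the text does not state how the overlap percentage was defined). All function and variable names are illustrative.

```python
# Unique glycosite and glycoprotein counts as reported in the text.
pnp_sites, hc_sites, combined_sites = 1837, 1371, 2108
pnp_proteins, hc_proteins = 864, 711


def percent_increase(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


def shared_count(a, b, union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return a + b - union


shared_sites = shared_count(pnp_sites, hc_sites, combined_sites)
site_gain = percent_increase(pnp_sites, hc_sites)

# Assumption: overlap is taken relative to the combined (union) dataset.
overlap_percent = 100.0 * shared_sites / combined_sites

sites_per_protein_pnp = pnp_sites / pnp_proteins
sites_per_protein_hc = hc_sites / hc_proteins

print(f"shared glycosites: {shared_sites}")
print(f"glycosite gain of PNP over HC: {site_gain:.1f}%")
print(f"overlap relative to combined set: {overlap_percent:.1f}%")
print(f"glycosites per glycoprotein: PNP {sites_per_protein_pnp:.1f}, "
      f"HC {sites_per_protein_hc:.1f}")
```

Under these assumptions the reported totals are mutually consistent with the stated 34.0% gain, the roughly 50% overlap between methods, and the 1.9 and 2.1 glycosites per glycoprotein.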

Discussion

Investigating side reactions and finding ways to prevent them in peptide/protein derivatization experiments are of broad interest to the protein science community. Side reactions could be investigated by performing derivatization experiments individually with a series of synthesized oligopeptides. However, this conventional approach is time-consuming and labor-intensive. In this study, we present a high-throughput proteomics approach to achieve this goal. It is based on the fact that peptides with unknown modifications generally cannot be identified by a normal database search. If a side reaction occurs on a residue, the peptides containing that residue will be modified. Because such modified peptides fail to be identified, the occurrence frequency of that amino acid residue among the identified peptides must decrease. This gives a clue as to which type of amino acid residue might be modified. Using this approach, we found that mainly two types of side reactions occurred on peptide backbones during the glycan oxidation step of the HC method, namely oxidation of the Met, CamCys and Trp residues and aldehyde formation on N-terminal Ser/Thr residues. In principle, this approach is readily applicable to finding side reactions in other peptide derivatization experiments.
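The reasoning behind this screening approach can be illustrated with a short sketch. It computes, for each amino acid, the percentage of unique identified peptides that contain it, and flags residues whose percentage drops after a treatment. The peptide lists, the function names and the 5-point threshold `min_drop` are hypothetical and chosen only for illustration; the study itself does not specify a cut-off.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_occurrence(peptides):
    """Percentage of unique peptides that contain each residue at least once."""
    unique = set(peptides)
    if not unique:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    counts = Counter()
    for peptide in unique:
        # Count each residue type once per peptide.
        counts.update(set(peptide) & set(AMINO_ACIDS))
    return {aa: 100.0 * counts[aa] / len(unique) for aa in AMINO_ACIDS}


def depleted_residues(reference, treated, min_drop=5.0):
    """Residues whose occurrence falls by at least `min_drop` percentage points.

    `min_drop` is an illustrative threshold, not a value from the study.
    """
    ref = residue_occurrence(reference)
    trt = residue_occurrence(treated)
    return sorted(aa for aa in AMINO_ACIDS if ref[aa] - trt[aa] >= min_drop)


# Hypothetical toy data: Met- and Trp-containing peptides vanish after treatment.
reference = ["AMK", "GWR", "ACK", "LLK"]
treated = ["AAK", "GGR", "ACK", "LLK"]
print(depleted_residues(reference, treated))
```

In a real dataset the comparison would be made between peptides identified with and without the derivatization step, and a depleted residue is only a hint that needs confirmation, for example by searching with the suspected modification.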

We then presented approaches to prevent the loss of identifications resulting from such side reactions in the HC method. De-glycopeptides with oxidized Met, CamCys and Trp residues can be identified by setting variable modifications. However, de-glycopeptides with N-terminal Ser/Thr residues cannot be identified simply by setting a variable modification during the database search. This is because the aldehyde groups at the N-termini of such glycopeptides react with hydrazide groups, resulting in capture of these peptides on the HC beads. These peptides therefore cannot be released by PNGase F cleavage and fail to be identified by LC-MS/MS. To overcome this problem, we presented the PNP strategy, in which the N-terminal amino groups are blocked with dimethyl groups, which effectively prevents the oxidation of N-terminal Ser/Thr on peptides. As a result, de-glycopeptides with N-terminal Ser/Thr could be efficiently identified with the developed PNP strategy.

Peptides with N-terminal Ser/Thr residues represent roughly 15.2% of the proteome; however, at least 30% more de-glycopeptides were identified by the new strategy than by the conventional HC method. Clearly, in addition to the recovery of de-glycopeptides with N-terminal Ser/Thr, there are other reasons for the improvement. We believe that the improved enrichment specificity could be one of them. In addition to the oxidation of glycopeptides, many non-glycopeptides with N-terminal Ser/Thr could also be oxidized and coupled to the hydrazide beads in the conventional HC method. As illustrated in Fig. S5, the covalent binding of these peptides changes the bead surface from hydrophilic to hydrophobic and from neutral to charged. Many non-glycosylated peptides may therefore be adsorbed on the hydrazide beads through hydrophobic and/or charge-charge interactions, which compromises the enrichment specificity. In the newly developed PNP strategy, by contrast, these peptides are no longer oxidized and cannot be immobilized onto the beads, which results in much less non-specific adsorption of peptides. This explanation was confirmed by the hydroxylamine treatment experiment (Fig. 8b), which was performed after the enzymatic release of glycopeptides by PNGase F. Among the 2044 unique peptides identified from the hydroxylamine treatment of the HC beads of the conventional method, about 80% were N-terminal Ser/Thr peptides, whereas the corresponding number and percentage were 113 and less than 1% with the PNP strategy. Clearly, a large number of N-terminal Ser/Thr peptides were captured onto the HC beads in the conventional method, as expected. Many more peptides with N-terminal amino acids other than Ser/Thr were also identified in the conventional method (380 vs 112 in the new method). These peptides were retained on the hydrazide beads mainly through non-specific interaction with the captured N-terminal Ser/Thr peptides on the beads. The PNP approach is thus able to reduce the non-specific binding. As a result, the PNP approach yields a higher specificity (78.6%) than the conventional HC method (66.3%) in glycoproteome analysis.
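The percentages quoted for the hydroxylamine experiment follow directly from the peptide counts given above. The short sketch below recomputes them; it assumes that every peptide not counted among the "other N-terminal amino acids" group is an N-terminal Ser/Thr peptide, and the function name is illustrative.

```python
def ser_thr_percentage(total_peptides, other_n_terminal):
    """Share of peptides with N-terminal Ser/Thr, in percent.

    Assumes all peptides not in the 'other N-terminus' group start with Ser/Thr.
    """
    return 100.0 * (total_peptides - other_n_terminal) / total_peptides


# Counts reported for the hydroxylamine-released peptides.
conventional_hc = ser_thr_percentage(total_peptides=2044, other_n_terminal=380)
pnp_strategy = ser_thr_percentage(total_peptides=113, other_n_terminal=112)

print(f"conventional HC: {conventional_hc:.1f}% N-terminal Ser/Thr peptides")
print(f"PNP strategy:    {pnp_strategy:.1f}% N-terminal Ser/Thr peptides")
```

The results agree with the "about 80%" and "less than 1%" figures stated in the text.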

It should be noted that a higher NaIO4 concentration could impair the identification of de-glycopeptides in both approaches, because more severe side reactions could occur when a higher NaIO4 concentration is used. As glycoproteomics analysis could be performed using only 1 mM NaIO4 for carbohydrate oxidation in the developed PNP strategy, the potential effect of side reactions was reduced significantly. Although peptides with oxidized Met, Trp and CamCys could be identified by setting variable modifications in the database search, this significantly increases the search space and decreases the identification sensitivity. Variable modifications should therefore be set with caution: they should be adopted only when the extent of the side reaction is significant and setting the modification clearly increases the number of peptide identifications. Based on the data obtained in this study, we believe that setting oxidation of Met as a variable modification in database searching is essential, while setting oxidation of CamCys and Trp is necessary only when a high NaIO4 concentration is used.

Methods

Ethics statement

This study was approved by the Dalian Institute of Chemical Physics Ethics Committee. All experiments were performed in accordance with the Ethics Committee’s guidelines and regulations.

Reagents

Formic acid (FA) and sodium cyanoborohydride (NaBH3CN) were provided by Fluka (Buchs, Switzerland). Acetonitrile (ACN, HPLC grade) was purchased from Merck (Darmstadt, Germany). All other chemicals and reagents were purchased from Sigma (St. Louis, MO). Sep-Pak C18 cartridges were provided by Waters (Milford, MA). Fused silica capillaries with 75 μm i.d. and 200 μm i.d. were obtained from Polymicro Technologies (Phoenix, AZ). All water used in the experiments was prepared using a Milli-Q system (Millipore, Bedford, MA).

Protein Sample Preparation

Adult female C57 mice were purchased from Dalian Medical University (Dalian, China). As described in our previous work44,45, the liver tissues were lysed in a homogenization buffer consisting of 8 M urea, 1% (v/v) Triton X-100, 65 mM DTT, 1 mM EDTA, 0.5 mM EGTA, 1 mM PMSF, 10 μL of protease inhibitor cocktail per 1 mL of homogenization buffer, phosphatase inhibitors (1 mM NaF, 1 mM Na3VO4, 1 mM C3H7Na2O6P, 10 mM Na4O7P2) and 40 mM Tris-HCl at pH 7.4. The protein concentration was determined by Bradford assay. The extracted proteins were precipitated by chloroform/methanol precipitation. After washing with methanol, the pellets were resuspended in denaturing buffer containing 100 mM triethylammonium bicarbonate (TEAB, a versatile buffer compatible with the digestion and the dimethyl labeling reaction; pH 8.0) and 8 M urea. The protein concentration was determined again by Bradford assay.

Protein Digestion and Peptide Dimethyl Labeling

The proteins were reduced by DTT at 37 °C for 2 h and alkylated by iodoacetamide in the dark at room temperature for 40 min. The solutions were then diluted to 1 M urea with 100 mM TEAB (pH 8.0), and trypsin was added at a trypsin-to-protein weight ratio of 1/25 and incubated at 37 °C overnight. The resulting peptide solutions were stored at -80 °C. To the resulting tryptic digest (1 mg in 1 mL of 100 mM TEAB solution), 100 μL of CH2O (4%, v/v) was added, followed by 100 μL of freshly prepared NaBH3CN (0.6 M). The mixture was incubated for 1 h at room temperature. Then, 20 μL of ammonia (10%) was added to consume the excess labeling reagents. After the labeled peptide mixture was acidified by the addition of 10 μL of FA, it was desalted on a solid phase extraction (SPE) column and dried down in a Speed Vac.
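For orientation, the mass added by this labeling step can be estimated as follows. This is a minimal sketch under stated assumptions that are not spelled out in the protocol: unlabeled (light) formaldehyde and cyanoborohydride are used, so each primary amine gains two methyl groups in place of two hydrogens (net +C2H4); labeling is complete; and the peptide N-terminus is a primary amine (an N-terminal proline, which can only be monomethylated, is not handled). The element masses are standard monoisotopic values and the function names are illustrative.

```python
# Monoisotopic masses (Da) of the elements involved.
MASS_C = 12.0
MASS_H = 1.00782503

# Light dimethylation: two H replaced by two CH3, a net gain of C2H4 per amine.
DIMETHYL_SHIFT = 2 * MASS_C + 4 * MASS_H


def dimethyl_sites(peptide):
    """Number of primary amines: the N-terminus plus every lysine side chain."""
    return 1 + peptide.count("K")


def dimethyl_mass_shift(peptide):
    """Total mass added to a fully labeled peptide, in Da.

    Assumes complete labeling and a primary N-terminal amine
    (N-terminal proline is not treated specially here).
    """
    return dimethyl_sites(peptide) * DIMETHYL_SHIFT


# Hypothetical tryptic peptide with one lysine.
example = "SLNGTK"
print(f"{example}: {dimethyl_sites(example)} labels, "
      f"+{dimethyl_mass_shift(example):.4f} Da")
```

A tryptic peptide ending in lysine therefore carries two labels, while one ending in arginine carries only the N-terminal label.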

Glycopeptide Enrichment

The glycopeptide enrichment by the HC method was similar to that reported by Tian et al.21. Briefly, 1 mg of the dried tryptic peptides was reconstituted in oxidation buffer (100 mM NaAc, 150 mM NaCl, pH 5.5), and NaIO4 was added to final concentrations ranging from 1 mM to 20 mM. The reaction was kept in the dark for 1 h and quenched by adding sodium thiosulfate to a final concentration twice that of NaIO4. The oxidized peptides were then coupled to 50 μL Affi-Gel Hz hydrazide beads (slurry volume) (Bio-Rad, USA) overnight at room temperature. The glycopeptide-coupled beads were washed extensively and sequentially with sodium chloride (1.5 M), ACN/H2O (80/20, v/v) and ammonium bicarbonate (100 mM, pH 8.3). The beads were resuspended in a minimum volume of ammonium bicarbonate solution (25 mM, pH 7.5) with 3 μL (500 U/μL) of PNGase F (New England Biolabs) and incubated overnight at 37 °C. This deglycosylation leaves a 0.9858 Da mass shift on the previously glycosylated site, which facilitates the identification of glycosylated sites by MS (it should be noted that chemical deamidation can also generate a 0.9858 Da mass shift on non-glycosylated asparagine and glutamine, which may lead to false positive identifications46). Deglycosylated glycopeptides washed and dispersed in 80% ACN/2% TFA (v/v, 400 μL). Then 1 mg of mouse liver glycoprotein digest was dissolved in 80% ACN/2% TFA (v/v, 100 μL) and the solution was added to the HILIC resin suspension in the centrifuge tube. After incubation under gentle agitation at room temperature for 10 min, the supernatant was discarded after centrifugation. The HILIC resin was washed three times with 80% ACN/0.1% TFA (v/v, 400 μL) to remove the non-glycopeptides. The glycopeptides were then eluted sequentially with 100 μL of 0.1% TFA (v/v), 30% ACN/0.1% TFA (v/v) and 0.1% TFA (v/v). The eluted peptides were collected and combined. After lyophilization, 100 μL of 20 mM NH4HCO3 containing 500 U PNGase F was added and incubated at 37 °C overnight for deglycosylation of the glycopeptides.

Online RP-SCX-RP Multidimensional Separation and Mass Spectrometry Analysis

The lyophilized de-glycopeptides were resuspended in 0.1% FA. The automated sample injection and multidimensional separation system (RP-SCX-RP) was constructed as previously described47, and the RP segment of the RP-SCX biphasic column was used as the sample loading column to reduce sample loss. The resuspended de-glycopeptides were loaded onto the biphasic column, and a 120 min RP gradient nanoflow LC-MS/MS run (0 mM) was applied first to transfer the peptides retained on the RP segment to the SCX monolithic segment of the biphasic column. A series of stepwise elutions with salt concentrations of 50, 100, 200, 300, 400, 500 and 1000 mM NH4Ac (pH 2.7) was then used to elute peptides from the SCX monolithic column onto an in-house packed C18 separation column (75 μm i.d., 15 cm length; 3 μm, 120 Å). Each salt step lasted 10 min, followed by 15 min of equilibration with 0.1% FA in water.

RPLC-MS/MS analysis was performed using a quaternary Surveyor MS pump (Thermo, San Jose, CA) and an LTQ-Orbitrap Velos (Thermo, San Jose, CA). For the RPLC separation, 0.1% FA in water and 0.1% FA in acetonitrile were used as mobile phases A and B, respectively, and the flow rate was adjusted to ~300 nL/min after splitting. The 200 min gradient elution consisted of 0-3% B in 5 min, 3-25% B in 145 min, 25-35% B in 10 min, 35-80% B in 3 min, 80% B for 7 min, 80-100% B in 3 min and 100% B for 27 min. Other gradients for 1D LC-MS/MS analysis were based on the 200 min gradient, with the 3-25% B segment lengthened or shortened as needed. A spray voltage of 2.2 kV was applied between the spray tip and the MS interface. The temperature of the ion transfer capillary was 250 °C. The mass spectrometer was operated in data-dependent MS/MS acquisition mode. Full mass scans were acquired in the Orbitrap analyzer from m/z 400 to 2000 (R = 60000 at m/z 400). The 20 most intense ions from the full scan were selected for fragmentation by collision-induced dissociation (CID) in the LTQ. The dynamic exclusion function was set as follows: repeat count 2, repeat duration 30 s and exclusion duration 60 s.
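The gradient program can be written down as a small table and checked for internal consistency. The sketch below transcribes the segments from the text, verifies that they add up to 200 min, and shows one way to derive a shorter or longer program by changing only the 3-25% B segment, as the text describes. The data layout and function names are illustrative, and the 120 min example is an assumed target length used only to demonstrate the rescaling.

```python
# (start %B, end %B, duration in min), transcribed from the text.
GRADIENT_200_MIN = [
    (0, 3, 5),
    (3, 25, 145),
    (25, 35, 10),
    (35, 80, 3),
    (80, 80, 7),
    (80, 100, 3),
    (100, 100, 27),
]


def total_minutes(segments):
    """Total run time of a gradient program."""
    return sum(duration for _, _, duration in segments)


def is_continuous(segments):
    """True if each segment starts at the %B where the previous one ended."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(segments, segments[1:]))


def rescale_main_segment(segments, total_min, main=(3, 25)):
    """Return a copy whose `main` segment is resized so the run lasts `total_min`."""
    fixed = sum(d for start, end, d in segments if (start, end) != main)
    main_duration = total_min - fixed
    if main_duration <= 0:
        raise ValueError("target run time is too short for the fixed segments")
    return [
        (start, end, main_duration if (start, end) == main else d)
        for start, end, d in segments
    ]


print(total_minutes(GRADIENT_200_MIN), is_continuous(GRADIENT_200_MIN))
print(rescale_main_segment(GRADIENT_200_MIN, 120))
```

With all other segments fixed at 55 min in total, a 120 min program would leave 65 min for the 3-25% B segment.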

All MALDI-TOF mass spectra were obtained on an AB Sciex 5800 MALDI-TOF/TOF mass spectrometer (AB Sciex, CA) equipped with a pulsed Nd:YAG laser at 355 nm. DHB (2,5-dihydroxybenzoic acid, 25 mg/mL in ACN/H2O/H3PO4 (70/29/1, v/v/v)) was used as the matrix for the analysis of peptides. Peptide sample solutions (0.5 μL) were deposited on the MALDI plate and dried at room temperature, and then DHB matrix solution (0.5 μL) was deposited.

Protein and Peptide Identification

All MS/MS spectra were searched using MaxQuant version 1.3.0.05 against a composite International Protein Index (IPI) database (IPI mouse 3.87). Carbamidomethylation of cysteine (C, +57.0215 Da) was set as a fixed modification for all searches. One or more of the following variable modifications were set: deamidation (N, +0.9858 Da) and oxidation of methionine (M, +15.9949 Da), cysteine (C, +15.9949 Da) and tryptophan (W, +15.9949, +31.9898 Da). For the identification of peptides released in the hydroxylamine treatment experiment, oximation of N-terminal serine (-16.0313 Da) and threonine (-30.0470 Da) was set as a variable modification. To identify peptides from the PNP strategy, a single-plex label of dimethylation on lysine and peptide amino termini was set. Trypsin was set as the specific proteolytic enzyme with up to two missed cleavages allowed. The mass tolerance was set to 10 ppm for the precursor ion and 0.8 Da for the fragment ions. Peptide identifications with a false discovery rate ≤0.01 were accepted for protein identification. Only the identified deamidation sites that conformed to the N-glycosylation consensus sequence (N-!P-[S/T], N-X-C) were considered as glycosites43. Each identified de-glycopeptide had to include at least one glycosite as defined above. The amino acid occurrence frequency distributions were determined from the identified unique peptides. Specifically, the percentage of peptides containing each type of amino acid residue was determined by dividing the number of unique peptides containing that residue by the total number of identified unique peptides.
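The consensus-sequence filter described above can be expressed as a regular expression. The following is a minimal sketch rather than the pipeline actually used: it reads N-!P-[S/T] as Asn followed by any residue except Pro and then Ser or Thr, and N-X-C as Asn followed by any residue and then Cys, exactly as written in the text. A lookahead is used so that overlapping sequons are all reported. The function names and example sequences are hypothetical.

```python
import re

# N-!P-[S/T] or N-X-C, matched with a lookahead so overlapping sequons are found.
# Assumption: X in N-X-C is any residue, as the motif is written in the text.
SEQUON = re.compile(r"(?=N(?:[^P][ST]|.C))")


def sequon_positions(sequence):
    """1-based positions of Asn residues that start a consensus sequence."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def is_candidate_glycosite(sequence, position):
    """True if the deamidated Asn at `position` (1-based) lies in a sequon."""
    return position in sequon_positions(sequence)


# Hypothetical sequences for illustration.
print(sequon_positions("ANGTK"))    # Asn followed by Gly-Thr
print(sequon_positions("ANPTK"))    # Pro in the middle position is excluded
print(sequon_positions("NACNNSS"))  # overlapping sequons
```

Because a sequon can extend beyond the C-terminus of a peptide, the check is best applied to the protein sequence around the site rather than to the peptide alone.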

Additional Information

How to cite this article: Huang, J. et al. A peptide N-terminal protection strategy for comprehensive glycoproteome analysis using hydrazide chemistry based method. Sci. Rep. 5, 10164; doi: 10.1038/srep10164 (2015).