Abstract
Since the first bacterial genomes were completely sequenced, the surge in genome sequence data has overwhelmed the scientific community’s efforts towards elucidating protein function. Computational methods have made it possible to work with sequences from complete genomes and proteomes, and inference of protein function by exploiting direct sequence similarity indeed goes a long way in describing a proteome’s functional capacity. However, at least 40% of the gene products in newly sequenced genomes typically remain uncharacterised. Proteins without an annotated function are also known as orphan proteins since they do not belong to a functionally characterised protein family. Many sequences must, therefore, be compared using their features rather than by direct comparison in the conventional sequence space. Here we focus on one such feature — glycosylation — that is common in eukaryotic proteomes.
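As a rough illustration of comparing sequences by features rather than by direct alignment, the sketch below derives a small feature vector from each protein sequence and measures distance in that feature space. It is not the authors' method. The function names, the choice of two features (density of N-linked sequons and the Ser/Thr fraction), and the Euclidean distance are all assumptions made for this example. The only biology it relies on is the well-known N-glycosylation sequon Asn-Xaa-Ser/Thr, where Xaa is any residue except proline.

```python
import re

# N-linked glycosylation sequon: Asn-Xaa-Ser/Thr, with Xaa != Pro.
# The lookahead keeps the match zero-width after Asn, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(seq):
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def feature_vector(seq):
    """Toy feature vector (hypothetical choice of features):
    (sequon density, Ser/Thr fraction)."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return (0.0, 0.0)
    sequon_density = len(sequon_positions(s)) / n
    ser_thr_fraction = (s.count("S") + s.count("T")) / n
    return (sequon_density, ser_thr_fraction)


def feature_distance(seq_a, seq_b):
    """Euclidean distance between two sequences in feature space."""
    fa, fb = feature_vector(seq_a), feature_vector(seq_b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) ** 0.5
```

Two sequences with no detectable sequence similarity can still lie close together under such a measure. A real predictor would use far richer features and a trained model rather than a fixed motif.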
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Gupta, R., Jensen, L.J., Brunak, S. (2002). Orphan Protein Function and Its Relation to Glycosylation. In: Mewes, HW., Seidel, H., Weiss, B. (eds) Bioinformatics and Genome Analysis. Ernst Schering Research Foundation Workshop, vol 38. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04747-7_13
DOI: https://doi.org/10.1007/978-3-662-04747-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-04749-1
Online ISBN: 978-3-662-04747-7
eBook Packages: Springer Book Archive