Carbohydrate Structure Database: current state and recent developments

  • Research Paper
  • Analytical and Bioanalytical Chemistry

Abstract

The Carbohydrate Structure Database (CSDB) is a curated glycan data collection and a glycoinformatic platform. This report reviews its database, analytical, and other components that have appeared in recent years. The major improvements are close-to-full coverage of glycans from microorganisms and the launch of modules for glycosyltransferases and saccharide conformations, an online glycan builder and 3D modeler, an NMR simulator, an NMR-based structure predictor, and other tools.

Abbreviations

API: Application programming interface
CASPER: Computer-assisted spectrum evaluation of regular polysaccharides
CAZY: Carbohydrate-active enzyme
COSY: Correlation spectroscopy
CSDB: Carbohydrate Structure Database
GLC: Gas-liquid chromatography
GT: Glycosyltransferase
GODDESS: Glycan-optimized database-driven empirical spectrum simulation
GRASS: Generation, ranking, and assignment of saccharide structures
HMBC: Heteronuclear multiple bond correlation
HOSE: Hierarchical organization of spherical environment
HSQC: Heteronuclear single-quantum coherence
ICD: International Classification of Diseases
IEDB: Immune Epitope Database
KEGG: Kyoto Encyclopedia of Genes and Genomes
MSDB: Monosaccharide Database
MESH: Medical subject headings
NLM: National Library of Medicine
PMID: PubMed identifier
RDF: Resource description framework
ROESY: Rotating frame Overhauser effect spectroscopy
SMILES: Simplified molecular-input line-entry system
SNFG: Symbolic nomenclature for glycans
TOCSY: Total correlation spectroscopy
PDB: Protein Data Bank
WURCS: Web3 unique representation of carbohydrate structures

Acknowledgements

The author thanks all members of the CSDB team involved in the CSDB product life cycle. The participants are listed at http://csdb.glycoscience.ru/help/credits.html.

Funding

The author declares no funding in 2023-2024.

Author information

Corresponding author

Correspondence to Philip Toukach.

Ethics declarations

Ethics approval

The reported work did not include any chemical or biological experiments, except those performed in silico.

Conflict of interest

The author declares no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Published in the topical collection featuring Current Progress in Glycosciences and Glycobioinformatics with guest editors Joseph Zaia and Kiyoko F. Aoki-Kinoshita.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Toukach, P. Carbohydrate Structure Database: current state and recent developments. Anal Bioanal Chem (2024). https://doi.org/10.1007/s00216-024-05383-w

  • DOI: https://doi.org/10.1007/s00216-024-05383-w
