Using GenBank and SRA

Plant Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2443))

Abstract

GenBank® and the Sequence Read Archive (SRA) are comprehensive databases of publicly available DNA sequences. GenBank contains data for 480,000 named organisms, more than 176,000 of them within the Embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. SRA contains reads from next-generation sequencing studies of over 110,000 species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage for both databases. GenBank and SRA data are accessible through the NCBI Entrez retrieval system, which integrates them with other NCBI data such as genomes, taxonomy, and the biomedical literature. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. This chapter discusses usage scenarios for both GenBank and SRA, ranging from local and cloud analyses to online analyses supported by the NCBI web-based tools. Both GenBank and SRA, along with their related retrieval and analysis services, are available from the NCBI homepage at www.ncbi.nlm.nih.gov.
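
As a rough illustration of programmatic Entrez access, the sketch below builds request URLs for the public NCBI E-utilities endpoints (esearch and efetch). It is a minimal example under stated assumptions, not the chapter's own procedure: the helper names `esearch_url` and `efetch_url`, the search terms, and the accession U49845 are illustrative choices that do not come from the abstract. The code only constructs URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service (the programmatic side of Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL that looks up record IDs matching an Entrez query.

    db     -- Entrez database name, e.g. "nuccore" (GenBank nucleotide) or "sra"
    term   -- Entrez query string, e.g. "Zea mays[Organism]"
    retmax -- maximum number of IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, ids, rettype="gb", retmode="text"):
    """Build an efetch URL that retrieves full records for the given IDs.

    ids     -- list of accessions or UIDs (hypothetical inputs for this sketch)
    rettype -- "gb" requests the GenBank flat-file format for nuccore records
    """
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query: SRA entries for maize (assumed example, not from the source).
    print(esearch_url("sra", "Zea mays[Organism]", retmax=5))
    # Illustrative accession: a single GenBank nucleotide record.
    print(efetch_url("nuccore", ["U49845"]))
```

The resulting URLs can be opened in a browser or passed to any HTTP client. For anything beyond occasional requests, NCBI's usage guidelines for E-utilities (request rate limits, API keys) should be consulted first.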

Acknowledgements

Funding for this work was provided by the Intramural Research Program of the National Institutes of Health, National Library of Medicine.

Author information

Corresponding author

Correspondence to Eric W. Sayers.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Sayers, E.W., O’Sullivan, C., Karsch-Mizrachi, I. (2022). Using GenBank and SRA. In: Edwards, D. (eds) Plant Bioinformatics. Methods in Molecular Biology, vol 2443. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2067-0_1

  • DOI: https://doi.org/10.1007/978-1-0716-2067-0_1

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2066-3

  • Online ISBN: 978-1-0716-2067-0

  • eBook Packages: Springer Protocols
