RNA-Seq Data Analysis: From Raw Data Quality Control to Differential Expression Analysis

Plant Germline Development

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1669))

Abstract

As a revolutionary technology for the life sciences, RNA-seq has many applications, and its computational pipeline also has many variations. Here, we describe a protocol for RNA-seq data analysis whose aim is to identify differentially expressed genes in a comparison of two conditions. The protocol follows recently published best practices for RNA-seq data analysis and applies quality checkpoints throughout the analysis to ensure reliable data interpretation. It is written to help new RNA-seq users understand the basic steps necessary to analyze an RNA-seq dataset properly. An extension of the protocol has been implemented as automated workflows in the R package ezRun, which is also available in the data analysis framework SUSHI, for reliable, repeatable, and easily interpretable analysis results.
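To make the goal of a two-condition comparison concrete, the following is a minimal sketch in plain Python. It is not the protocol's method and not the ezRun implementation: it only scales a toy count matrix to counts per million and computes a log2 fold change per gene, without any statistical test. The gene names, counts, function names, and pseudocount are all hypothetical and chosen for illustration.

```python
import math


def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total number of counts in each sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }


def log2_fold_change(cpm, group_a, group_b, pseudocount=1.0):
    """Return log2(mean of group_b / mean of group_a) for every gene.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are not detected in one of the conditions.
    """
    result = {}
    for gene, values in cpm.items():
        mean_a = sum(values[j] for j in group_a) / len(group_a)
        mean_b = sum(values[j] for j in group_b) / len(group_b)
        result[gene] = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return result


# Hypothetical toy data: columns 0-1 are condition A, columns 2-3 are condition B.
toy_counts = {
    "geneA": [100, 120, 400, 380],
    "geneB": [500, 480, 500, 520],
    "geneC": [400, 400, 100, 100],
}

cpm = counts_per_million(toy_counts)
lfc = log2_fold_change(cpm, group_a=[0, 1], group_b=[2, 3])

for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.2f}")
```

A real analysis would replace the fold change with a model-based test that accounts for biological variability between replicates; this sketch only shows the shape of the input and of the comparison.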



Author information

Corresponding author

Correspondence to Weihong Qi.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Qi, W., Schlapbach, R., Rehrauer, H. (2017). RNA-Seq Data Analysis: From Raw Data Quality Control to Differential Expression Analysis. In: Schmidt, A. (eds) Plant Germline Development. Methods in Molecular Biology, vol 1669. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7286-9_23

  • DOI: https://doi.org/10.1007/978-1-4939-7286-9_23

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7285-2

  • Online ISBN: 978-1-4939-7286-9

  • eBook Packages: Springer Protocols
