A Survey of Bioinformatics-Based Tools in RNA-Sequencing (RNA-Seq) Data Analysis

Translational Bioinformatics and Its Application

Part of the book series: Translational Medicine Research (TRAMERE)

Abstract

RNA sequencing (RNA-Seq) illustrates the capability of next-generation sequencing: it addresses transcriptome complexity in a powerful and cost-effective way. The technique has emerged as a revolutionary tool, with higher sensitivity and accuracy than older techniques. It also offers an unprecedented ability to detect novel mRNA transcripts and ncRNAs and to analyze alternative splicing. Because it is a high-throughput sequencing technique, it places great demand on bioinformatics-based analysis of the generated data. Here, we explain how RNA-Seq data can be analyzed, discuss the challenges involved, and provide an overview of data analysis methods and tools. We discuss strategies for quality checking, mapping, and differential expression analysis of transcriptomic data, along with recently developed strategies for alternative splicing analysis and isoform quantification. We also mention some useful R/Bioconductor packages for these tasks.



Acknowledgments

The authors thank the University Grants Commission, India, for its support. They are also grateful to Nimisha Chaturvedi, Dr. Raghvendra Singh, and Swadha Singh for their valuable suggestions for improving this chapter.

Author information

Corresponding author

Correspondence to Pallavi Gaur.

Copyright information

© 2017 Shanghai Jiao Tong University Press, Shanghai and Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Gaur, P., Chaturvedi, A. (2017). A Survey of Bioinformatics-Based Tools in RNA-Sequencing (RNA-Seq) Data Analysis. In: Wei, DQ., Ma, Y., Cho, W., Xu, Q., Zhou, F. (eds) Translational Bioinformatics and Its Application. Translational Medicine Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-1045-7_10
