Managing NGS Differential Expression Uncertainty with Fuzzy Sets

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9874)

Abstract

High-performance Next-Generation Sequencing (NGS) has become a widely used technology for case-control comparison studies of RNA transcripts, such as mRNAs and small non-coding RNAs. The first step of the analysis is mapping NGS reads against a reference database, and a critical issue emerges in this phase: the problem of multireads, that is, reads that map to more than one reference sequence. In this paper we present a novel approach to represent and quantify read mapping ambiguities through the use of fuzzy sets and possibility theory. The aim of this work is to obtain a list of candidate differential expression events, together with a description of the uncertainty of the results due to the presence of multireads. In a preliminary experiment on HeLa cells, the method correctly detected the possibility of false positives, while in a case-control study of human endobronchial biopsies, the method identified 11 genes with possibly different expression, four of them with an uncertain fold change. This last result was confirmed by FDR-adjusted Fisher's test, while DESeq2 did not report significant differences between case and control.

Notes

  1. \(A'=A-1\) if \(A>0\), otherwise \(A'=A=0\).

  2. A standard application of the extension principle to the fold change results in a fuzzy set with a complex membership function, which requires complex computations without any real benefits.

  3. Given two expressions \(e_1\) and \(e_2\) of a gene in two samples, the MA-plot places the gene on a plane (M, A) where \(M=\log _2 ({e_1}/{e_2})\) (the fold change) and \(A = (1/2)\log _2(e_1e_2)\) (average intensity).

  4. The centroid is computed with the constraint of falling inside the interval [B, C].

  5. The boundaries are estimated as hyperbolas, with their parameters fitted on the dataset; their horizontal asymptotes represent a limit fold change value under which differential expression loses significance.

  6. The possibility measure between two fuzzy sets \(F_1\) and \(F_2\) is defined as \(\varPi (F_1, F_2)=\max _x \min \{F_1(x), F_2(x)\}\).
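
The formulas in notes 3 and 6 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the trapezoidal shape of the fuzzy sets, and the evaluation of the maximum on a finite grid of sample points are all hypothetical choices made here. Only the definitions of M, A, and the possibility measure are taken from the notes.

```python
import math


def ma_coordinates(e1, e2):
    """Return (M, A) for two positive expression values, as defined in note 3."""
    if e1 <= 0 or e2 <= 0:
        raise ValueError("expression values must be positive to take log2")
    m = math.log2(e1 / e2)        # fold change
    a = 0.5 * math.log2(e1 * e2)  # average intensity
    return m, a


def trapezoid(a, b, c, d):
    """Hypothetical trapezoidal membership function (assumed shape).

    Membership is 1 on [b, c], rises linearly on (a, b),
    falls linearly on (c, d), and is 0 elsewhere.
    """
    def mu(x):
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu


def possibility(f1, f2, xs):
    """Approximate Pi(F1, F2) = max_x min{F1(x), F2(x)} on the sample points xs."""
    return max(min(f1(x), f2(x)) for x in xs)


if __name__ == "__main__":
    # Expression values 8 and 2: M = log2(4), A = 0.5 * log2(16).
    print(ma_coordinates(8.0, 2.0))

    # Two overlapping trapezoids; the grid resolution is an arbitrary choice.
    f1 = trapezoid(0.0, 1.0, 2.0, 3.0)
    f2 = trapezoid(2.0, 3.0, 4.0, 5.0)
    grid = [i / 100 for i in range(501)]
    print(possibility(f1, f2, grid))
```

Because the maximum is taken over a finite grid, the result is a lower approximation of the possibility measure; a finer grid, or an analytic intersection of the two membership functions, would be needed when the exact value matters.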

Acknowledgments

We thank Dr. Flavio Licciulli, Dr. Mariano Caratozzolo and Dr. Flaviana Marzano for their suggestions and help with NGS data elaboration. A.C. is supported by Progetto MICROMAP PON01_02589.

Corresponding author

Correspondence to Arianna Consiglio.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Consiglio, A., Mencar, C., Grillo, G., Liuni, S. (2016). Managing NGS Differential Expression Uncertainty with Fuzzy Sets. In: Angelini, C., Rancoita, P., Rovetta, S. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2015. Lecture Notes in Computer Science, vol 9874. Springer, Cham. https://doi.org/10.1007/978-3-319-44332-4_4

  • DOI: https://doi.org/10.1007/978-3-319-44332-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44331-7

  • Online ISBN: 978-3-319-44332-4

  • eBook Packages: Computer Science, Computer Science (R0)
