A Machine Learning Pipeline for Discriminant Pathways Identification

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2011)

Abstract

Identifying the molecular pathways more prone to disruption during a pathological process is a key task in network medicine and, more generally, in systems biology. In this work we propose a pipeline that couples a machine learning solution for molecular profiling with a recent network comparison method. The pipeline can identify changes occurring between specific sub-modules of networks built in a case-control biomarker study, discriminating key groups of genes whose interactions are modified by an underlying condition. The proposal is independent of the classification algorithm used. Two applications on genome-wide data are presented, concerning children’s susceptibility to air pollution and early and late onset of Parkinson’s disease.
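
The network comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the inference or comparison method, so the sketch assumes a correlation-based weighted co-expression network and a simple Euclidean distance between sorted Laplacian spectra. The function names, the `power` parameter, and the `modules` mapping (pathway name to gene column indices) are all hypothetical.

```python
import numpy as np

def coexpression_adjacency(expr, power=6):
    """Weighted adjacency |corr|^power with a zero diagonal.

    expr: array of shape (samples, genes). The soft-threshold exponent
    `power` is an illustrative assumption, not a value from the paper.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** power
    np.fill_diagonal(adj, 0.0)
    return adj

def laplacian_spectrum(adj):
    """Sorted eigenvalues of the graph Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(lap))

def spectral_distance(adj_a, adj_b):
    """Euclidean distance between Laplacian spectra of two equal-size networks."""
    diff = laplacian_spectrum(adj_a) - laplacian_spectrum(adj_b)
    return float(np.linalg.norm(diff))

def rank_modules(expr_cases, expr_controls, modules):
    """Rank gene modules by how much their case and control networks differ.

    modules: dict mapping a module name to a list of gene column indices.
    Returns (name, distance) pairs, largest distance first.
    """
    scores = {}
    for name, idx in modules.items():
        net_cases = coexpression_adjacency(expr_cases[:, idx])
        net_controls = coexpression_adjacency(expr_controls[:, idx])
        scores[name] = spectral_distance(net_cases, net_controls)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch the gene modules would come from an upstream step (for example, enrichment of the features selected by a classifier), which is why the comparison itself does not depend on the classification algorithm.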

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barla, A. et al. (2012). A Machine Learning Pipeline for Discriminant Pathways Identification. In: Biganzoli, E., Vellido, A., Ambrogi, F., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2011. Lecture Notes in Computer Science, vol 7548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35686-5_4

  • DOI: https://doi.org/10.1007/978-3-642-35686-5_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35685-8

  • Online ISBN: 978-3-642-35686-5

  • eBook Packages: Computer Science, Computer Science (R0)
