Iterative Clustering for Differential Gene Expression Analysis

Bioinformatics and Biomedical Engineering (IWBBIO 2022)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13347)

Abstract

Next Generation Sequencing technologies produce large volumes of DNA-seq and RNA-seq data. A central task in analyzing these data is selecting the differentially expressed genes. The methods proposed so far for identifying genes distinguished by their expression levels in RNA-seq data rely mainly on statistical analysis, and they do not agree: different methods produce different results. This paper proposes a new method for differential gene expression analysis based on a machine learning approach. The difficulty of selection caused by the large number of indistinguishable genes is addressed by an iterative clustering procedure. The importance of a proper cluster distance measure is discussed. The procedure's easily visualized results and its ability to find a varying number of compact clusters are also highlighted. The significance of the method is investigated and demonstrated by applying it to a dataset of two mouse strains. The results are compared with those of statistical methods applied to the same dataset. It is concluded that the proposed method is valuable and can be applied either standalone or for preliminary gene selection within a pipeline of statistical algorithms for discovering differentially expressed genes.
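
The abstract does not spell out the iterative clustering procedure, so the following is only an illustrative sketch of one possible reading, not the paper's algorithm. It assumes each gene is a point whose two coordinates are its expression levels in the two strains, that a DBSCAN-style density clustering with Euclidean distance is used, and that each round sets aside the genes falling outside the largest cluster (taken here as the bulk of indistinguishable genes) before re-clustering that bulk with a tighter radius. The function names, the parameters (eps, min_pts, shrink, max_rounds) and the toy data are all hypothetical.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within Euclidean distance eps of points[i]."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN-style clustering; returns one label per point, -1 = noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


def iterative_candidates(genes, eps, min_pts, shrink=0.5, max_rounds=3):
    """Assumed iteration: peel off genes outside the largest cluster each round."""
    remaining = dict(genes)
    candidates = []
    for _ in range(max_rounds):
        names = list(remaining)
        labels = dbscan([remaining[n] for n in names], eps, min_pts)
        clustered = [lab for lab in labels if lab != -1]
        if not clustered:
            break  # radius too tight: no dense bulk left to refine
        bulk = max(set(clustered), key=clustered.count)
        outside = [n for n, lab in zip(names, labels) if lab != bulk]
        if not outside:
            break  # nothing new separated from the bulk
        candidates.extend(outside)
        for n in outside:
            del remaining[n]
        eps *= shrink  # tighten the neighbourhood radius for the next round
    return candidates


# Toy data (made up): 20 genes with equal expression in both strains,
# plus two genes whose expression clearly differs between the strains.
genes = {f"g{i}": (0.1 * i, 0.1 * i) for i in range(20)}
genes["g_up"] = (0.5, 3.0)
genes["g_down"] = (3.0, 0.5)

found = iterative_candidates(genes, eps=0.3, min_pts=3)
print(sorted(found))
```

Since the abstract stresses the choice of distance measure, the call to math.dist is the natural place to substitute another metric when experimenting with this sketch.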

Acknowledgement

The results presented in this paper are part of the GATE project. The project has received funding from the European Union’s Horizon 2020 WIDESPREAD-2018-2020 TEAMING Phase 2 programme under Grant Agreement No. 857155 and from the Operational Programme Science and Education for Smart Growth under Grant Agreement No. BG05M2OP001-1.003-0002-C01.

Correspondence to Olga Georgieva.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Georgieva, O. (2022). Iterative Clustering for Differential Gene Expression Analysis. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_33

  • DOI: https://doi.org/10.1007/978-3-031-07802-6_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07801-9

  • Online ISBN: 978-3-031-07802-6

  • eBook Packages: Computer Science (R0)
