Advancing ScRNA-Seq Data Integration via a Novel Gene Selection Method

  • Conference paper
Artificial Intelligence Applications and Innovations (AIAI 2024)

Abstract

Cancer remains a formidable challenge in medical research, spurring efforts to uncover its underlying mechanisms and advance precision medicine, which aims to tailor treatments to individuals’ genetic profiles. This study uses single-cell RNA sequencing (scRNA-seq), a next-generation sequencing technology, to examine the transcriptomes of individual cells across diverse populations. Our methodology characterizes gene expression patterns, improving our understanding of cellular heterogeneity and its implications for cancer pathogenesis. To address the ‘curse of dimensionality’ inherent in high-dimensional data, we introduce a machine learning-based feature selection approach. This technique frames gene selection as a multi-label classification problem, focusing on identifying genes critical for distinguishing between disease states and cell types. Our strategy also underscores the value of data integration in reinforcing the statistical robustness of scRNA-seq analyses. By integrating disparate scRNA-seq datasets, we mitigate batch effects, yielding more accurate and reliable insights and contributing to the advancement of precision medicine in oncology.
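
As a rough illustration of framing gene selection as a classification problem over several labels at once, the sketch below ranks genes by random-forest importance when disease state and cell type are predicted jointly. It is not the authors' method: the abstract does not name a classifier, so the use of scikit-learn's RandomForestClassifier, the helper name select_genes, the parameter n_top, and the synthetic count matrix are all assumptions made for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def select_genes(X, Y, gene_names, n_top=5, seed=0):
    """Rank genes by importance for jointly predicting every label column in Y.

    Hypothetical helper for illustration only. X is a cells-by-genes matrix,
    Y is a cells-by-labels matrix (here: disease state and cell type).
    """
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, Y)  # a 2-D Y makes this a multi-output classification fit
    # Indices of the n_top genes with the largest impurity-based importance.
    order = np.argsort(forest.feature_importances_)[::-1][:n_top]
    return [gene_names[i] for i in order]


# Synthetic toy data (an assumption, not real scRNA-seq counts).
rng = np.random.default_rng(0)
n_cells, n_genes = 300, 50
X = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)
disease = rng.integers(0, 2, size=n_cells)    # two disease states
cell_type = rng.integers(0, 3, size=n_cells)  # three cell types
X[:, 0] += 5.0 * disease      # gene_0 is made to track disease state
X[:, 1] += 5.0 * cell_type    # gene_1 is made to track cell type
Y = np.column_stack([disease, cell_type])
gene_names = [f"gene_{i}" for i in range(n_genes)]

selected = select_genes(X, Y, gene_names, n_top=5)
print(selected)
```

In a real analysis the selected genes would then be used to subset each dataset before integration; the choice of integration method is outside the scope of this sketch.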



Acknowledgement

The registration and publication costs for this work are funded by the Research Committee of the Ionian University, Special Account for Research Grants, project title: “Master’s program in Bioinformatics and Neuroinformatics”.

Author information

Corresponding author

Correspondence to Konstantinos Lazaros.

Copyright information

© 2024 IFIP International Federation for Information Processing

About this paper

Cite this paper

Lazaros, K., Exarchos, T., Maglogiannis, I., Vlamos, P., Vrahatis, A.G. (2024). Advancing ScRNA-Seq Data Integration via a Novel Gene Selection Method. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 711. Springer, Cham. https://doi.org/10.1007/978-3-031-63211-2_3

  • DOI: https://doi.org/10.1007/978-3-031-63211-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-63210-5

  • Online ISBN: 978-3-031-63211-2

  • eBook Packages: Computer Science, Computer Science (R0)
