Abstract
The Fourth Industrial Revolution has produced a vast number of innovative applications. These can be used in many areas to create wealth and improve human living standards. We focus first on medical problems, where better prediction is needed. In this study, we ask whether feature selection on metagenomic data can enhance Inflammatory Bowel Disease (IBD) prediction and so support early detection of related illnesses. Over the last few years, this prediction task has remained a challenge; because information is scarce and data are lacking, the problem has not received enough attention. To revisit the subject, we propose a new way of enhancing IBD prediction by using a Pearson correlation matrix on metagenomic data. Our aim is to find out whether predictions improve when a specific number of features is selected by the Pearson correlation coefficient. The results of the proposed method are promising: when we select highly correlated features, the model predicts better than when features are selected at random.
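The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that features (for example, species abundances) are ranked by the absolute value of their Pearson correlation with a binary disease label and that the top k are kept; the paper may use the correlation matrix differently. The function names, the synthetic data, and the value of k are hypothetical and are not taken from the paper.

```python
import numpy as np

def pearson_with_label(X, y):
    """Pearson correlation between each column of X and the label vector y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each feature
    yc = y - y.mean()                # center the label
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant features have zero variance; treat their correlation as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (Xc * yc[:, None]).sum(axis=0) / denom
    return np.nan_to_num(r)

def select_top_k(X, y, k):
    """Indices of the k features with the largest |Pearson r| to the label."""
    r = np.abs(pearson_with_label(X, y))
    return np.argsort(r)[::-1][:k]

def select_random_k(n_features, k, seed=0):
    """Baseline: k feature indices drawn uniformly at random."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_features, size=k, replace=False)

# Synthetic stand-in for an abundance table (assumption: 200 samples, 50 features).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)     # hypothetical labels: 0 = healthy, 1 = IBD
X = rng.random((200, 50))
X[:, 3] += y                         # feature 3 rises with the label
X[:, 7] -= y                         # feature 7 falls with the label

top = select_top_k(X, y, k=2)
baseline = select_random_k(X.shape[1], k=2)
```

To compare the two selections fairly, the correlations would normally be computed on the training portion of each cross-validation fold only, so that the held-out samples do not influence which features are chosen.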
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Luong, H.H. et al. (2022). Feature Selection Using Correlation Matrix on Metagenomic Data with Pearson Enhancing Inflammatory Bowel Disease Prediction. In: Ibrahim, R., K. Porkumaran, Kannan, R., Mohd Nor, N., S. Prabakar (eds) International Conference on Artificial Intelligence for Smart Community. Lecture Notes in Electrical Engineering, vol 758. Springer, Singapore. https://doi.org/10.1007/978-981-16-2183-3_102
DOI: https://doi.org/10.1007/978-981-16-2183-3_102
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2182-6
Online ISBN: 978-981-16-2183-3
eBook Packages: Computer Science (R0)