Abstract
Artificial intelligence (AI) and machine learning (ML) have emerged over the past decade as the technologies most widely expected to transform research and development. This expectation is fueled in part by major advances in computing technology and the accompanying disappearance of barriers to collecting massive amounts of data. Meanwhile, the cost of researching, testing, manufacturing, and distributing new pharmaceuticals has risen. In light of these challenges, the pharmaceutical industry is interested in AI/ML methods because of their automation, their predictive capability, and the anticipated gain in efficiency. The use of ML techniques in the pharmaceutical industry has matured over the past 15 years. Clinical trial design, management, and analysis are the most recent steps of the drug development process to benefit from AI and ML. As we move toward a world in which AI/ML is increasingly integrated into R&D, it is essential to sort through the associated jargon and hype. Equally crucial is the understanding that the scientific method remains relevant for drawing conclusions from evidence. With both in mind, we can better evaluate the potential benefits of AI/ML in the pharmaceutical industry and make well-informed decisions about how best to apply them. The purpose of this paper is to clarify certain fundamental ideas, give some examples of their application, and offer perspective on how best to apply AI/ML techniques to research and development.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Alam, S., Israr, J., Kumar, A. (2024). Artificial Intelligence and Machine Learning in Bioinformatics. In: Singh, V., Kumar, A. (eds) Advances in Bioinformatics. Springer, Singapore. https://doi.org/10.1007/978-981-99-8401-5_16
DOI: https://doi.org/10.1007/978-981-99-8401-5_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8400-8
Online ISBN: 978-981-99-8401-5
eBook Packages: Biomedical and Life Sciences (R0)