Abstract
The de novo design of scaffold-focused and target-specific molecular structures using deep learning generative modeling offers a promising route to the discovery of novel and potent bioactive drug compounds. Deep learning generative modeling shows the creativity that machine intelligence can bring to composing, painting, and even designing novel molecular structures from scratch. This chapter covers how generative chemistry can be applied effectively to the design and generation of scaffold-focused and target-specific small molecules. To introduce this emerging paradigm, the chapter starts with a brief history of artificial intelligence (AI) in drug discovery. Chemical databases, molecular representations, and cheminformatics-related tools are then covered as the supporting infrastructure. Two example applications are discussed, using generative adversarial networks (GAN) and recurrent neural networks (RNN) for de novo compound generation targeting the cannabinoid receptor 2 (CB2). A summary, challenges, and future perspectives follow.
Acknowledgments
The authors acknowledge funding support to the Xie Laboratory and CDAR Center from the NIH (R01DA052329, P30PDA035778A and R56AG074951).
Ethics declarations
The authors declare no conflict of interest.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Bian, Y., Hou, G., Xie, XQ. (2023). Artificial Intelligence Generative Chemistry Design of Target-Specific Scaffold-Focused Small Molecule Drug Libraries. In: Jagadeesh, G., Balakumar, P., Senatore, F. (eds) The Quintessence of Basic and Clinical Research and Scientific Publishing. Springer, Singapore. https://doi.org/10.1007/978-981-99-1284-1_31
DOI: https://doi.org/10.1007/978-981-99-1284-1_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1283-4
Online ISBN: 978-981-99-1284-1
eBook Packages: Biomedical and Life Sciences (R0)