Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data

From Nature Machine Intelligence

A preprint version of the article is available at bioRxiv.

Abstract

In drug discovery, determining the binding affinity and functional effects of small-molecule ligands on proteins is critical. Current computational methods can predict these protein–ligand interaction properties but often lose accuracy without high-resolution protein structures and falter in predicting functional effects. Here we introduce PSICHIC (PhySIcoCHemICal graph neural network), a framework incorporating physicochemical constraints to decode interaction fingerprints directly from sequence data alone. This enables PSICHIC to attain capabilities in decoding mechanisms underlying protein–ligand interactions, achieving state-of-the-art accuracy and interpretability. Trained on identical protein–ligand pairs without structural data, PSICHIC matched and even surpassed leading structure-based methods in binding-affinity prediction. In an experimental library screening for adenosine A1 receptor agonists, PSICHIC discerned functional effects effectively, ranking the sole novel agonist within the top three. PSICHIC’s interpretable fingerprints identified protein residues and ligand atoms involved in interactions, and helped in unveiling selectivity determinants of protein–ligand interaction. We foresee PSICHIC reshaping virtual screening and deepening our understanding of protein–ligand interactions.

Fig. 1: PSICHIC (PhySIcoCHemICal graph neural network).
Fig. 2: Generalizability of interaction fingerprints.
Fig. 3: Interpretability of interaction fingerprints.
Fig. 4: Virtual screening with interaction fingerprints.
Fig. 5: Selectivity profiling with interaction fingerprints.

Data availability

All raw and benchmark data resources used are publicly available. The raw data were obtained from the following databases: Protein Data Bank (https://www.rcsb.org; ref. 62), UniProt (https://www.uniprot.org; ref. 63), PDBBind (http://www.pdbbind.org.cn/; ref. 30), ExCAPE-ML (https://solr.ideaconsult.net/search/excape/; ref. 36) and Papyrus (refs. 37,64). All datasets used in this study for training and testing the models, including the manually curated protein–ligand functional effect dataset and the large-scale interaction dataset, are publicly available (ref. 65). Source data are provided with this paper.

Code availability

A GitHub repository containing the source code and data files for retraining and evaluating PSICHIC is available at https://github.com/huankoh/PSICHIC (ref. 66). The repository contains a user-friendly, open-source online platform for PSICHIC’s virtual screening application, integrated with Google Colaboratory for easy web-based interaction. The weights of the trained PSICHIC model are also available in the repository.

References

  1. Kitchen, D. B., Decornez, H., Furr, J. R. & Bajorath, J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discov. 3, 935–949 (2004).

  2. Hopkins, A. L. Predicting promiscuity. Nature 462, 167–168 (2009).

  3. Chen, L. et al. TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics 36, 4406–4414 (2020).

  4. Jiang, M. et al. Drug–target affinity prediction using graph neural network and contact maps. RSC Adv. 10, 20701–20712 (2020).

  5. Bagherian, M. et al. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief. Bioinform. 22, 247–269 (2021).

  6. Li, S. et al. Structure-aware interactive graph neural networks for the prediction of protein–ligand binding affinity. In Proc. 27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 975–985 (Association for Computing Machinery, 2021).

  7. Dhakal, A., McKay, C., Tanner, J. J. & Cheng, J. Artificial intelligence in the prediction of protein–ligand interactions: recent advances and future directions. Brief. Bioinform. 23, bbab476 (2022).

  8. Lu, W. et al. TANKBind: trigonometry-aware neural networks for drug–protein binding structure prediction. Adv. Neural Inf. Process. Syst. 35, 7236–7249 (2022).

  9. Bai, P., Miljković, F., John, B. & Lu, H. Interpretable bilinear attention network with domain adaptation improves drug–target prediction. Nat. Mach. Intell. 5, 126–136 (2023).

  10. Ng, H. W. et al. Competitive molecular docking approach for predicting estrogen receptor subtype α agonists and antagonists. BMC Bioinf. 15, S4 (2014).

  11. Rodríguez, D., Gao, Z.-G., Moss, S. M., Jacobson, K. A. & Carlsson, J. Molecular docking screening using agonist-bound GPCR structures: probing the A2A adenosine receptor. J. Chem. Inf. Model. 55, 550–563 (2015).

  12. Kooistra, A. J., Leurs, R., de Esch, I. J. P. & de Graaf, C. Structure-based prediction of G-protein-coupled receptor ligand function: a β-adrenoceptor case study. J. Chem. Inf. Model. 55, 1045–1061 (2015).

  13. Cai, T., Abbu, K. A., Liu, Y. & Xie, L. DeepREAL: a deep learning powered multi-scale modeling framework for predicting out-of-distribution ligand-induced GPCR activity. Bioinformatics 38, 2561–2570 (2022).

  14. Michel, M., Menéndez Hurtado, D. & Elofsson, A. PconsC4: fast, accurate and hassle-free contact predictions. Bioinformatics 35, 2677–2679 (2018).

  15. Rao, R., Meier, J., Sercu, T., Ovchinnikov, S. & Rives, A. Transformer protein language models are unsupervised structure learners. In Proc. 8th International Conference on Learning Representations (ICLR, 2020).

  16. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  17. Jiang, M. et al. Sequence-based drug-target affinity prediction using weighted graph neural networks. BMC Genomics 23, 449 (2022).

  18. Wang, P. et al. Structure-aware multimodal deep learning for drug–protein interaction prediction. J. Chem. Inf. Model. 62, 1308–1317 (2022).

  19. Gainza, P. et al. Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat. Methods 17, 184–192 (2020).

  20. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  21. Wong, F. et al. Benchmarking AlphaFold‐enabled molecular docking predictions for antibiotic discovery. Mol. Syst. Biol. 18, e11081 (2022).

  22. He, X. et al. AlphaFold2 versus experimental structures: evaluation on G protein-coupled receptors. Acta Pharmacol. Sin. 44, 1–7 (2023).

  23. Nguyen, T. et al. GraphDTA: predicting drug–target binding affinity with graph neural networks. Bioinformatics 37, 1140–1147 (2021).

  24. Corso, G., Stärk, H., Jing, B., Barzilay, R. & Jaakkola, T. S. DiffDock: diffusion steps, twists, and turns for molecular docking. In Proc. 11th International Conference on Learning Representations (ICLR, 2023).

  25. Somnath, V. R., Bunne, C. & Krause, A. Multi-scale representation learning on proteins. Adv. Neural Inf. Process. Syst. 34, 25244–25255 (2021).

  26. Corso, G., Cavalleri, L., Beaini, D., Liò, P. & Veličković, P. Principal neighbourhood aggregation for graph nets. Adv. Neural Inf. Process. Syst. 33, 13260–13271 (2020).

  27. Rarey, M. & Dixon, J. S. Feature trees: a new molecular similarity measure based on tree matching. J. Comput. Aided Mol. Des. 12, 471–490 (1998).

  28. Jin, W., Barzilay, R. & Jaakkola, T. Junction tree variational autoencoder for molecular graph generation. In Proc. 35th International Conference on Machine Learning 2323–2332 (PMLR, 2018).

  29. Bianchi, F. M., Grattarola, D. & Alippi, C. Spectral clustering with graph neural networks for graph pooling. In Proc. 37th International Conference on Machine Learning 874–883 (PMLR, 2020).

  30. Su, M. et al. Comparative assessment of scoring functions: the CASF-2016 update. J. Chem. Inf. Model. 59, 895–913 (2019).

  31. Stärk, H., Ganea, O., Pattanaik, L., Barzilay, D. R. & Jaakkola, T. EquiBind: geometric deep learning for drug binding structure prediction. In Proc. 39th International Conference on Machine Learning 20503–20521 (PMLR, 2022).

  32. Huang, K., Xiao, C., Glass, L. M. & Sun, J. MolTrans: molecular interaction transformer for drug–target interaction prediction. Bioinformatics 37, 830–836 (2021).

  33. Zitnik, M., Sosič, R., Maheshwari, S. & Leskovec, J. BioSNAP Datasets: Stanford Biomedical Network Dataset Collection (Stanford Univ., 2018); https://snap.stanford.edu/biodata

  34. Liu, T., Lin, Y., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res. 35, D198–D201 (2007).

  35. Liu, Z. et al. PDB-wide collection of binding data: current status of the PDBbind database. Bioinformatics 31, 405–412 (2015).

  36. Sun, J. et al. ExCAPE-DB: an integrated large scale dataset facilitating big data analysis in chemogenomics. J. Cheminform. 9, 17 (2017).

  37. Béquignon, O. J. M. et al. Papyrus: a large-scale curated dataset aimed at bioactivity predictions. J. Cheminform. 15, 3 (2023).

  38. Cortellis Drug Discovery Intelligence (Clarivate, 2023); https://www.cortellis.com/drugdiscovery/

  39. Lin, H. et al. Discovery of potent and selective covalent protein arginine methyltransferase 5 (PRMT5) inhibitors. ACS Med. Chem. Lett. 10, 1033–1038 (2019).

  40. Rusere, L. N. et al. HIV-1 protease inhibitors incorporating stereochemically defined P2′ ligands to optimize hydrogen bonding in the substrate envelope. J. Med. Chem. 62, 8062–8079 (2019).

  41. Yilmaz, N. K., Swanstrom, R. & Schiffer, C. A. Improving viral protease inhibitors to counter drug resistance. Trends Microbiol. 24, 547–557 (2016).

  42. Draper-Joyce, C. J. et al. Structure of the adenosine-bound human adenosine A1 receptor–Gi complex. Nature 558, 559–563 (2018).

  43. Mendez, D. et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940 (2019).

  44. Bento, A. P. et al. An open source chemical structure curation pipeline using RDKit. J. Cheminform. 12, 51 (2020).

  45. Nguyen, A. T. N. et al. Extracellular loop 2 of the adenosine A1 receptor has a key role in orthosteric ligand affinity and agonist efficacy. Mol. Pharmacol. 90, 703–714 (2016).

  46. Roth, B. L., Sheffler, D. J. & Kroeze, W. K. Magic shotguns versus magic bullets: selectively non-selective drugs for mood disorders and schizophrenia. Nat. Rev. Drug Discov. 3, 353–359 (2004).

  47. Harding, S. D. et al. The IUPHAR/BPS Guide to PHARMACOLOGY in 2024. Nucleic Acids Res. 52, D1438–D1449 (2024).

  48. Jacobson, K. A. & Gao, Z.-G. Adenosine receptors as therapeutic targets. Nat. Rev. Drug Discov. 5, 247–264 (2006).

  49. Perreira, M. et al. “Reversine” and its 2-substituted adenine derivatives as potent and selective A3 adenosine receptor antagonists. J. Med. Chem. 48, 4910–4918 (2005).

  50. Glukhova, A. et al. Structure of the adenosine A1 receptor reveals the basis for subtype selectivity. Cell 168, 867–877.e13 (2017).

  51. Deng, Z., Chuaqui, C. & Singh, J. Structural Interaction Fingerprint (SIFt): a novel method for analyzing three-dimensional protein−ligand binding interactions. J. Med. Chem. 47, 337–344 (2004).

  52. Thal, D. M. et al. Recent advances in the determination of G protein-coupled receptor structures. Curr. Opin. Struct. Biol. 51, 28–34 (2018).

  53. Draper-Joyce, C. J. et al. Positive allosteric mechanisms of adenosine A1 receptor-mediated analgesia. Nature 597, 571–576 (2021).

  54. Jeffrey Conn, P., Christopoulos, A. & Lindsley, C. W. Allosteric modulators of GPCRs: a novel approach for the treatment of CNS disorders. Nat. Rev. Drug Discov. 8, 41–54 (2009).

  55. Freitas, R. F. de & Schapira, M. A systematic analysis of atomic protein–ligand interactions in the PDB. MedChemComm 8, 1970–1981 (2017).

  56. Krivák, R. & Hoksza, D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J. Cheminform. 10, 39 (2018).

  57. Cai, T. et al. GraphNorm: a principled approach to accelerating graph neural network training. In Proc. 38th International Conference on Machine Learning 1204–1215 (PMLR, 2021).

  58. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (ICLR, 2015).

  59. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).

  60. Khazanov, N. A. & Carlson, H. A. Exploring the composition of protein–ligand binding sites on a large scale. PLoS Comput. Biol. 9, e1003321 (2013).

  61. Baltos, J.-A. et al. Quantification of adenosine A1 receptor biased agonism: implications for drug discovery. Biochem. Pharmacol. 99, 101–112 (2016).

  62. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  63. The UniProt Consortium. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).

  64. Béquignon, O. J. M. et al. Accompanying data - Papyrus - a large scale curated dataset aimed at bioactivity predictions. Zenodo https://doi.org/10.5281/zenodo.10943207 (2024).

  65. Koh, H. Y., Nguyen, A. T. N., Pan, S., May, L. T. & Webb, G. I. Datasets for “Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data”. Zenodo https://doi.org/10.5281/zenodo.10901712 (2024).

  66. Koh, H. Y. huankoh/PSICHIC: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.10901685 (2024).

  67. Stepniewska-Dziubinska, M. M., Zielenkiewicz, P. & Siedlecki, P. Development and evaluation of a deep learning model for protein–ligand binding affinity prediction. Bioinformatics 34, 3666–3674 (2018).

  68. Zheng, L., Fan, J. & Mu, Y. OnionNet: a multiple-layer intermolecular-contact-based convolutional neural network for protein–ligand binding affinity prediction. ACS Omega 4, 15956–15965 (2019).

  69. Jiang, D. et al. InteractionGraphNet: a novel and efficient deep graph representation learning framework for accurate protein–ligand interaction predictions. J. Med. Chem. 64, 18209–18232 (2021).

  70. Koes, D. R., Baumgartner, M. P. & Camacho, C. J. Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J. Chem. Inf. Model. 53, 1893–1904 (2013).

  71. McNutt, A. T. et al. GNINA 1.0: molecular docking with deep learning. J. Cheminform. 13, 43 (2021).

  72. Sverrisson, F., Feydy, J., Correia, B. E. & Bronstein, M. M. Fast end-to-end learning on protein surfaces. In Proc. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 15272–15281 (IEEE, 2021).

  73. Roy, K. et al. Some case studies on application of “rm2” metrics for judging quality of quantitative structure–activity relationship predictions: emphasis on scaling of response data. J. Comput. Chem. 34, 1071–1082 (2013).

  74. McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).

  75. Adasme, M. F. et al. PLIP 2021: expanding the scope of the protein–ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 49, W530–W534 (2021).

Acknowledgements

Research on adenosine receptor signalling was supported by a National Heart Foundation Future Leader Fellowship (101857 to L.T.M.), National Health and Medical Research Council (NHMRC) of Australia Ideas grant (APP2013629 to L.T.M., G.I.W. and A.T.N.N.) and a Department of Health and Aged Care (MRFF) Stem Cell Therapies Mission grant (MRF2015957 to L.T.M. and A.T.N.N.). H.Y.K.’s scholarship is supported by the Australian Government Research Training Program (RTP) Scholarship and the Australian Research Council under grant ARC DP210100072. High-throughput screening was performed at the National Drug Discovery Centre, WEHI, Parkville, Australia, with support from the Australian Government Medical Research Future Fund (MRFF). Our acknowledgement extends to Cortellis Drug Discovery Intelligence for granting public access to the curated functional effect dataset, and to BioRender for the display elements used in our figures, which were created using BioRender.com. Special thanks to Monash Institute of Pharmaceutical Sciences (MIPS) Monash University for access to the MIPS library, in particular P. Sexton and A. Christopoulos for purchase of the MIPS library and to J. Baell for the design of the library. We thank C. S. Lu for assistance with pharmacological evaluation. Computational resources were generously provided by the Nectar Research Cloud, a collaborative Australian research platform supported by the NCRIS-funded Australian Research Data Commons (ARDC) and the MASSIVE HPC facility. We extend our sincere gratitude to B. K. Koh, H. J. W. Koh and Y. Li for their invaluable feedback on paper writing and figures.

Author information

Contributions

H.Y.K. designed and developed the PSICHIC method, evaluated PSICHIC against leading methods, applied it to virtually screen the MIPS library for novel A1R agonists and applied it for selectivity profiling of AR subtypes. A.T.N.N. performed the pharmacological validation of novel A1R agonists and conducted data analysis. H.Y.K. and A.T.N.N. prepared figures and wrote the paper. A.T.N.N., S.P., L.T.M. and G.I.W. supervised the project. Project design, data interpretation and paper preparation were performed by all authors.

Corresponding authors

Correspondence to Anh T. N. Nguyen, Shirui Pan, Lauren T. May or Geoffrey I. Webb.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Haiping Lu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Schematic Diagram of PSICHIC Architecture.

The figure should be read from top to bottom. Initially, ligand atom and protein residue graphs pass through a single-hidden-layer network before entering three physicochemical graph convolutional layers. Within each layer, intramolecular interactions (represented by yellow blocks) are modeled using two independent PNA-GNNs (ref. 26). PSICHIC then imposes physicochemical constraints (light red blocks) by pooling ligand atoms and protein residues using junction tree decomposition (ref. 28) and minCUT clustering (ref. 29), respectively. PSICHIC models intermolecular interactions (white blocks) in three steps: first, it aggregates ligand functional groups into a “ligand ball” using attentional aggregation; second, it models the interaction strengths between this ligand ball and protein regions through cross-attention message passing, where features are weighted and transferred between the ligand and protein; third, PSICHIC unpools the functional groups and clustered regions back into updated ligand atoms and protein residues. Finally, PSICHIC generates an interaction fingerprint that weighs atoms and residues based on importance scores from intermolecular interactions. This fingerprint feeds into a single-hidden-layer network for predicting interaction properties. The figure can be read in conjunction with the Methods section.
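The first two intermolecular steps in this caption (attentional aggregation into a “ligand ball”, then attention between that ball and protein regions) can be illustrated with a toy sketch. It is not PSICHIC's implementation, which lives in the authors' repository: the gate vector, the scaled dot-product scoring and all function names below are assumptions chosen for illustration, and a real model would learn these parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attentional_aggregate(group_feats, gate):
    """Pool functional-group vectors into a single 'ligand ball' vector.

    `gate` is a hypothetical scoring vector standing in for a learned gate.
    """
    weights = softmax([dot(f, gate) for f in group_feats])
    dim = len(group_feats[0])
    ball = [sum(w * f[d] for w, f in zip(weights, group_feats))
            for d in range(dim)]
    return ball, weights

def cross_attention_weights(ball, region_feats):
    """Weight each protein region by scaled dot-product similarity to the ball."""
    scale = math.sqrt(len(ball))
    return softmax([dot(ball, r) / scale for r in region_feats])

# Toy inputs: three ligand functional groups and two protein regions.
groups = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gate = [0.5, 0.5]
ball, group_w = attentional_aggregate(groups, gate)
regions = [[2.0, 2.0], [0.0, 0.0]]
region_w = cross_attention_weights(ball, regions)
```

In this toy setup the third functional group receives the largest pooling weight, and the first protein region receives the larger attention weight because it aligns with the pooled ligand vector.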

Extended Data Fig. 2 Robustness and Ablation Study of PSICHIC.

a, b, Line plots with best-fit linear regression lines and 95% confidence error bands from bootstrapping (points represent data points). a, PSICHIC Absolute Error against ESM2 contact map prediction performance (ROC-AUC) lacks significant correlation relative to the 0.05 p-value threshold (\(P=0.452\), two-sided Pearson’s correlation, \(n=653\)). b, PSICHIC Absolute Error vs. Lipophilic Ligand Efficiency (LLE) also lacks significance relative to the 0.05 p-value threshold (\(P=0.275\), two-sided Pearson’s correlation, \(n=653\)). c, Bar graph comparing average PSICHIC Absolute Error for ligands that violate versus obey Lipinski’s rule of five (RO5). No significant difference between the two ligand types relative to the 0.05 p-value threshold (\(P=0.871\), two-sided independent t-test; \(n=653\); bar, mean; error bars, one SEM; points, data points). d, Ablation study of PSICHIC Architecture (\(n=5\); bar, mean; error bars, one SEM; points, data points). Discarding PSICHIC’s physicochemical constraints (w/o Constraints) reduced performance, mirroring the effects seen without interaction modeling (w/o Interaction), highlighting the importance of injecting physicochemical constraints. Omitting the Importance Score mechanism (w/o Importance Score) reduced performance. Without ESM2 embeddings (w/o ESM2 embedding), performance slightly lags behind TankBind (ref. 8), which utilizes its own protein model embeddings. The two settings are not directly comparable, and the comparison favors TankBind. e, Line plots for the relationship between PSICHIC’s predicted affinity and confidence in PSICHIC residue importance scores (left), and the correlation of PSICHIC residue importance scores with residues’ binding site proximity (right), with points indicating mean and shaded error band indicating one SEM. PSICHIC with physicochemical constraints (red) and without (grey) showed constraints were central to learning interaction patterns that adhere to physicochemical principles. Without constraints, PSICHIC exhibits stronger confidence in residue importance scores (left). This confidence is misplaced as the scores did not correlate with residues’ proximity to the binding site (right), suggesting that sequence-based methods could overfit the data if constraints are not incorporated.

Extended Data Fig. 3 Multi-Comparison Matrix for Benchmarking Sequence-based Methods.

The matrix displays pairwise comparisons among PSICHIC, DrugBAN, STAMP-DPI, TransCPI, GraphDTA, WGNN-DTA, MolTrans, and DGraphDTA across 9 drug discovery settings. The settings are created by evaluating each method under random split, unseen ligand scaffold split, and unseen protein target split on three sequence-only benchmark datasets: Human (ref. 32), BioSNAP (ref. 33), and BindingDB (ref. 34). Performance is assessed using the Area Under the Receiver Operating Characteristic Curve (ROC-AUC). The colors on the heat map signify the mean differences in ROC-AUC performance. A positive difference, indicated in red, signifies that the method in the row outperforms the method in the column on average. Each cell contains three lines detailing performance metrics: the first line shows the average difference in ROC-AUC between the method in the row and the method in the column; the second line shows the number of wins/ties/losses across the drug discovery settings; and the third line indicates the exact p-value, determined through the two-sided Wilcoxon Signed Rank Test (\(n=9\) for all models). Text within each cell is presented in bold if the p-value is lower than 0.05.
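The first two lines of each matrix cell can be reproduced with a few lines of code. The sketch below uses made-up ROC-AUC values (not results from the paper) and a hypothetical `compare` helper; the third line of a cell, the p-value, would come from a Wilcoxon signed-rank routine such as `scipy.stats.wilcoxon`, which is left out here to keep the example dependency-free.

```python
def compare(row_scores, col_scores, tol=1e-12):
    """Return the mean paired difference and (wins, ties, losses) for the row method."""
    if len(row_scores) != len(col_scores):
        raise ValueError("both methods need one score per setting")
    diffs = [r - c for r, c in zip(row_scores, col_scores)]
    mean_diff = sum(diffs) / len(diffs)
    wins = sum(d > tol for d in diffs)      # row method better
    losses = sum(d < -tol for d in diffs)   # column method better
    ties = len(diffs) - wins - losses
    return mean_diff, (wins, ties, losses)

# Illustrative ROC-AUC values for nine settings (3 datasets x 3 splits).
method_a = [0.90, 0.85, 0.80, 0.92, 0.88, 0.75, 0.91, 0.83, 0.79]
method_b = [0.88, 0.85, 0.82, 0.90, 0.86, 0.70, 0.89, 0.80, 0.79]

mean_diff, wins_ties_losses = compare(method_a, method_b)
```

A positive `mean_diff` corresponds to a red cell for the row method in the heat map.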

Extended Data Fig. 4 High-Quality Protein-Ligand Functional Effect Dataset Curation.

PSICHIC, a data-driven framework, can predict various interaction properties after training on labeled sequence datasets. To enable PSICHIC to predict protein-ligand functional effects, we curated data from reputable databases, specifically Cortellis Drug Discovery (ref. 38), ExCAPE-ML (ref. 36), and Papyrus (ref. 37). We gathered samples from Cortellis on 2 February 2023, focusing on the proteins categorized as “Receptor” in the database with over 20 samples (Source: Cortellis Drug Discovery Intelligence, 2 February 2023, https://www.cortellis.com/drugdiscovery/, ® 2023 Clarivate. All rights reserved.). This resulted in 22,085 agonists and 17,211 antagonists. Due to the absence of negative data, we adopted a rigorous approach to include decoys. From ExCAPE-ML (ref. 36) and Papyrus (ref. 37), we selected protein-ligand pairs with pXC50 or pKD/pKI values below 5, focusing on high-quality data in the database. After standardizing the molecular data using the ChEMBL pipeline (ref. 44), we generated a dataset comprising 160,910 unique protein-ligand pairs: 22,085 agonists, 17,211 antagonists, and 121,614 non-binders, which includes 131 unique protein receptors and 128,122 unique ligands. The dataset also contains assay-dependent potency values (EC50 for agonists and IC50 for antagonists), which were omitted for model training but used for the UMAP plot in Fig. 2. The curated dataset is publicly available, with detailed methodology in Supplementary Material 14.

Extended Data Fig. 5 PSICHIC Pharmacophore: A Case Study with Ligands of the Galectin-3 Protein Target.

a, The line plot illustrates relationships based on the predicted binding affinity by PSICHIC: (1) an orange line shows the relationship between affinity (x-axis) and confidence in residue importance scores (y-axis), the latter being measured by the Gini coefficient, (2) a red line depicts the relationship between affinity (x-axis) and the correlation of PSICHIC residue importance scores with residues’ proximity to the binding site (y-axis). The error band (shaded region) of line plots indicates one standard error of the mean (SEM). b, Complex structures of Galectin-3 with low (PDB ID: 6QLS) and high (PDB ID: 6I76) predicted binding affinities are displayed. A darker red hue on the structures signifies higher PSICHIC importance scores. For low-affinity 6QLS, scores are dispersed—sometimes extending 180 degrees away from the binding site—yet key binding residues are highlighted. Conversely, scores for high-affinity 6I76 are primarily focused on the binding sites. c, PSICHIC’s ligand atom importance scores spotlight key functional groups, especially fluorine atoms, in their interaction with Galectin-3. The 14 binding ligands, 11 of which share the same scaffold (*), are ordered based on PSICHIC-predicted affinities. The emphasis of PSICHIC on the fluorine functional groups aligns with the original studies, where fluorine plays a critical role in forming fluorine–amide interactions with Galectin-3’s binding site. d, As a control, PSICHIC did not universally prioritize fluorine atoms when the fluorine functional groups do not form important interactions with the target protein. Further details are provided in Supplementary Methods 17.
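Panel a uses the Gini coefficient as a measure of how concentrated the residue importance scores are. The caption does not spell out the formula, so the sketch below assumes the standard definition for non-negative values; the function name is hypothetical.

```python
def gini(scores):
    """Gini coefficient of non-negative scores.

    Returns 0.0 when all scores are equal and approaches 1.0 as the
    total mass concentrates on a single element.
    """
    xs = sorted(scores)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

dispersed = gini([1.0, 1.0, 1.0, 1.0])   # importance spread evenly
focused = gini([0.0, 0.0, 0.0, 1.0])     # importance on one residue
```

Evenly spread scores give 0, while putting all importance on one of four residues gives 0.75, which matches the reading of higher values as more focused, "confident" importance scores.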

Extended Data Fig. 6 PSICHICXL Development using Large-scale (XL) Interaction Dataset.

a, Pipeline for constructing Large-scale (XL) Interaction Dataset. a-I., From the Protein-Ligand Functional Effect dataset we curated, samples comprising a protein, ligand, and functional label (agonist, antagonist, or non-binder) were extracted. a-II., From ExCAPE-ML (ref. 36), protein-ligand pairs with pXC50 values below 5 were labeled as non-binders and above 7 as binders. a-III., From Papyrus (ref. 37), we selected pairs with high-quality binding affinity, labeling those with pKD/pKI below 5 as non-binders and above 7 as binders. a-IV., These databases were standardized and normalized using the ChEMBL pipeline (ref. 44) and combined to form a large-scale dataset of approximately 3 million unique protein-ligand pairs, labeled with either binding affinity or functional effect. The final dataset contains 618,247 fully labeled and 2,341,057 partially labeled pairs, encompassing 5,107 unique proteins and 1,084,834 unique ligands. b, Multi-task optimization of PSICHICXL on Large-scale (XL) Interaction Dataset. Schematic illustrating the multi-objective loss function for handling partially labeled protein-ligand pairs in the Large-scale (XL) Interaction Dataset. Two scenarios of partial annotation were addressed: (i) when only the functional effect class (agonist, antagonist, or non-binder) is known, only the cross-entropy loss is calculated; (ii) when only binding affinity is known and ligand is a binder (that is, the functional effect of the ligand on its protein target is not known), the cross-entropy loss function is minimized as \(-\log(p \ne \text{non-binder})\) or \(-(\log(p_{\text{agonist}}) + \log(p_{\text{antagonist}}))\), together with the mean-squared-error loss applied to the binding affinity predictions. Refer to Methods and Supplementary Methods 19 for full details on training PSICHICXL on the Large-scale (XL) Interaction Dataset.
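The two partial-annotation scenarios in panel b can be sketched as a per-sample loss. This is an illustration under stated assumptions, not the authors' training code: it implements the first of the two forms given in the caption, \(-\log(p \ne \text{non-binder})\), computed as the negative log of the summed agonist and antagonist probabilities; the equal weighting of the classification and regression terms, the `is_binder` flag and the function names are all hypothetical.

```python
import math

CLASSES = ("agonist", "antagonist", "non-binder")

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_loss(logits, pred_affinity, label=None, affinity=None, is_binder=False):
    """Loss for one protein-ligand pair with possibly missing labels.

    label:     functional effect class if known, else None
    affinity:  measured binding affinity if known, else None
    is_binder: True when the pair is a known binder of unknown functional effect
    """
    probs = softmax(logits)
    loss = 0.0
    if label is not None:
        # Scenario (i): ordinary cross-entropy on the known class.
        loss += -math.log(probs[CLASSES.index(label)])
    elif is_binder:
        # Scenario (ii): only require that the pair is not a non-binder.
        loss += -math.log(probs[0] + probs[1])
    if affinity is not None:
        # Squared error on the affinity prediction (assumed unit weight).
        loss += (pred_affinity - affinity) ** 2
    return loss

# Uniform class probabilities (all logits equal) make the values easy to check.
only_class = sample_loss([0.0, 0.0, 0.0], 6.0, label="agonist")
only_affinity = sample_loss([0.0, 0.0, 0.0], 6.0, affinity=7.0, is_binder=True)
```

With uniform probabilities, the first call reduces to \(\log 3\) and the second to \(\log 1.5 + 1\), which makes it easy to see that the binder-only term penalizes less than a fully specified class label.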

Extended Data Fig. 7 PSICHICXL: Extensive Exposure to Broad Spectrum of Protein Families and Types.

a, Challenges in Training PSICHICXL on the Large-scale Interaction Dataset: Training with random sampling presented an overfitting risk due to uneven data distribution. 50 out of 5,107 proteins (about 1%) accounted for over half of the dataset (a-I.), with many being non-binders (a-II.). A few proteins comprised a large portion of the data, increasing the likelihood of being sampled (a-III.; red line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers; \(n=5{,}107\)). b, Protein Representation (pie chart - outer: family; inner: proteins within families): The data show diversity across protein families (ChEMBL level 2). However, some proteins are disproportionately represented within the families; for example, 2 proteins (UniProt IDs: Q03431 and P43220) comprise 86% of ‘GPCR: Others’, causing an imbalance. c, Optimizing Sampling for PSICHICXL: A good training data landscape for PSICHICXL should balance distribution both across and within protein families, as depicted in the pie chart. d, This was established through a three-step process. First, we categorized each protein-ligand interaction as either a binder or non-binder (for example, ProteinA_Binder to ProteinZ_Non-Binder). Second, we weighted groups by the square root of the unique scaffold number, with a 90th percentile cap to prevent overrepresentation. Third, we probabilistically selected a group, then a unique scaffold within it, followed by a ligand with that scaffold, ensuring diversely represented protein-ligand interactions. The training distribution is depicted in pie charts (c, d-I). Agonists and antagonists were slightly underrepresented (d-II). Hence, PSICHICXL can be further fine-tuned using a subset of the large-scale interaction dataset. Nonetheless, given PSICHICXL’s training on a wide range of proteins and ligands (c, d), PSICHICXL should be highly effective for most proteins without additional fine-tuning, and is recommended for general use (d-III; red line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers; \(n=5{,}107\)).
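The three-step sampling in panel d can be sketched as follows. The data layout (a dictionary mapping each group name to its scaffolds and their ligands), the nearest-rank percentile and the function names are assumptions made for this illustration; the caption does not say how the percentile is computed or how ties are handled.

```python
import math
import random

def percentile(values, q):
    """Nearest-rank percentile (one of several common definitions)."""
    xs = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(xs)))
    return xs[rank - 1]

def group_weights(groups, cap_percentile=90):
    """Weight each group by sqrt(number of unique scaffolds), capped at a percentile."""
    raw = {name: math.sqrt(len(scaffolds)) for name, scaffolds in groups.items()}
    cap = percentile(raw.values(), cap_percentile)
    return {name: min(w, cap) for name, w in raw.items()}

def sample_pair(groups, weights, rng):
    """Pick a group by weight, then a scaffold uniformly, then a ligand uniformly."""
    names = list(groups)
    group = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    scaffold = rng.choice(sorted(groups[group]))
    ligand = rng.choice(groups[group][scaffold])
    return group, scaffold, ligand

# Toy data: nine small groups (4 scaffolds each) and one dominant group (100 scaffolds).
groups = {
    f"Protein{i}_Binder": {f"s{i}_{j}": [f"lig{i}_{j}"] for j in range(4)}
    for i in range(9)
}
groups["ProteinZ_Non-Binder"] = {
    f"z{j}": [f"ligZ_{j}a", f"ligZ_{j}b"] for j in range(100)
}

weights = group_weights(groups)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
group, scaffold, ligand = sample_pair(groups, weights, rng)
```

In the toy data the dominant group would have an uncapped weight of 10, but the cap brings it down to 2, the same as every other group, so it is no more likely to be drawn than any single small group.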

Extended Data Table 1 Dataset Statistics, Splits, and Interaction Properties for Benchmark and Application Datasets
Extended Data Table 2 PDBBind Binding Affinity Predictions: Comprehensive Statistical Analysis with Mean ± Standard Deviation (Average Test Set Affinity: PDBBind v2016 – 6.450; PDBBind v2020 – 6.492)
Extended Data Table 3 Table of Notation for PSICHIC Architecture

Supplementary information

Supplementary Information

Supplementary Methods 1–24, Figs. 1–12 and Tables 1–12.

Source data

Source Data Fig. 2

Source data for plotting the figure.

Source Data Fig. 3

Source data for plotting the figure.

Source Data Fig. 4

Source data for plotting the figure.

Source Data Fig. 5

Source data for plotting the figure.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Koh, H.Y., Nguyen, A.T.N., Pan, S. et al. Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data. Nat Mach Intell 6, 673–687 (2024). https://doi.org/10.1038/s42256-024-00847-1

  • DOI: https://doi.org/10.1038/s42256-024-00847-1

  • Springer Nature Limited