CoProver: A Recommender System for Proof Construction

Yeh, Eric; Hitaj, Briland; Owre, Sam; Quemener, Maena; Shankar, Natarajan

doi:10.1007/978-3-031-42753-4_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14101)

Included in the following conference series:

International Conference on Intelligent Computer Mathematics

Abstract

Interactive Theorem Provers (ITPs) are an indispensable tool in the arsenal of formal methods experts, serving as a platform for the construction and formal verification of proofs. The complexity of the proofs, combined with the level of expertise typically required for the process to succeed, can hinder the adoption of ITPs. A recent line of work has investigated incorporating machine learning models trained on ITP user activity traces as a viable path towards full automation. While this is a valuable line of investigation, many problems still require human supervision to be completed fully, so applying learning methods to assist the user with useful recommendations can prove more fruitful. In this vein of user assistance, we introduce CoProver, a proof recommender system based on transformers that learns from past actions during proof construction while exploring knowledge stored in the ITP concerning previous proofs. CoProver employs a neurally learnt sequence-based encoding of sequents, capturing long-distance relationships between terms and hidden cues therein. We couple CoProver with the Prototype Verification System (PVS) and evaluate its performance on two key tasks: (1) Next Proof Action Recommendation and (2) Relevant Lemma Retrieval given a library of theories. We evaluate CoProver on a series of well-established metrics originating from the recommender system and information retrieval communities, respectively. We show that CoProver outperforms the prior state of the art applied to recommendation in the domain. We conclude by discussing future directions for CoProver (and similar approaches), such as argument prediction and proof summarization, among others.
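The abstract does not name the specific metrics used. As a rough illustration only, the sketch below computes two measures that are standard in the recommender system and information retrieval communities: top-k accuracy, which would suit a next-action recommendation task, and mean reciprocal rank (MRR), which would suit a lemma retrieval task. The function names, the sample rankings, and the lemma names are all hypothetical, and the choice of these two metrics is an assumption rather than the paper's reported setup.

```python
from typing import Sequence, Set


def top_k_accuracy(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of cases whose gold item appears among the first k recommendations."""
    hits = sum(1 for recs, g in zip(ranked, gold) if g in recs[:k])
    return hits / len(gold)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], relevant: Sequence[Set[str]]) -> float:
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    total = 0.0
    for recs, rel in zip(ranked, relevant):
        for rank, item in enumerate(recs, start=1):
            if item in rel:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Hypothetical ranked proof-command recommendations for three proof states.
action_rankings = [
    ["grind", "assert", "inst"],
    ["skosimp", "expand", "grind"],
    ["assert", "inst", "expand"],
]
# Hypothetical commands the user actually chose in each state.
gold_actions = ["assert", "skosimp", "grind"]

# Hypothetical ranked lemma lists for two queries, with their relevant lemmas.
lemma_rankings = [
    ["lemma_a", "lemma_b", "lemma_c"],
    ["lemma_d", "lemma_e", "lemma_f"],
]
relevant_lemmas = [{"lemma_b"}, {"lemma_d", "lemma_f"}]

acc_at_1 = top_k_accuracy(action_rankings, gold_actions, k=1)
acc_at_3 = top_k_accuracy(action_rankings, gold_actions, k=3)
mrr = mean_reciprocal_rank(lemma_rankings, relevant_lemmas)
print(f"acc@1={acc_at_1:.3f} acc@3={acc_at_3:.3f} MRR={mrr:.3f}")
```

Top-k accuracy rewards a recommender whenever the correct action is anywhere in its shortlist, which matches how a user would scan a few suggestions, while MRR additionally rewards placing the first useful lemma near the top of the list.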

Notes

1. https://shemesh.larc.nasa.gov/fm/pvs/PVS-library/
2. A step is a single forward-predict pass over a training instance; multiple steps can be performed over the same data during the training phase.
3. Initial experiments with larger samples showed no difference in performance compared with a system trained on the smaller set.
4. https://www.pytorchlightning.ai/
5. https://scikit-learn.org/

Acknowledgement

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR00112290064 and by the National Institute of Aeronautics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government or DARPA.

Author information

Authors and Affiliations

SRI International, Menlo Park, CA, 94025, USA
Eric Yeh, Briland Hitaj, Sam Owre, Maena Quemener & Natarajan Shankar

Corresponding author

Correspondence to Eric Yeh.

Editor information

Editors and Affiliations

ENSIIE, Evry, France
Catherine Dubois
University of Birmingham, Birmingham, UK
Manfred Kerber

About this paper

Cite this paper

Yeh, E., Hitaj, B., Owre, S., Quemener, M., Shankar, N. (2023). CoProver: A Recommender System for Proof Construction. In: Dubois, C., Kerber, M. (eds) Intelligent Computer Mathematics. CICM 2023. Lecture Notes in Computer Science, vol 14101. Springer, Cham. https://doi.org/10.1007/978-3-031-42753-4_16

DOI: https://doi.org/10.1007/978-3-031-42753-4_16
Published: 28 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42752-7
Online ISBN: 978-3-031-42753-4
eBook Packages: Computer Science, Computer Science (R0)
