Log in

A theory of formal synthesis via inductive learning

  • Original Article
  • Published:
Acta Informatica Aims and scope Submit manuscript

Abstract

Formal synthesis is the process of generating a program satisfying a high-level formal specification. In recent times, effective formal synthesis methods have been proposed based on the use of inductive learning. We refer to this class of methods that learn programs from examples as formal inductive synthesis. In this paper, we present a theoretical framework for formal inductive synthesis. We discuss how formal inductive synthesis differs from traditional machine learning. We then describe oracle-guided inductive synthesis (OGIS), a framework that captures a family of synthesizers that operate by iteratively querying an oracle. An instance of OGIS that has had much practical impact is counterexample-guided inductive synthesis (CEGIS). We present a theoretical characterization of CEGIS for learning any program that computes a recursive language. In particular, we analyze the relative power of CEGIS variants where the types of counterexamples generated by the oracle varies. We also consider the impact of bounded versus unbounded memory available to the learning algorithm. In the special case where the universe of candidate programs is finite, we relate the speed of convergence to the notion of teaching dimension studied in machine learning theory. Altogether, the results of the paper take a first step towards a theoretical foundation for the emerging field of formal inductive synthesis.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Germany)

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

Notes

  1. CEGIS techniques in literature [37, 60] initiate search for correct program using positive examples and use specification to obtain additional positive examples corresponding to counterexamples.

  2. Note that we can extend this definition to include counterexamples of size bounded by that of the largest positive example seen so far plus a constant. The proof arguments given in Sect. 5 continue to work with only minor modifications.

  3. This holds due to the specialization of \(\Phi \) to a partial specification, and as a trace property. For general \(\Phi \), the learner need not exclude all counterexamples.

  4. In this framework, a synthesis engine is only required to converge to the correct concept without requiring it to recognize it has converged and terminate. For a finite concept or language, termination can be trivially guaranteed when the oracle is assumed to be non-redundant and does not repeat examples.

References

  1. Alur, R., Bodik, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-Guided Synthesis. In: Proceedings of the IEEE International Conference on Formal Methods in Computer-Aided Design (FMCAD) (2013)

  2. Angluin, D.: Inductive inference of formal languages from positive data. Inf. Control 45, 117–135 (1980). doi:10.1016/S0019-9958(80)90285-5

    Article  MathSciNet  MATH  Google Scholar 

  3. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  4. Angluin, D.: Queries and concept learning. Mach. Learn. 2(4), 319–342 (1988). doi:10.1023/A:1022821128753

    MathSciNet  Google Scholar 

  5. Angluin, D.: Queries revisited. Theoretical computer science. Algorithmic learning theory 313(2), 175–194 (2004). doi:10.1016/j.tcs.2003.11.004. http://www.sciencedirect.com/science/article/pii/S030439750300608X

  6. Angluin, D., Smith, C.H.: Inductive inference: theory and methods. ACM Comput. Surv. 15, 237–269 (1983)

    Article  MathSciNet  Google Scholar 

  7. Atig, M.F., Bouajjani, A., Qadeer, S.: Context-bounded analysis for concurrent programs with dynamic creation of threads. Log. Methods Comput. Sci. (2011). doi:10.2168/LMCS-7(4:4)2011

    MathSciNet  MATH  Google Scholar 

  8. Barrett, C., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability, Chapter 8, vol. 4. IOS Press, Amsterdam (2009)

    Google Scholar 

  9. Bengio, Y., Goodfellow, I.J., Courville, A.: Deep Learning. Book in preparation for MIT Press (2015). http://www.iro.umontreal.ca/~bengioy/dlbook

  10. Biere, A.: Bounded model checking. In: Handbook of Satisfiability, pp. 457–481 (2009). doi:10.3233/978-1-58603-929-5-457

  11. Blum, L., Blum, M.: Toward a mathematical theory of inductive inference. Inf. Control 28(2), 125–155 (1975). doi:10.1016/s0019-9958(75)90261-2

    Article  MathSciNet  MATH  Google Scholar 

  12. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik–Chervonenkis dimension. J. ACM 36(4), 929–965 (1989). doi:10.1145/76359.76371

    Article  MathSciNet  MATH  Google Scholar 

  13. Bryant, R.E.: Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. C–35(8), 677–691 (1986)

    Article  MATH  Google Scholar 

  14. Chen, Y., Safarpour, S., Marques-Silva, J.: Automated design debugging with maximum satisfiability. IEEE Trans. CAD Integr. Circuits Syst. 29(11), 1804–1817 (2010). doi:10.1109/TCAD.2010.2061270

    Article  Google Scholar 

  15. Clarke, E.M., Emerson, E.A.: Design and synthesis of synchronization skeletons using branching-time temporal logic. In: Kozen, D. (ed.) Logic of Programs, Workshop. Springer, London (1981)

  16. Clarke, E.M., Grumberg, O., Peled, D.A.: Model Checking. MIT Press, Cambridge (2000)

    Google Scholar 

  17. Clarkson, M.R., Schneider, F.B.: Hyperproperties. J. Comput. Secur. 18(6), 1157–1210 (2010)

    Article  Google Scholar 

  18. Giannakopoulou, D., Pasareanu, C.S. (eds.): Special issue on learning techniques for compositional reasoning. Formal Methods in System Design 32(3), pp. 173–174 (2008)

  19. Gold, E.M.: Language identification in the limit. Inf. Control 10(5), 447–474 (1967). doi:10.1016/S0019-9958(67)91165-5

    Article  MathSciNet  MATH  Google Scholar 

  20. Goldman, S.A., Kearns, M.J.: On the complexity of teaching. J. Comput. Syst. Sci. 50, 303–314 (1992)

    MathSciNet  Google Scholar 

  21. Goldman, S.A., Rivest, R.L., Schapire, R.E.: Learning binary relations and total orders. SIAM J. Comput. 22(5), 1006–1034 (1993). doi:10.1137/0222062

    Article  MathSciNet  MATH  Google Scholar 

  22. Gordon, M.J.C., Melham, T.F.: Introduction to HOL: A Theorem Proving Environment for Higher-Order Logic. Cambridge University Press, Cambridge (1993)

    MATH  Google Scholar 

  23. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules. In: ACM SIGPLAN Notices, 47, ACM, pp. 405–416 (2012)

  24. Gulwani, S., Jha, S., Tiwari, A., Venkatesan, R.: Synthesis of loop-free programs. In: PLDI, pp. 62–73 (2011). doi:10.1145/1993498.1993506

  25. Hegedűs, T.: Geometrical concept learning and convex polytopes. In: Proceedings of the Seventh Annual Conference on Computational Learning Theory, COLT ’94, ACM, New York, NY, USA, pp. 228–236 (1994). doi:10.1145/180139.181124

  26. Jackson, J.C.: An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. J. Comput. Syst. Sci. 55(3), 414–440 (1997). doi:10.1006/jcss.1997.1533

    Article  MATH  Google Scholar 

  27. Jain, S.: Systems that Learn: An Introduction to Learning Theory. MIT Press, Cambridge (1999)

    Google Scholar 

  28. Jain, S., Kinber, E.: Iterative learning from positive data and negative counterexamples. Inf. Comput. 205(12), 1777–1805 (2007). doi:10.1016/j.ic.2007.09.001

    Article  MathSciNet  MATH  Google Scholar 

  29. Jantke, K.P., Beick, H.-R.: Combining Postulates of Naturalness in Inductive Inference. Elektronische Informationsverarbeitung und Kybernetik 17(8/9), 465–484 (1981)

    MathSciNet  MATH  Google Scholar 

  30. Jha, S., Seshia, S.A.: A theory of formal synthesis via inductive learning. Ar**v e-prints (2015)

  31. Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Oracle-guided Component-based Program Synthesis. ICSE ’10, ACM, New York, NY, USA, pp. 215–224 (2010). doi:10.1145/1806799.1806833

  32. Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Synthesizing switching logic for safety and dwell-time requirements. In: Proceedings of the International Conference on Cyber-Physical Systems (ICCPS), pp. 22–31 (2010)

  33. Jha, S., Seshia, S.A.: Are there good mistakes? a theoretical analysis of CEGIS. In: 3rd Workshop on Synthesis (SYNT) (2014)

  34. Jha, S., Seshia, S.A., Tiwari, A.: Synthesis of optimal switching logic for hybrid systems. In: Proceedings of the international conference on embedded software (EMSOFT), pp. 107–116 (2011)

  35. Jha, S., Seshia, S.A., Zhu, X.: On the teaching dimension of octagons for formal synthesis. In: 5th Workshop on Synthesis (SYNT) (2016)

  36. Jha, S.K.: Towards automated system synthesis using SCIDUCTION. Ph.D. thesis, EECS Department, University of California, Berkeley (2011). http://www.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-118.html

  37. **, X., Donzé, A., Deshmukh, J., Seshia, S.A.: Mining requirements from closed-loop control models. In: HSCC (2013)

  38. Kaufmann, M., Manolios, P., Moore, J.S.: Computer-Aided Reasoning: An Approach. Kluwer Academic Publishers, Dordrecht (2000)

    Google Scholar 

  39. Kuncak, V., Mayer, M., Piskac, R., Suter, P.: Software synthesis procedures. Commun. ACM 55(2), 103–111 (2012)

    Article  Google Scholar 

  40. Lange, S.: Algorithmic Learning of Recursive Languages. Mensch-und-Buch-Verlag, Berlin (2000)

    Google Scholar 

  41. Lange, S., Zeugmann, T., Zilles, S.: Learning indexed families of recursive languages from positive data: a survey. Theor. Comput. Sci. 397(1–3), 194–232 (2008). doi:10.1016/j.tcs.2008.02.030

    Article  MathSciNet  MATH  Google Scholar 

  42. Lange, S., Zilles, S.: Formal language identification: query learning vs. gold-style learning. Inf. Process. Lett. 91(6), 285–292 (2004). doi:10.1016/j.ipl.2004.05.010

  43. Li, W.: Specification mining: new formalisms, algorithms and applications. Ph.D. thesis, EECS Department, University of California, Berkeley (2014)

  44. Li, W., Dworkin, L., Seshia, S.A.: Mining assumptions for synthesis. In: 2011 9th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE), pp. 43–50 (2011)

  45. Malik, S., Zhang, L.: Boolean satisfiability: from theoretical hardness to practical success. Commun. ACM (CACM) 52(8), 76–82 (2009). doi:10.1145/1536616.1536637

    Article  Google Scholar 

  46. Manna, Z., Waldinger, R.: A deductive approach to program synthesis. ACM Trans. Program. Lang. Syst. 2(1), 90–121 (1980). doi:10.1145/357084.357090

    Article  MATH  Google Scholar 

  47. Mitchell, T.M.: Machine Learning, 1st edn. McGraw-Hill Inc, New York (1997)

    MATH  Google Scholar 

  48. Morgado, A., Liffiton, M., Marques-Silva, J.: MaxSAT-based MCS enumeration. In: Biere, A., Nahir, A., Vos, T. (eds.) Hardware and Software: Verification and Testing, Lecture Notes in Computer Science 7857, Springer Berlin Heidelberg, pp. 86–101 (2013). doi:10.1007/978-3-642-39611-3_13

  49. Owre, S., Rushby, J.M., Shankar, N.: PVS: a prototype verification system. In: Kapur, D., (ed.) In: 11th International Conference on Automated Deduction (CADE), Lecture Notes in Artificial Intelligence 607, Springer-Verlag, pp. 748–752 (1992)

  50. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: ACM Symposium on Principles of Programming Languages (POPL), pp. 179–190 (1989)

  51. Queille, J.-P., Sifakis, J.: Specification and verification of concurrent systems in CESAR. In: Symposium on programming, LNCS 137, pp. 337–351 (1982)

  52. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986). doi:10.1023/A:1022643204877

  53. Rogers Jr., H.: Theory of Recursive Functions and Effective Computability. MIT Press, Cambridge (1987)

    MATH  Google Scholar 

  54. Salzberg, S., Delcher, A.L., Heath, D., Kasif, S.: Best-case results for nearest-neighbor learning. IEEE Trans. Pattern Anal. Mach. Intell. 17(6), 599–608 (1995). doi:10.1109/34.387506

    Article  Google Scholar 

  55. Seshia, S.A.: Sciduction: combining induction, deduction, and structure for verification and synthesis. In: Proceedings of the Design Automation Conference (DAC), pp. 356–365 (2012)

  56. Seshia, S.A.: Combining induction, deduction, and structure for verification and synthesis. Proc. IEEE 103(11), 2036–2051 (2015)

    Article  Google Scholar 

  57. Shapiro, E.Y.: Algorithmic Program Debugging. MIT Press, Cambridge (1982)

    MATH  Google Scholar 

  58. Shinohara, A., Miyano, S.: Teachability in computational learning. In: ALT, pp. 247–255 (1990)

  59. Solar-Lezama, A., Rabbah, R., Bodík, R., Ebcioglu, K.: Programming by sketching for bit-streaming programs. In: PLDI (2005)

  60. Solar-Lezama, A., Tancau, L., Bodk, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs. In: ASPLOS, pp. 404–415 (2006). doi:10.1145/1168857.1168907

  61. Srivastava, S., Gulwani, S., Foster, J.S.: From program verification to program synthesis. In: Proceedings of ACM Symposium on Principles of Programming Languages, pp. 313–326 (2010)

  62. Summers, P.D.: A methodology for LISP program construction from examples. J. ACM 24(1), 161–175 (1977)

    Article  MathSciNet  MATH  Google Scholar 

  63. Udupa, A., Raghavan, A., Deshmukh, J.V., Mador-Haim, S., Martin, M.M.K., Alur, R.: Transit: Specifying protocols with concolic snippets, In: Proceedings of the 34th ACM SIGPLAN conference on Programming Language Design and Implementation, pp. 287–296 (2013)

  64. Valiant, L.G.: A theory of the learnable. Commun. ACM 27, 1134–1142 (1984)

    Article  MATH  Google Scholar 

  65. Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities 16(2), pp. 264–280 (1971). doi:10.1137/1116025

  66. Weisberg, S.: Applied linear regression, 3rd edn. Wiley, Hoboken (2005). http://www.stat.umn.edu/alr

  67. Wiehagen, R.: Limit detection of recursive functions by specific strategies. Electron. Inf. Process. Cybernet. 12(1/2), 93–99 (1976)

    MathSciNet  Google Scholar 

  68. Wiehagen, R.: A thesis in inductive inference. In: Dix, J., Jantke, K.P., Schmitt, P.H. (eds.) Nonmonotonic and inductive logic, Lecture Notes in Computer Science 543, Springer, pp. 184–207 (1990). doi:10.1007/BFb0023324

  69. Winskel, G.: The Formal Semantics of Programming Languages: An Introduction. MIT Press, Cambridge (1993)

    MATH  Google Scholar 

Download references

Acknowledgements

We thank the anonymous reviewers for their detailed and helpful comments. This work was supported in part by the National Science Foundation (Grants CCF-1139138 and CNS-1545126), DARPA under agreement number FA8750-16-C-0043, the Toyota Motor Corporation under the CHESS center, a gift from Microsoft Research, and the TerraSwarm Research Center, one of six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Susmit Jha.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Jha, S., Seshia, S.A. A theory of formal synthesis via inductive learning. Acta Informatica 54, 693–726 (2017). https://doi.org/10.1007/s00236-017-0294-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00236-017-0294-5

Navigation