Query Reverse Engineering of Pre-deleted Uncorrelated Operators

  • Conference paper
Data Mining and Big Data (DMBD 2023)

Abstract

Work tasks increasingly rely on data processing, yet many users lack the programming background to write complex programs, query statements in particular. Query Reverse Engineering (QRE) addresses this by working in reverse: given a database and the desired output table, it derives a query statement that produces that table. SQUARES, which is built on a Domain-Specific Language (DSL), is one of the most advanced models in this field. However, DSL operators that are uncorrelated with the target query constrain its synthesis efficiency. This paper proposes PdQRE, built on SQUARES, which improves efficiency by predicting whether each DSL operator is correlated with the query statement and pre-deleting the uncorrelated operators. On the test-55 dataset, PdQRE raises the synthesis rate from 80.0% to 89.1% and reduces the average synthesis time from 251 s to 127 s relative to SQUARES. A comparison with Scythe and other models on the Recent posts dataset shows that PdQRE outperforms them in query synthesis.
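
The idea of pre-deleting operators can be illustrated with a toy enumerative synthesizer. The sketch below is not PdQRE or SQUARES: the five operators, the example table, and the hard-coded `scores` dictionary (standing in for whatever correlation predictor the paper uses) are all assumptions made for illustration. It only shows why removing operators before the search reduces the number of candidate programs that must be tried.

```python
from itertools import product

# Toy DSL: every operator maps a table (list of tuples) to a table.
# Rows are (name, dept, salary). All operators here are hypothetical.
def select_high_salary(t):
    return [r for r in t if r[2] > 50]

def project_name(t):
    return [(r[0],) for r in t]

def distinct(t):
    return list(dict.fromkeys(t))

def sort_rows(t):
    return sorted(t)

def head_one(t):
    return t[:1]

OPERATORS = {
    "distinct": distinct,
    "sort_rows": sort_rows,
    "head_one": head_one,
    "select_high_salary": select_high_salary,
    "project_name": project_name,
}

def synthesize(inp, out, operators, max_depth=3):
    """Enumerate operator sequences by increasing length.

    Returns (program, tried): the first sequence mapping inp to out
    (or None) and the number of candidates evaluated.
    """
    tried = 0
    for depth in range(1, max_depth + 1):
        for names in product(operators, repeat=depth):
            tried += 1
            table = inp
            try:
                for name in names:
                    table = operators[name](table)
            except (IndexError, TypeError):
                continue  # ill-typed candidate, e.g. select after project
            if table == out:
                return names, tried
    return None, tried

def pre_delete(operators, scores, threshold=0.5):
    """Keep only operators whose predicted correlation reaches the threshold."""
    return {n: f for n, f in operators.items() if scores.get(n, 0.0) >= threshold}

inp = [("ann", "eng", 60), ("bob", "ops", 40), ("cy", "eng", 70)]
out = [("ann",), ("cy",)]

# Assumed predictor output: in a real system these would come from a model.
scores = {"select_high_salary": 0.9, "project_name": 0.8,
          "distinct": 0.2, "sort_rows": 0.1, "head_one": 0.3}

program_full, tried_full = synthesize(inp, out, OPERATORS)
program_pruned, tried_pruned = synthesize(inp, out, pre_delete(OPERATORS, scores))
print(program_full, tried_full)
print(program_pruned, tried_pruned)
```

In this toy setting both searches return the same two-step program, but the pruned search evaluates far fewer candidates. The sketch also makes the trade-off visible: if the predictor wrongly scores a required operator below the threshold, the pruned search cannot find the program at all.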

Author information

Corresponding author

Correspondence to Quansheng Dou.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dou, Q., Wang, H., Tang, H., Pan, H., Zhang, S. (2024). Query Reverse Engineering of Pre-deleted Uncorrelated Operators. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2023. Communications in Computer and Information Science, vol 2018. Springer, Singapore. https://doi.org/10.1007/978-981-97-0844-4_4

  • DOI: https://doi.org/10.1007/978-981-97-0844-4_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0843-7

  • Online ISBN: 978-981-97-0844-4

  • eBook Packages: Computer Science, Computer Science (R0)
