Abstract
Recent years have seen an increasing reliance on data processing to accomplish work tasks. However, many users lack the programming background to write complex programs, especially query statements. Query Reverse Engineering (QRE) addresses this problem by deriving a query statement from a database and the desired output table. SQUARES, which is based on a Domain-Specific Language (DSL), is one of the most advanced models in this field. However, DSL operators that are uncorrelated with the target query limit its synthesis efficiency. This paper proposes PdQRE, built on SQUARES, which improves efficiency by predicting whether each DSL operator is correlated with the query statement and deleting uncorrelated operators before synthesis. On the test-55 dataset, PdQRE raises the synthesis rate from 80.0% to 89.1% and reduces the average synthesis time from 251 s to 127 s compared with SQUARES. A comparison with Scythe and other tools on the recent-posts dataset shows that PdQRE outperforms the other models in query synthesis.
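The pre-deletion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the operator names, the example scores, the 0.5 threshold, and the function `pre_delete` are all hypothetical, and the abstract does not say how the correlation predictor is built. The sketch only shows the filtering step, where operators predicted to be uncorrelated are dropped before the synthesizer enumerates candidate programs.

```python
from typing import Dict, List

# Hypothetical operator names for illustration only; the actual DSL used by
# SQUARES and PdQRE may differ.
DSL_OPERATORS: List[str] = [
    "filter", "summarise", "inner_join", "anti_join", "left_join", "union",
]


def pre_delete(
    operators: List[str],
    scores: Dict[str, float],
    threshold: float = 0.5,
) -> List[str]:
    """Return the operators whose predicted correlation score meets the threshold.

    Operators without a score are kept (treated as score 1.0), so a missing
    prediction never removes an operator. This default is an assumption.
    """
    return [op for op in operators if scores.get(op, 1.0) >= threshold]


# Made-up scores standing in for the output of some correlation predictor.
example_scores: Dict[str, float] = {
    "filter": 0.9,
    "summarise": 0.1,
    "inner_join": 0.8,
    "anti_join": 0.05,
    "left_join": 0.2,
}

reduced_dsl = pre_delete(DSL_OPERATORS, example_scores)
print(reduced_dsl)
```

With these illustrative scores, the synthesizer would search over three operators instead of six. Keeping unscored operators is a deliberately conservative choice here: wrongly deleting a needed operator makes the target query unreachable, whereas keeping an unneeded one only costs time.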
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dou, Q., Wang, H., Tang, H., Pan, H., Zhang, S. (2024). Query Reverse Engineering of Pre-deleted Uncorrelated Operators. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2023. Communications in Computer and Information Science, vol 2018. Springer, Singapore. https://doi.org/10.1007/978-981-97-0844-4_4
DOI: https://doi.org/10.1007/978-981-97-0844-4_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0843-7
Online ISBN: 978-981-97-0844-4
eBook Packages: Computer Science (R0)