An intelligent natural language query processor for a relational database

  • Original Article
Iran Journal of Computer Science

Abstract

Every byte of data is stored in either a structured or an unstructured database. Because databases are ubiquitous, retrieving and processing this information is tedious. Retrieving information from a database requires basic knowledge of a query language such as SQL, DMX, or QUEL. Most people, however, are unfamiliar with these languages and find it difficult to write queries because they do not know the required structure and format. Queries also vary with the database used and with how the results need to be displayed. This can be addressed by an intelligent database system (IDBS) with natural language processing (NLP) capability. An intelligent natural language interface to a database (NLIDB) allows users to query the database in their own spoken language. This paper describes an intelligent NLIDB that takes English-language queries as input and converts them into the corresponding SQL queries for retrieving information. The NLP procedures recognize the tokens and predict the likelihood of generating clauses such as SELECT, WHERE, and FROM. A query translation algorithm maps the identified tokens to SQL tokens, and a template is used to generate the SQL queries. When the query translator fails, a query predictor based on maximum entropy generates the SQL query. The model was trained with generated queries and different combinations of chunk tags, and their restraints were predicted. The proposed NLP technique is implemented in the maximum entropy model, which either predicts SQL templates or generates SQL queries. The technique yields 100% correct results for the template-based system. For the prediction module, the system returns the most probable result matching the user query. The system consistently generates accurate results for a natural language query in either template mode or SQL query mode. Easy retrieval of data from a huge database using a local language is highly relevant. Adding other languages and training the model on them is left as future work.
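
The template-based path described in the abstract (recognize tokens, map them to SQL tokens, fill a SELECT/FROM/WHERE template) can be illustrated with a minimal sketch. This is not the paper's implementation: the lexicon, the table and column names, the operator phrases, and the functions `translate` and `parse_condition` are all hypothetical and chosen only for illustration. The sketch returns `None` when no template fits, which is the point at which the paper's maximum entropy predictor would take over; that predictor is not shown here.

```python
import re

# Hypothetical lexicon mapping English words to schema names (illustration only).
TABLE_WORDS = {"student": "student", "students": "student",
               "employee": "employee", "employees": "employee"}
COLUMN_WORDS = {"name": "name", "names": "name", "age": "age",
                "salary": "salary", "marks": "marks"}
# Hypothetical comparison phrases mapped to SQL operators.
OPERATOR_PHRASES = [("greater than", ">"), ("more than", ">"), ("less than", "<"),
                    ("equal to", "="), ("above", ">"), ("below", "<")]

# A single SQL template with slots for the three clauses.
TEMPLATE = "SELECT {columns} FROM {table}{where}"


def parse_condition(tail):
    """Turn e.g. 'age is greater than 20' into 'age > 20', or return None."""
    for phrase, op in OPERATOR_PHRASES:
        pattern = r"([a-z_]+)\s+(?:is\s+)?" + re.escape(phrase) + r"\s+(\d+)"
        match = re.search(pattern, tail)
        if match and match.group(1) in COLUMN_WORDS:
            return f"{COLUMN_WORDS[match.group(1)]} {op} {match.group(2)}"
    return None


def translate(question):
    """Map an English question to SQL, or return None if no template fits."""
    text = question.lower().strip(" ?.")
    # Words before 'whose/where/with' feed SELECT and FROM; the rest feeds WHERE.
    parts = re.split(r"\b(?:whose|where|with)\b", text, maxsplit=1)
    head = parts[0]
    tail = parts[1] if len(parts) > 1 else ""

    head_tokens = re.findall(r"[a-z_]+", head)
    table = next((TABLE_WORDS[t] for t in head_tokens if t in TABLE_WORDS), None)
    if table is None:
        return None  # translator fails; a fallback predictor would run here

    columns = []
    for token in head_tokens:
        column = COLUMN_WORDS.get(token)
        if column and column not in columns:
            columns.append(column)

    where = ""
    if tail:
        condition = parse_condition(tail)
        if condition is None:
            return None
        where = " WHERE " + condition

    return TEMPLATE.format(columns=", ".join(columns) or "*",
                           table=table, where=where)


print(translate("Show the names of students whose age is greater than 20?"))
print(translate("List all employees"))
```

Under these assumptions, the first call fills all three clauses and the second falls back to `SELECT *` because no column word is recognized. A question that mentions no known table yields `None` rather than a guessed query.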

Author information

Corresponding author

Correspondence to S. S. Vinod Chandra.

Ethics declarations

Conflict of Interest

The author declares that there is no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by the author.

Availability of data and material

The data sets can be accessed upon request. A sample data set is available at http://mirworks.in/downloads.php

Code availability

http://mirworks.in/downloads.php

Informed consent

This article does not contain any studies with human participants or animals performed by the author.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chandra, S.S.V. An intelligent natural language query processor for a relational database. Iran J Comput Sci 5, 109–115 (2022). https://doi.org/10.1007/s42044-021-00095-1

  • DOI: https://doi.org/10.1007/s42044-021-00095-1
