Abstract
Storing data in different systems, such as SQL and NoSQL stores, makes unified querying difficult: multiple storage systems lead to heterogeneous data structures and to multiple query languages. For data that is horizontally distributed in disjoint fragments, this paper proposes a system that lets the user query a polystore natively, without having to deal with data distribution or heterogeneity. Our approach relies on two mechanisms: (i) mapping dictionaries that define the navigation between systems, and (ii) operator rewriting mechanisms that translate native query operators (selection, projection, aggregation and join) so that queries can be executed on any system of the polystore. Using a dataset from the TPC-H benchmark, horizontally distributed between a document and a relational database management system, we conduct experiments showing that the rewriting process has minimal impact compared with executing the queries in both systems.
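To make the two mechanisms more concrete, the following is a minimal sketch of how a mapping dictionary could drive the rewriting of a single selection operator, not the implementation described in the paper. The dictionary layout, the function names `rewrite_selection` and `merge_results`, the document-side collection and field names, and the `%s` placeholder style are all assumptions made for illustration; only the relational names `CUSTOMER` and `C_ACCTBAL` come from the TPC-H schema.

```python
from typing import Any

# Hypothetical mapping dictionary: for each logical entity, where its
# fragment lives in each store and how logical attributes are named there.
MAPPING = {
    "customer": {
        "relational": {
            "container": "CUSTOMER",  # TPC-H table name
            "fields": {"name": "C_NAME", "balance": "C_ACCTBAL"},
        },
        "document": {
            "container": "customers",  # assumed collection name
            "fields": {"name": "name", "balance": "account.balance"},
        },
    }
}

# Comparison operators supported by this sketch, per target language.
SQL_OPS = {"=": "=", ">": ">", "<": "<"}
MONGO_OPS = {"=": "$eq", ">": "$gt", "<": "$lt"}


def rewrite_selection(entity: str, attribute: str, op: str, value: Any) -> dict:
    """Rewrite one logical selection into one native query per store."""
    rel = MAPPING[entity]["relational"]
    doc = MAPPING[entity]["document"]

    # Relational side: parameterised SQL text plus its parameter tuple.
    sql = (
        f"SELECT * FROM {rel['container']} "
        f"WHERE {rel['fields'][attribute]} {SQL_OPS[op]} %s"
    )

    # Document side: a MongoDB-style filter on the mapped field path.
    mongo_filter = {doc["fields"][attribute]: {MONGO_OPS[op]: value}}

    return {
        "relational": (sql, (value,)),
        "document": (doc["container"], mongo_filter),
    }


def merge_results(relational_rows: list, document_rows: list) -> list:
    """Fragments are disjoint, so the union is a plain concatenation."""
    return list(relational_rows) + list(document_rows)


plan = rewrite_selection("customer", "balance", ">", 1000)
print(plan["relational"])
print(plan["document"])
```

In this sketch the user states the selection once against logical names, and each store receives a query in its own language. Because the horizontal fragments are assumed disjoint, combining the partial results needs no duplicate elimination.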
Acknowledgements
This work was supported by the French Government through the Territoire d’Innovation program, an action of the Grand Plan d’Investissement backed by France 2030, Toulouse Métropole and the GIS neOCampus.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
El Ahdab, L., Teste, O., Megdiche, I., Peninou, A. (2023). A Polystore Querying System Applied to Heterogeneous and Horizontally Distributed Data. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14146. Springer, Cham. https://doi.org/10.1007/978-3-031-39847-6_35
DOI: https://doi.org/10.1007/978-3-031-39847-6_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39846-9
Online ISBN: 978-3-031-39847-6
eBook Packages: Computer Science, Computer Science (R0)