Versatile XQuery Processing in MapReduce

Sauer, Caetano; Bächle, Sebastian; Härder, Theo

doi:10.1007/978-3-642-40683-6_16

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8133)

Included in the following conference series:

East European Conference on Advances in Databases and Information Systems

Abstract

The MapReduce (MR) framework has become a standard tool for performing large batch computations, usually of an aggregative nature, in parallel over a cluster of commodity machines. A significant share of typical MR jobs involves standard database-style queries, for which specifying map and reduce functions from scratch is cumbersome. To overcome this burden, higher-level languages such as HiveQL, PigLatin, and JAQL have been proposed to allow the automatic generation of MR jobs from declarative queries. We identify two major problems with these existing solutions: (i) they introduce new query languages and implement systems from scratch for the sole purpose of expressing MR jobs; and (ii) despite solving some of the major limitations of SQL, they still lack the flexibility required by big data applications. We propose BrackitMR, an approach based on the XQuery language with extended JSON support. XQuery is not only an established query language, but also has a more expressive data model and more powerful language constructs, enabling a much greater degree of flexibility. From a system design perspective, we extend an existing single-node query processor, Brackit, adding MR as a distributed coordination layer. Such heavy reuse of the standard query processor not only yields good performance, but also allows for a more elegant design that transparently integrates MR processing into a generic query engine.
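
To make the burden mentioned in the abstract concrete, the following is a minimal sketch of what a hand-written MR job for a simple database-style aggregation (total amount per customer) might look like. It is not taken from the paper and does not reflect BrackitMR's implementation; the record layout, the names `ORDERS`, `map_fn`, `reduce_fn`, and `run_job`, and the in-memory "shuffle" are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical input records; the field names are assumptions for this sketch.
ORDERS = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 12},
    {"customer": "alice", "amount": 8},
]


def map_fn(record):
    """Map phase: emit one (grouping key, value) pair per input record."""
    yield record["customer"], record["amount"]


def reduce_fn(key, values):
    """Reduce phase: aggregate all values that share the same key."""
    return key, sum(values)


def run_job(records, mapper, reducer):
    """Simulate a single MR job in memory: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle step
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


totals = run_job(ORDERS, map_fn, reduce_fn)
print(totals)
```

In a declarative language, the same computation is a single grouping query, and the map and reduce functions above are what a compiler would have to derive from it automatically.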

Author information

Authors and Affiliations

University of Kaiserslautern, P.O. Box 3049, 67653, Kaiserslautern, Germany
Caetano Sauer, Sebastian Bächle & Theo Härder

Editor information

Editors and Affiliations

Università di Genova, Italy
Barbara Catania
DIBRIS, Università di Genova, Italy
Giovanna Guerrini
Department of Software Engineering Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 11800, Prague 1, Czech Republic
Jaroslav Pokorný

About this paper

Cite this paper

Sauer, C., Bächle, S., Härder, T. (2013). Versatile XQuery Processing in MapReduce. In: Catania, B., Guerrini, G., Pokorný, J. (eds) Advances in Databases and Information Systems. ADBIS 2013. Lecture Notes in Computer Science, vol 8133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40683-6_16

DOI: https://doi.org/10.1007/978-3-642-40683-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40682-9
Online ISBN: 978-3-642-40683-6
eBook Packages: Computer Science, Computer Science (R0)
