Definitions
RDF, the Resource Description Framework, has been recognized as a de facto standard for describing resources in a semi-structured manner. In particular, RDF is a graph-based format that allows named links between resources to be defined in the form of triples (subject, predicate, object), also called statements. A statement expresses a relationship (defined by a predicate) between resources (subject and object). The relationship is directional: it always runs from subject to object. The same resource can appear in multiple triples in the same or different roles, e.g., as a subject in one triple and as a predicate or an object in another. This ability enables the definition of multiple connections between triples and hence the creation of a connected graph of data. Such a graph can be represented as nodes that stand for the resources and edges that capture the relationships between the...
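As a rough illustration of the triple model described above, the following Python sketch stores a few statements and derives a directed graph from them. The resource names (`ex:Alice`, `ex:knows`, and so on) and the helpers `build_graph` and `roles_of` are hypothetical and chosen only for this example; they do not come from any particular RDF system or library.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a directed, named link from subject to object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example data; the "ex:" names are illustrative only.
TRIPLES = [
    Triple("ex:Alice", "ex:knows", "ex:Bob"),
    Triple("ex:Bob", "ex:worksAt", "ex:Acme"),
    Triple("ex:Acme", "ex:locatedIn", "ex:Berlin"),
    # The resource used as a predicate above is a subject here.
    Triple("ex:knows", "rdfs:label", '"knows"'),
]


def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.predicate, t.obj))
    return graph


def roles_of(resource, triples):
    """Return the set of roles a resource plays across all triples."""
    found = set()
    for t in triples:
        if t.subject == resource:
            found.add("subject")
        if t.predicate == resource:
            found.add("predicate")
        if t.obj == resource:
            found.add("object")
    return found


graph = build_graph(TRIPLES)
for subject, edges in graph.items():
    for predicate, obj in edges:
        print(f"{subject} --{predicate}--> {obj}")
```

Because edges are stored only under their subject, the directionality of each statement is preserved: `ex:Bob` is reachable from `ex:Alice`, but not the other way around. Shared resources such as `ex:Bob` (object of one triple, subject of another) are what connect individual statements into a single graph.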
© 2019 Springer Nature Switzerland AG
About this entry
Cite this entry
Wylot, M., Sakr, S. (2019). Framework-Based Scale-Out RDF Systems. In: Sakr, S., Zomaya, A.Y. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-77525-8_225
DOI: https://doi.org/10.1007/978-3-319-77525-8_225
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77524-1
Online ISBN: 978-3-319-77525-8
eBook Packages: Computer Science, Reference Module Computer Science and Engineering