
SW-Store: a vertically partitioned DBMS for Semantic Web data management

  • Special Issue Paper
The VLDB Journal

Abstract

Efficient management of RDF data is an important prerequisite for realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine the reasons why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving performance of RDF databases and consider a recent suggestion, “property tables”. We then discuss practically and empirically why this solution has undesirable features. As an improvement, we propose an alternative solution: vertically partitioning the RDF data. We compare the performance of vertical partitioning with prior art on queries generated by a Web-based RDF browser over a large-scale (more than 50 million triples) catalog of library data. Our results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. Further, if a column-oriented DBMS (a database architected specially for the vertically partitioned case) is used instead of a row-oriented DBMS, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds. Encouraged by these results, we describe the architecture of SW-Store, a new DBMS we are actively building that implements these techniques to achieve high performance RDF data management.
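
As a rough illustration of the vertical partitioning idea named in the abstract, the sketch below groups RDF triples into one two-column (subject, object) table per property and answers a simple conjunctive lookup by intersecting per-property matches. It is a minimal in-memory sketch, not the SW-Store implementation: the function names `vertically_partition` and `subjects_matching` and the sample triples are hypothetical, and a real system would store each table in a DBMS and join on the subject column.

```python
from collections import defaultdict


def vertically_partition(triples):
    """Group (subject, property, object) triples into one
    two-column (subject, object) table per property."""
    tables = defaultdict(list)
    for subject, prop, obj in triples:
        tables[prop].append((subject, obj))
    # Keep each table sorted by subject (then object) for a deterministic layout.
    return {prop: sorted(rows) for prop, rows in tables.items()}


def subjects_matching(tables, conditions):
    """Return the subjects that carry every (property, object) pair
    in `conditions`, touching only the tables for those properties."""
    result = None
    for prop, obj in conditions:
        matches = {s for s, o in tables.get(prop, []) if o == obj}
        result = matches if result is None else result & matches
    return sorted(result or [])


# Hypothetical sample data, loosely modeled on a library catalog.
triples = [
    ("book1", "type", "Text"),
    ("book1", "language", "fre"),
    ("book2", "type", "Text"),
    ("book2", "language", "eng"),
    ("cd1", "type", "Sound"),
]

tables = vertically_partition(triples)
french_texts = subjects_matching(tables, [("type", "Text"), ("language", "fre")])
```

The lookup reads only the `type` and `language` tables, which is the property a vertically partitioned layout is meant to provide: properties that a query does not mention are never scanned.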



Author information

Corresponding author

Correspondence to Daniel J. Abadi.

About this article

Cite this article

Abadi, D.J., Marcus, A., Madden, S.R. et al. SW-Store: a vertically partitioned DBMS for Semantic Web data management. The VLDB Journal 18, 385–406 (2009). https://doi.org/10.1007/s00778-008-0125-y

  • DOI: https://doi.org/10.1007/s00778-008-0125-y
