Benchmarking Knowledge Graph Embeddings

Paulheim, Heiko; Ristoski, Petar; Portisch, Jan

doi:10.1007/978-3-031-30387-6_3

Part of the book series: Synthesis Lectures on Data, Semantics, and Knowledge (SLDSK)

Abstract

RDF2vec (and other techniques) provide embedding vectors for knowledge graphs. While we have used a simple node classification task so far, this chapter introduces a few datasets and three common benchmarks for embedding methods—i.e., SW4ML, GEval, and DLCC—and shows how to use them for comparing different variants of RDF2vec. The novel DLCC benchmark allows us to take a closer look at what RDF2vec vectors actually represent, and to analyze what proximity in the vector space means for them.

Notes

1. http://w3id.org/sw4ml-datasets.
2. http://dl-learner.org/.
3. http://data.bgs.ac.uk/.
4. https://github.com/janothan/DL-TC-Generator/tree/master/src/main/resources/queries.
5. The desired size classes can be configured in the framework.
6. Since the classification tasks are balanced, random guessing would yield an accuracy of 0.5.
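
The 0.5 baseline in the last note can be checked with a short calculation. The following is a minimal sketch, not part of the benchmark frameworks themselves: the label list is a hypothetical balanced binary task, and the helper names `majority_baseline_accuracy` and `random_guess_accuracy` are illustrative assumptions.

```python
import random


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = {c: labels.count(c) for c in set(labels)}
    return max(counts.values()) / len(labels)


def random_guess_accuracy(labels, seed=0):
    """Accuracy of guessing a class uniformly at random for every instance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    classes = sorted(set(labels))
    guesses = [rng.choice(classes) for _ in labels]
    correct = sum(g == y for g, y in zip(guesses, labels))
    return correct / len(labels)


# Hypothetical balanced task: 5,000 positive and 5,000 negative instances.
labels = [1] * 5000 + [0] * 5000

majority = majority_baseline_accuracy(labels)
simulated = random_guess_accuracy(labels)

print(f"majority-class baseline:   {majority:.3f}")
print(f"simulated random guessing: {simulated:.3f}")
```

On a balanced binary task the majority-class baseline is exactly 0.5, and simulated uniform guessing fluctuates closely around that value, so a classifier trained on embeddings is only informative to the extent that it clearly exceeds 0.5.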

Author information

Authors and Affiliations

University of Mannheim, Mannheim, Germany
Heiko Paulheim
eBay (United States), San Jose, CA, USA
Petar Ristoski
SAP SE, Walldorf, Germany
Jan Portisch

Corresponding author

Correspondence to Heiko Paulheim.

About this chapter

Cite this chapter

Paulheim, H., Ristoski, P., Portisch, J. (2023). Benchmarking Knowledge Graph Embeddings. In: Embedding Knowledge Graphs with RDF2vec. Synthesis Lectures on Data, Semantics, and Knowledge. Springer, Cham. https://doi.org/10.1007/978-3-031-30387-6_3

DOI: https://doi.org/10.1007/978-3-031-30387-6_3
Published: 04 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30386-9
Online ISBN: 978-3-031-30387-6
eBook Packages: Synthesis Collection of Technology (R0); eBColl Synthesis Collection 12
