Generating Possible Interpretations for Statistics from Linked Open Data

Paulheim, Heiko

doi:10.1007/978-3-642-30284-8_44


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7295)

Included in the following conference series: Extended Semantic Web Conference


Abstract

Statistics are very present in our daily lives. Every day, new statistics are published, showing the perceived quality of living in different cities, the corruption index of different countries, and so on. Interpreting those statistics, on the other hand, is a difficult task. Often, statistics collect only very few attributes, and it is difficult to come up with hypotheses that explain, e.g., why the perceived quality of living in one city is higher than in another. In this paper, we introduce Explain-a-LOD, an approach which uses data from Linked Open Data for generating hypotheses that explain statistics. We show an implemented prototype and compare different approaches for generating hypotheses by analyzing the perceived quality of those hypotheses in a user study.
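The abstract does not say how hypotheses are generated or scored. As an illustrative sketch only, assume that attributes retrieved from Linked Open Data have already been reduced to one numeric value per entity. One simple way to surface candidate hypotheses is then to rank those attributes by their correlation with the statistic. The function names, the threshold `min_abs_r`, and the toy values below are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant attribute cannot explain any variation
    return cov / sqrt(var_x * var_y)


def rank_hypotheses(target, attributes, min_abs_r=0.5):
    """Return (attribute, r, sentence) tuples, strongest correlation first.

    target:     entity -> value of the statistic to be explained
    attributes: attribute name -> (entity -> numeric value)
    min_abs_r:  hypothetical cut-off for reporting a hypothesis
    """
    entities = sorted(target)
    ys = [target[e] for e in entities]
    hypotheses = []
    for name, values in attributes.items():
        # Skip attributes that are missing for at least one entity.
        if any(e not in values for e in entities):
            continue
        r = pearson([values[e] for e in entities], ys)
        if abs(r) >= min_abs_r:
            direction = "high" if r > 0 else "low"
            sentence = (
                f"Entities with a high {name} tend to have "
                f"a {direction} value of the statistic."
            )
            hypotheses.append((name, r, sentence))
    return sorted(hypotheses, key=lambda h: abs(h[1]), reverse=True)


if __name__ == "__main__":
    # Toy, made-up numbers purely for demonstration.
    statistic = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
    lod_attributes = {
        "attr_up": {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0},
        "attr_down": {"A": 8.0, "B": 6.0, "C": 4.0, "D": 2.0},
        "attr_flat": {"A": 5.0, "B": 5.0, "C": 5.0, "D": 5.0},
    }
    for name, r, sentence in rank_hypotheses(statistic, lod_attributes):
        print(f"{name}: r={r:+.2f}  {sentence}")
```

A correlation found this way is only a candidate explanation, not evidence of causation, which is consistent with the abstract's framing of the output as hypotheses whose perceived quality is judged by users.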



Author information

Authors and Affiliations

Knowledge Engineering Group, Technische Universität Darmstadt, Germany
Heiko Paulheim


Editor information

Editors and Affiliations

Institute AIFB, Karlsruhe Institute of Technology, Englerstrasse 11, 76131, Karlsruhe, Germany
Elena Simperl
CITEC, University of Bielefeld, Morgenbreede 39, 33615, Bielefeld, Germany
Philipp Cimiano
Siemens AG Österreich, Siemensstrasse 90, 1210, Vienna, Austria
Axel Polleres
Technical University of Madrid, C/ Severo Ochoa, 13, 28660, Boadilla del Monte, Madrid, Spain
Oscar Corcho
STLab, ISTC-CNR, Via Nomentana 56, 00161, Rome, Italy
Valentina Presutti


About this paper

Cite this paper

Paulheim, H. (2012). Generating Possible Interpretations for Statistics from Linked Open Data. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds) The Semantic Web: Research and Applications. ESWC 2012. Lecture Notes in Computer Science, vol 7295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_44


DOI: https://doi.org/10.1007/978-3-642-30284-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30283-1
Online ISBN: 978-3-642-30284-8
eBook Packages: Computer Science, Computer Science (R0)
