Incorporating topic and property for knowledge base synchronization

  • Regular Paper
Knowledge and Information Systems

Abstract

Open-domain knowledge bases are widely used in many applications, and keeping them fresh is critical. Most existing studies update an open knowledge base by predicting the change frequencies of its entities and then updating the unstable ones. However, a knowledge base contains diverse entities and properties with complex structural information, and many entities are time-sensitive. In this work, we propose a novel topic-aware entity stability prediction framework that incorporates the property and topic features of entities to update the knowledge base efficiently. To handle the complex entity structure and the variety of entity properties, we first build an entity property graph for each entity, with its property names as edges and its property values as nodes. Then, using the constructed entity property graph, we analyze the topic information of the entities and propose a topic classifier based on unsupervised clustering to further improve prediction accuracy. To address time sensitivity, we measure the monthly average update frequency of each entity, based on its revision history acquired from the source encyclopedia webpage, and use it as the basis for labeling the entity's stability. Finally, we formulate the prediction task as a binary classification problem and solve it with an entity stability predictor, in which the topic information serves as strong supervision. Extensive experiments on collections of real-world entities demonstrate the superior performance of the proposed method and show the benefit of each new module in the framework.
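
As an illustration of two steps named in the abstract, the following minimal Python sketch builds an entity property graph (property values as nodes, property names as edge labels) and derives a binary stability label from the monthly average update frequency of a revision history. It is not the authors' implementation: the function names, the sample entity, the way months are counted, and the threshold of 1.0 revisions per month are all assumptions made for this example.

```python
from datetime import datetime


def build_property_graph(entity, properties):
    """Return (nodes, edges) for one entity.

    Property values become nodes; property names label the edges
    that connect the entity to those values.
    """
    nodes = {entity}
    edges = []
    for name, values in properties.items():
        for value in values:
            nodes.add(value)
            edges.append((entity, name, value))
    return nodes, edges


def monthly_update_frequency(revision_times):
    """Average revisions per calendar month spanned by the history.

    Counting months inclusively from first to last revision is an
    assumption of this sketch.
    """
    if not revision_times:
        return 0.0
    first, last = min(revision_times), max(revision_times)
    months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    return len(revision_times) / months


def label_stability(revision_times, threshold=1.0):
    """Return 1 for unstable, 0 for stable.

    The threshold value is hypothetical, not taken from the paper.
    """
    return 1 if monthly_update_frequency(revision_times) > threshold else 0


# Hypothetical entity and revision history, for illustration only.
nodes, edges = build_property_graph(
    "Example City",
    {"country": ["Exampleland"], "mayor": ["A. Person"]},
)
revisions = [
    datetime(2023, 1, 5), datetime(2023, 1, 20), datetime(2023, 2, 11),
    datetime(2023, 3, 2), datetime(2023, 3, 30), datetime(2023, 3, 31),
]
frequency = monthly_update_frequency(revisions)
label = label_stability(revisions)
```

In the framework described above, labels of this kind would serve as targets for the binary entity stability predictor, while the graph would feed the property and topic features; how the paper sets its labeling threshold is not stated in the abstract.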

Availability of data and materials

The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.

Notes

  1. https://baike.baidu.com.

  2. https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual.

  3. https://www.wikipedia.org.

  4. http://kw.fudan.edu.cn/cndbpedia/download.

Funding

This work was supported by the National Natural Science Foundation of China (No. 61876186).

Author information

Contributions

Jiajun Tong contributed to the conception of the study and manuscript preparation. Zhixiao Wang helped perform the analysis with constructive discussions. Xiaobin Rui contributed significantly to analysis and manuscript preparation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhixiao Wang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Ethical Approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tong, J., Wang, Z. & Rui, X. Incorporating topic and property for knowledge base synchronization. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02160-0

  • DOI: https://doi.org/10.1007/s10115-024-02160-0
