Abstract
With the rapid growth of web data, information about the same target gathered from multiple sources often conflicts. This problem motivates truth discovery, which aims to automatically resolve conflicts and find the truth among multiple conflicting claims. Existing truth discovery methods are mainly based on iterative updates or probabilistic models. A common limitation of these methods is that their models are complex to build. In this paper, we propose a concise end-to-end deep neural network for truth discovery, which treats the task as a classification problem. First, for each target, we extract its unique claims, and for each unique claim we construct a source-unique-claim vector indicating whether each source provides this value. Then, on the training dataset, we label each vector as true or false according to the ground truth. Finally, we use a deep neural network to build a classification model for each target to judge which claim is the truth. Experimental results on two real-world datasets show that our proposed model outperforms existing state-of-the-art methods.
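The first two steps described in the abstract (building a source-unique-claim vector per unique claim, then labeling it against the ground truth) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the binary 0/1 encoding, the function names `build_source_claim_vectors` and `label_vectors`, and the toy flight data are all assumptions made for the example, and the neural network classifier itself is omitted.

```python
from collections import defaultdict


def build_source_claim_vectors(claims, sources):
    """Build one binary vector per (target, unique claimed value).

    claims: iterable of (source, target, value) triples.
    Position i of a vector is 1 if sources[i] provided that value
    for that target, else 0 (assumed encoding).
    """
    index = {s: i for i, s in enumerate(sources)}
    vectors = defaultdict(dict)
    for source, target, value in claims:
        vec = vectors[target].setdefault(value, [0] * len(sources))
        vec[index[source]] = 1
    return {t: dict(by_value) for t, by_value in vectors.items()}


def label_vectors(vectors, ground_truth):
    """Attach a true/false label (1/0) to each vector whose target
    has a known ground-truth value; other targets are skipped."""
    rows = []
    for target, by_value in vectors.items():
        if target not in ground_truth:
            continue
        for value, vec in by_value.items():
            label = int(value == ground_truth[target])
            rows.append((target, value, vec, label))
    return rows


# Hypothetical toy data: three sources disagree on one target.
sources = ["s1", "s2", "s3"]
claims = [
    ("s1", "flight_AA1_gate", "B12"),
    ("s2", "flight_AA1_gate", "B12"),
    ("s3", "flight_AA1_gate", "C4"),
]
vectors = build_source_claim_vectors(claims, sources)
rows = label_vectors(vectors, {"flight_AA1_gate": "B12"})
```

Each row in `rows` pairs a fixed-length vector with a label, which is the shape of input a binary classifier would expect in the final step.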
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 61872168, 61702237) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Nos. KYCX20_2382, KYCX20_2396).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, H., Dong, Y., Gu, Q., Liu, Y. (2020). An End-to-End Deep Neural Network for Truth Discovery. In: Wang, G., Lin, X., Hendler, J., Song, W., Xu, Z., Liu, G. (eds) Web Information Systems and Applications. WISA 2020. Lecture Notes in Computer Science, vol 12432. Springer, Cham. https://doi.org/10.1007/978-3-030-60029-7_35
DOI: https://doi.org/10.1007/978-3-030-60029-7_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60028-0
Online ISBN: 978-3-030-60029-7
eBook Packages: Computer Science, Computer Science (R0)