Log in

Fake news detection on social media using a natural language inference approach

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

Fake news detection is a challenging problem in online social media, with considerable social and political impacts. Several methods have already been proposed for the automatic detection of fake news, which are often based on the statistical features of the content or context of news. In this paper, we propose a novel fake news detection method based on Natural Language Inference (NLI) approach. Instead of using only statistical features of the content or context of the news, the proposed method exploits a human-like approach, which is based on inferring veracity using a set of reliable news. In this method, the related and similar news published in reputable news sources are used as auxiliary knowledge to infer the veracity of a given news item. We also collect and publish the first inference-based fake news detection dataset, called FNID, in two formats: the two-class version (FNID-FakeNewsNet) and the six-class version (FNID-LIAR). We use the NLI approach to boost several classical and deep machine learning models, including Decision Tree, Naïve Bayes, Random Forest, Logistic Regression, k-Nearest Neighbors, Support Vector Machine, BiGRU, and BiLSTM along with different word embedding methods including Word2vec, GloVe, fastText, and BERT. The experiments show that the proposed method achieves 85.58% and 41.31% accuracies in the FNID-FakeNewsNet and FNID-LIAR datasets, respectively, which are 10.44% and 13.19% respective absolute improvements.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Germany)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others

Notes

  1. www.politifact.com

  2. www.snopes.com

  3. www.factcheck.org

  4. https://ieee-dataport.org/open-access/fnid-fake-news-inference-dataset

  5. https://www.politifact.com/api/factchecks

  6. https://ieee-dataport.org/open-access/fnid-fake-news-inference-dataset

References

  1. Ajao O, Bhowmik D, Zargari S (2019) Sentiment aware fake news detection on online social networks. ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 2507–2511. IEEE

  2. Amirkhani H, AzariJafari M, Pourjafari Z, Faridan-Jahromi S, Kouhkan Z, Amirak A (2021) FarsTail: A Persian Natural Language Inference Dataset, ar**v:2009.08820

  3. Bakhteev O, Ogaltsov A, Ostroukhov P (2020) Fake News Spreader Detection using Neural Tweet Aggregation. CLEF 2020 Labs and Workshops, Notebook Papers, CEUR-WS.org

  4. Behzad B, Bheem B, Elizondo D, Marsh D, Martonosi S (2021) Prevalence and Propagation of Fake News, ar**v:2106.09586

  5. Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Ass Comput Linguist ics, 5:135–146. MIT Press

  6. Bowman SR, Angeli G, Potts C, Manning DD (2015) A large annotated corpus for learning natural language inference, ar**v:1508.05326

  7. Breiman L (2001) Random forests: Machine learning, vol 45. Springer, pp 5–32

  8. Chen Q, Zhu X, Ling Z, Wei S, Jiang H, Inkpen D (2016) Enhanced lstm for natural language inference. ar**v:1609.06038

  9. Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation, ar**v:1406.1078

  10. Conneau A, Kiela D, Schwenk H, Barrault L, Bordes A (2017) Supervised learning of universal sentence representations from natural language inference data (EMNLP)

  11. Della Vedova ML, Tacchini E, Moret S, Ballarin G, DiPierro M, de Alfaro L (2018) Automatic online fake news detection combining content and social signals. 2018 22nd Conference of Open Innovations Association (FRUCT), pp 272–279. IEEE

  12. Dey R, Salemt FM (2017) Gate-variants of gated recurrent unit (GRU) neural networks. 2017 IEEE 60th international midwest symposium on circuits and systems (MWSCAS). IEEE, pp 1597–1600

  13. Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, pp 4171–4186

  14. Dong X, Victor U, Qian L (2020) Two-path Deep Semi-supervised Learning for Timely Fake News Detection, ar**v:2002.00763

  15. Dreiseitl S, Ohno-Machado L (2002) Logistic regression and artificial neural network classification models: a methodology review. J Biomed Inf 35, 352–359. Elsevier

  16. Farajtabar M, Yang J, Ye X, Xu H, Trivedi R, Khalil E, Li S, Song L, Zha H (2017) Fake News Mitigation via Point Process Based Intervention: International conference on machine learning, pp 1097–1106, PMLR

  17. Golbeck J, Mauriello M, Auxier B, Bhanushali Keval H, Bonk C, Bouzaghrane MA, Buntain C, Chanduka R, Cheakalos P, Everett Jennine B et al (2018) Fake news vs satire: A dataset and analysis. Proceedings of the 10th ACM Conference on Web Science, pp 17–21

  18. Grave E, Bojanowski P, Gupta P, Joulin A, Mikolov T (2018) Learning word vectors for 157 languages. Proceedings of the International Conference on Language Resources and Evaluation (LREC), pp 2018

  19. Hakak S, Khan WZ, Bhattacharya S, Reddy GT, Choo K-R (2020) Propagation of fake news on social media: challenges and opportunities. International Conference on Computational Data and Social Networks, pp 345–353. Springer

  20. Hakak S, Alazab M, Khan S, Gadekallu TR, Maddikunta PKR, Khan WZ (2021) An ensemble machine learning approach through effective feature extraction to classify fake news. Fut Gener Comput Syst 117:47–58. Elsevier

  21. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780. MIT Press

  22. Holtzman A, Buys J, Forbes M, Bosselut A, Golub D, Choi Y (2018) Learning to Write with Cooperative Discriminators. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Assoc Comput Linguist:1638–1649

  23. Horne BD, Adali S (2017) This just in: fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. Eleventh International AAAI Conference on Web and Social Media

  24. Hu H, Richardson K, Xu L, Li L, Kuebler S, Moss LS (2020) OCNLI: Original Chinese Natural Language Inference, ar**v:2010.05444

  25. Jiang S, Chen X, Zhang L, Chen S, Liu H (2019) User-Characteristic Enhanced Model for Fake News Detection in Social Media. CCF International conference on natural language processing and chinese computing, pp 634–646. Springer

  26. Jiang L, Wang D, Cai Z, Yan X (2007) Survey of improving naive bayes for classification. International conference on advanced data mining and applications. Springer, pp 134–145

  27. Kaliyar RK, Goswami A, Narang P (2021) FakeBERT: Fake news detection in social media with a BERT-based deep learning approach. Multimed Tools Appl 80(8):11765–11788. Springer

  28. Karimi H, Roy P, Saba-Sadiya S, Tang J (2018) Multi-source multi-class fake news detection. Proc 27th Int Conf Comput Linguisti:1546–1557

  29. Keller JM, Gray MR, Givens JA (1985) A fuzzy k-nearest neighbor algorithm. IEEE Trans Syst Man Cybern 4:580–585.IEEE

  30. Khot T, Sabharwal A, Clark PS (2018) A textual entailment dataset from science question answering. Thirty-Second AAAI Conference on Artificial Intelligence

  31. Kumar P J S, Devi PR, Sai NR, Kumar S, Benarji T (2021) Battling Fake News A Survey on Mitigation Techniques and Identification. 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI). IEEE, pp 829–835

  32. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. Nature Publishing Group

  33. Li P, Yu H, Zhang W, Xu G, Sun X (2020) SA-NLI: A supervised attention based framework for natural language inference, Elsevier, Neurocomputing

  34. Liu X, He P, Chen W, Gao J (2019) Improving multi-task deep neural networks via knowledge distillation for natural language understanding, ar**v:1904.09482

  35. Li X, Lu P, Hu, Wang X, Lu L (2021) A novel self-learning semi-supervised deep learning network to detect fake news on social media. Multimedia Tools and Applications. Springer, pp 1–9

  36. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: A robustly optimized bert pretraining approach, . ar**v:1907.11692

  37. MacCartney B (2009) Natural language inference. Stanford University

  38. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf PSyste:3111–3119

  39. Moreno-Sandoval LG, Del Puertas EAP, Quimbaya AP, Alvarado-Valencia JA (2020) Assembly of Polarity: Emotion and user statistics for detection of fake profiles. CLEF 2020 Labs and Workshops, Notebook Papers, CEUR-WS.org

  40. Noureen J, Asif M (2017) Crowdsensing: socio-technical challenges and opportunities. IJACSA 8:363–369

    Google Scholar 

  41. Pamungkas EW, Basile V, Patti V (2019) Stance classification for rumour analysis in Twitter: Exploiting affective information and conversation structure, ar**v:1901.01911

  42. Parikh AP, Täckström O, Das D, Uszkoreit J (2016) A decomposable attention model for natural language inference. ar**v:1606.01933

  43. Pasunuru R, Bansal M (2017) Reinforced video captioning with entailment rewards. CoRR, ar**v:1708.02300

  44. Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543

  45. Pradhan A (2012) Support vector machine-a survey, vol 2

  46. Reddy H, Raj N, Gala M, Basava A (2020) Text-mining-based Fake News Detection Using Ensemble Methods. International journal of automation and computing, pp 1–12 Springer

  47. Ross QJ. (1986) Induction of decision trees. Mach Learn 1:81–106. Springer

  48. Sadeghi F, Bidgoly AJ, Amirkhani H (2020) FNID: Fake News Inference Dataset. IEEE Dataport. https://doi.org/10.21227/fbzd-sw81

  49. Shabani S, Sokhn M (2018) Hybrid machine-crowd approach for fake news detection. 2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC), pp 299–306. IEEE

  50. Shu K, Mahudeswaran D, Wang S, Lee D, Liu H (2018) FakeNewsNet: A data repository with news content, social context and dynamic information for studying fake news on social media, ar**v:1809.01286

  51. Shu K, Mahudeswaran D, Liu H (2019) Fakenewstracker: a tool for fake news collection, detection, and visualization. Comput Math Organ Theory 25:60–71. Springer

  52. Shu K, Zhou X, Wang S, Zafarani R, Liu H (2019) The role of user profiles for fake news detection. Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp 436–439

  53. Silverman C, Strapagiel L, Shaban H, Hall E, Singer-Vine J (2016) Hyperpartisan Facebook pages are publishing false and misleading information at an alarming rate. Buzzfeed News 20

  54. Talman A, Yli-Jyrä A, Tiedemann J (2018) Natural language inference with hierarchical bilstm max pooling architecture, ar**v:1808.08762

  55. Thorne J, Vlachos A, Cocarascu O, Christodoulopoulos C, Mittal A (2018) The fact extraction and VERification (FEVER) shared task proceedings of the first workshop on fact extraction and VERification (FEVER). Assoc Comput Linguist:1–9

  56. Trivedi H, Kwon H, Khot T, Sabharwal A, Balasubramanian N (2019) Repurposing Entailment for Multi-Hop Question Answering Tasks, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Assoc Comput Linguist:2948–2958

  57. Wang W, Yang L (2017) Liar pants on fire: A new benchmark dataset for fake news detection, ar**v:1705.00648

  58. Wang Y, Ma F, ** Z, Yuan Y, Xun G, Jha K, Su L, Gao J (2018) Eann: Event adversarial neural networks for multi-modal fake news detection

  59. Vlachos A, Riedel S (2014) Fact checking: Task definition and dataset construction. Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, pp 18–22

  60. Williams A, Nangia N, Bowman SR (2017) A broad-coverage challenge corpus for sentence understanding through inference, ar**v:1704.05426

  61. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV (2019) Xlnet: Generalized autoregressive pretraining for language understanding. Advances in Neural Inf Process Syst:5753–5763

  62. Zhou X, Zafarani R (2018) A survey of fake news: Fundamental theories, Detection Methods, and Opportunities, ar**v:1812.00315

  63. Zubiaga A, Aker A, Bontcheva K, Liakata M, Procter R (2018) Detection and resolution of rumours in social media: A survey, ACM Computing Surveys (CSUR), vol 51. ACM, New York, pp 1–36

  64. Zhao Z, Zhao J, Sano Y, Levy O, Takayasu H, Takayasu M, Li D, Wu J, Havlin S (2020) Fake news propagates differently from real news even at early stages of spreading. EPJ Data Sci 9:11–14. SpringerOpen

  65. Zhou X, Zafarani R (2019) Network-based Fake News Detection: A Pattern-driven Approach. ACM SIGKDD Explor Newslett 21, 2, 48–60. ACM, New York

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Amir Jalaly Bidgoly.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Sadeghi, F., Bidgoly, A.J. & Amirkhani, H. Fake news detection on social media using a natural language inference approach. Multimed Tools Appl 81, 33801–33821 (2022). https://doi.org/10.1007/s11042-022-12428-8

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-022-12428-8

Keywords

Navigation