Regional Sentiment Bias in Social Media Reporting During Crises

Information Systems Frontiers

Abstract

Crisis events such as terrorist attacks are extensively commented upon on social media platforms such as Twitter. For this reason, social media content posted during emergency events is increasingly being used by news media and in social studies to characterize the public’s reaction to those events. This is typically achieved either by having journalists select ‘representative’ tweets to show, or by using a classifier trained on prior human-annotated tweets to provide a sentiment/emotion breakdown for the event. However, social media users, journalists and annotators do not exist in isolation; they each have their own context and world view. In this paper, we ask the question, ‘to what extent do local and international biases affect the sentiments expressed on social media and the way that social media content is interpreted by annotators?’ In particular, we perform a multi-lingual study spanning two events and three languages. We show that there are marked disparities between the emotions expressed by users in different languages for an event. For instance, during the 2016 Paris attack, there were 16% more negative comments written in English than in French, even though the event originated on French soil. Furthermore, we observed that sentiment biases also affect annotators from those regions, which can negatively impact the accuracy of social media labelling efforts. This highlights the need to consider the sentiment biases of users in different countries, both when analysing events through the lens of social media and when using social media as a data source, including for training automatic classification models.

Notes

  1. https://www.ibm.com/watson/alchemy-api.html

  2. https://www.brandwatch.com/

  3. https://www.crimsonhexagon.com/

  4. http://alt.qcri.org/semeval2017/task4/

  5. Rather than the user’s self-defined language, which is less accurate.

  6. Mean/expectation was set to 25 and the standard deviation was set to 20.

  7. http://www.crowdflower.com

  8. Minor changes were made to the instructions for labelling the Berlin dataset to clarify a small number of situations that arose when labelling the Paris dataset.

  9. Given that the attack in question took place in Paris, we make the presumption that the reaction amongst French-speaking Twitter users is representative of reactions from that region. We recognize that using language as a geographic indicator does not strictly hold. However, pinpointing locations in Twitter is problematic (Magdy et al. 2016) and at best partial.

  10. Manual Translation: The Muslim community condemn the attacks in Paris.

  11. http://www.scikit-learn.org

References

  • Agarwal, A, Xie, B, Vovsha, I, Rambow, O, & Passonneau, R. (2011). Sentiment analysis of twitter data. In Proceedings of the workshop on languages in social media, association for computational linguistics, Stroudsburg, PA, USA, LSM ’11 (pp. 30–38). http://dl.acm.org/citation.cfm?id=2021109.2021114.

  • Balahur, A, & Turchi, M. (2012). Multilingual sentiment analysis using machine translation? In Proceedings of the 3rd workshop in computational approaches to subjectivity and sentiment analysis, association for computational linguistics, Stroudsburg, PA, USA, WASSA ’12 (pp. 52–60). http://dl.acm.org/citation.cfm?id=2392963.2392976.

  • Bontcheva, K, Derczynski, L, Funk, A, Greenwood, MA, Maynard, D, & Aswani, N. (2013). Twitie: an open-source information extraction pipeline for microblog text. In G. Angelova, K. Bontcheva, & R. Mitkov (Eds.), Recent advances in natural language processing, RANLP 2013, 9–11 September, 2013 (pp. 83–90). Hissar: Organising Committee/ACL, http://aclweb.org/anthology/R/R13/R13-1011.pdf.

  • Darwish, K, & Magdy, W. (2015). Attitudes towards refugees in light of the paris attacks. CoRR arXiv:1512.04310.

  • De Choudhury, M, Diakopoulos, N, & Naaman, M. (2012). Unfolding the event landscape on twitter: classification and exploration of user categories. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work (pp. 241–244). ACM.

  • Dong, L, Wei, F, Tan, C, Tang, D, Zhou, M, & Xu, K. (2014). Adaptive recursive neural network for target-dependent twitter sentiment classification. In Proceedings of the 52nd annual meeting of the association for computational linguistics, ACL 2014, June 22–27, 2014, Baltimore, MD, USA (Vol. 2: Short Papers, pp. 49–54). http://aclweb.org/anthology/P/P14/P14-2009.pdf.

  • Dredze, M, Paul, MJ, Bergsma, S, & Tran, H. (2013). Carmen: a twitter geolocation system with applications to public health. In AAAI workshop on expanding the boundaries of health informatics using AI (HIAI) (pp. 20–24).

  • Hermida, A. (2010). Twittering the news: the emergence of ambient journalism. Journalism Practice, 4(3), 297–308.

  • Hsueh, PY, Melville, P, & Sindhwani, V. (2009). Data quality from crowdsourcing: a study of annotation selection criteria. In Proceedings of the NAACL HLT 2009 workshop on active learning for natural language processing (pp. 27–35). Association for Computational Linguistics.

  • Jiang, L, Yu, M, Zhou, M, Liu, X, & Zhao, T. (2011). Target-dependent twitter sentiment classification. In Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies (Vol. 1, pp. 151–160). Association for Computational Linguistics.

  • Kiritchenko, S, & Mohammad, SM. (2016). Capturing reliable fine-grained sentiment associations by crowdsourcing and best-worst scaling. In American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT) (pp. 811–817).

  • Kraaij, W, & Spitters, M. (2003). Language models for topic tracking. In Language modeling for information retrieval (pp. 95–123). Berlin: Springer.

  • Kwak, H, Lee, C, Park, H, & Moon, S. (2010). What is twitter, a social network or a news media? In Proceedings of the 19th international conference on world wide web (pp. 591–600). ACM.

  • Magdy, W, Darwish, K, & Abokhodair, N. (2015). Quantifying public response towards islam on twitter after paris attacks. CoRR arXiv:1512.04570.

  • Magdy, W, Darwish, K, Abokhodair, N, Rahimi, A, & Baldwin, T. (2016). #isisisnotislam or #deportallmuslims?: predicting unspoken views. In Proceedings of the 8th ACM conference on web science. WebSci ’16 (pp. 95–106). New York: ACM. https://doi.org/10.1145/2908131.2908150.

  • Marcheggiani, D, Täckström, O, Esuli, A, & Sebastiani, F. (2014). Hierarchical multi-label conditional random fields for aspect-oriented opinion mining. In Advances in information retrieval (pp. 273–285). Berlin: Springer.

  • Maynard, D, & Bontcheva, K. (2016). Challenges of evaluating sentiment analysis tools on social media. In N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the tenth international conference on language resources and evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016. European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2016/summaries/188.html.

  • Maynard, D, & Greenwood, MA. (2014). Who cares about sarcastic tweets? investigating the impact of sarcasm on sentiment analysis. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the ninth international conference on language resources and evaluation, LREC 2014, Reykjavik, Iceland, May 26–31, 2014 (pp. 4238–4243). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2014/summaries/67.html.

  • Maynard, D, & Hare, JS. (2015). Entity-based opinion mining from text and multimedia. In M.M. Gaber, M. Cocea, N. Wiratunga, & A. Göker (Eds.), Advances in social media analysis, studies in computational intelligence (Vol. 602, pp. 65–86). Berlin: Springer. https://doi.org/10.1007/978-3-319-18458-6_4 (to appear in print).

  • McCreadie, R, Macdonald, C, & Ounis, I. (2013). Identifying top news using crowdsourcing. Information Retrieval, 16(2), 179–209.

  • Moilanen, K, & Pulman, S. (2009). Multi-entity sentiment scoring. In RANLP (pp. 258–263).

  • Mozetič, I, Grčar, M, & Smailović, J. (2016). Multilingual twitter sentiment classification: the role of human annotators. PLoS ONE, 11(5), 1–26. https://doi.org/10.1371/journal.pone.0155036.

  • Nagy, A, & Stamberger, J. (2012). Crowd sentiment detection during disasters and crises. In Proceedings of the 9th international ISCRAM conference (pp. 1–9).

  • Narr, S, De Luca, EW, & Albayrak, S. (2011). Extracting semantic annotations from twitter. In Proceedings of the fourth workshop on exploiting semantic annotations in information retrieval, ESAIR ’11 (pp. 15–16). New York: ACM. https://doi.org/10.1145/2064713.2064723.

  • Ounis, I, Amati, G, Plachouras, V, He, B, Macdonald, C, & Lioma, C. (2006). Terrier: a high performance and scalable information retrieval platform. In Proceedings of the OSIR workshop (pp. 18–25).

  • Ounis, I, Macdonald, C, & Soboroff, I. (2008). Overview of the trec-2008 blog track. Tech. rep. Glasgow University (United Kingdom).

  • Purver, M, & Battersby, S. (2012). Experimenting with distant supervision for emotion classification. In Proceedings of the 13th conference of the European chapter of the association for computational linguistics (pp. 482–491). Association for Computational Linguistics.

  • Schulz, A, Thanh, T, Paulheim, H, & Schweizer, I. (2013). A fine-grained sentiment analysis approach for detecting crisis related microposts. In Conference on Information Systems for Crisis Response and Management (ISCRAM).

  • Stieglitz, S, & Dang-Xuan, L. (2013). Emotions and information diffusion in social media: sentiment of microblogs and sharing behavior. Journal of Management Information Systems, 29(4), 217–248.

  • Tang, D, Wei, F, Qin, B, Liu, T, & Zhou, M. (2014). Coooolll: a deep learning system for twitter sentiment classification. In Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014) (pp. 208–212). Dublin: Association for Computational Linguistics and Dublin City University. http://www.aclweb.org/anthology/S14-2033.

  • Tang, D, Qin, B, Feng, X, & Liu, T. (2016a). Effective lstms for target-dependent sentiment classification. In N. Calzolari, Y. Matsumoto, & R. Prasad (Eds.), COLING 2016, 26th international conference on computational linguistics, proceedings of the conference: technical papers, December 11–16, 2016 (pp. 3298–3307). Osaka: ACL. http://aclweb.org/anthology/C/C16/C16-1311.pdf.

  • Tang, D, Wei, F, Qin, B, Yang, N, Liu, T, & Zhou, M. (2016b). Sentiment embeddings with applications to sentiment analysis. IEEE Transactions on Knowledge and Data Engineering, 28(2), 496–509. https://doi.org/10.1109/TKDE.2015.2489653.

  • Thelwall, M, Buckley, K, & Paltoglou, G. (2011). Sentiment in twitter events. Journal of the American Society for Information Science and Technology, 62(2), 406–418. https://doi.org/10.1002/asi.21462.

  • Tromp, E. (2012). Multilingual sentiment analysis on social media: an extensive study on multilingual sentiment analysis performed on three different social media. LAP Lambert Academic Publishing. http://books.google.nl/books?id=ut4yLgEACAAJ.

  • Vargas, S, McCreadie, R, Macdonald, C, & Ounis, I. (2016). Comparing overall and targeted sentiments in social media during crises. In Tenth international AAAI conference on web and social media.

  • Verma, S, Vieweg, S, Corvey, WJ, Palen, L, Martin, JH, Palmer, M, Schram, A, & Anderson, KM. (2011). Natural language processing to the rescue? Extracting “situational awareness” tweets during mass emergency. In International AAAI Conference on Web and Social Media (ICWSM).

  • Wang, H, Can, D, Kazemzadeh, A, Bar, F, & Narayanan, S. (2012). A system for real-time twitter sentiment analysis of 2012 us presidential election cycle. In Proceedings of the ACL 2012 system demonstrations (pp. 115–120). Association for Computational Linguistics.

Acknowledgements

This work was supported by the EC co-funded SUPER (FP7-606853) project.

Author information

Corresponding author

Correspondence to Richard McCreadie.

Additional information

Author Note: This article is an expansion of a previously published paper entitled ‘Analyzing Disproportionate Reaction via Comparative Multilingual Targeted Sentiment in Twitter’ at the international conference on Advances in Social Network Analysis and Mining (ASONAM 2017).

About this article

Cite this article

Smith, K.S., McCreadie, R., Macdonald, C. et al. Regional Sentiment Bias in Social Media Reporting During Crises. Inf Syst Front 20, 1013–1025 (2018). https://doi.org/10.1007/s10796-018-9827-x

  • DOI: https://doi.org/10.1007/s10796-018-9827-x
