A Natural Language Processing Analysis of Newspapers Coverage of Hong Kong Protests Between 1998 and 2020

  • Original Research
  • Published in Social Indicators Research

Abstract

This article investigates how the South China Morning Post (SCMP), the China Daily, and western-based newspapers cover protests in Hong Kong, in an effort to identify changes in journalistic practices between 1998 and 2020. It combines natural language processing (NLP) with a qualitative investigation of a novel corpus of newspaper articles spanning 22 years. It uses topic modeling to contrast the treatment of protests in Hong Kong diachronically and across news sources. Through comparison of lexical frequency and lexical usage, it reveals preferences and discrepancies in the use of protest-relevant keywords in the newspapers’ articles. Embedding neighborhood comparisons strengthen our understanding of how words are used differently by the SCMP, the China Daily, and western-based newspapers, and of how the context of protest-related keywords may differ across news sources over time. Finally, computational sentiment analysis measures the tone and connotations of articles. The article fills a gap in the literature on Hong Kong media, and its methodology broadens the application of NLP techniques to the social sciences.
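
To make the idea of an embedding neighborhood comparison concrete, the following is a minimal sketch and not the article's own implementation. It assumes each news source has its own word embeddings, represented here as a hypothetical dict from word to vector, and it scores a keyword by the share of its k nearest neighbors (by cosine similarity) that the two sources have in common. The function names, the toy vectors, and the overlap ratio are illustrative assumptions; only the default of 1000 neighbors comes from the article's notes.

```python
import numpy as np


def nearest_neighbors(word, embeddings, k):
    """Return the set of k words closest to `word` by cosine similarity.

    `embeddings` is a hypothetical dict mapping each word to a 1-D vector.
    """
    words = [w for w in embeddings if w != word]  # exclude the word itself
    matrix = np.array([embeddings[w] for w in words], dtype=float)
    target = np.asarray(embeddings[word], dtype=float)
    sims = (matrix @ target) / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(target)
    )
    top = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return {words[i] for i in top}


def neighborhood_overlap(word, emb_a, emb_b, k=1000):
    """Share of the k nearest neighbors of `word` common to both sources.

    Only words present in both vocabularies are considered. A low value
    suggests that the two sources use `word` in different contexts.
    """
    shared = sorted(set(emb_a) & set(emb_b))
    if word not in shared:
        raise KeyError(f"{word!r} is not in both vocabularies")
    k = min(k, len(shared) - 1)
    a = {w: emb_a[w] for w in shared}
    b = {w: emb_b[w] for w in shared}
    return len(nearest_neighbors(word, a, k) & nearest_neighbors(word, b, k)) / k


# Toy two-dimensional embeddings for two hypothetical news sources.
source_a = {
    "protest": [1.0, 0.0], "march": [0.9, 0.1], "rally": [0.8, 0.2],
    "riot": [0.0, 1.0], "violence": [0.1, 0.9],
}
source_b = {
    "protest": [1.0, 0.0], "riot": [0.9, 0.1], "violence": [0.8, 0.2],
    "march": [0.0, 1.0], "rally": [0.1, 0.9],
}

print(neighborhood_overlap("protest", source_a, source_b, k=2))  # 0.0
print(neighborhood_overlap("march", source_a, source_b, k=2))    # 0.5
```

In the toy data, "protest" sits next to "march" and "rally" in one source and next to "riot" and "violence" in the other, so its neighborhoods do not overlap at all, which is the kind of usage difference such a comparison is meant to surface.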

Notes

  1. The National Security (Legislative Provisions) Bill 2003 was a proposed bill which aimed to “amend the Crimes Ordinance, the Official Secrets Ordinance and the Societies Ordinance pursuant to the obligation imposed by Article 23 of the Basic Law of the Hong Kong Special Administrative Region to provide for related, incidental and consequential amendments.” (Hong Kong Government, press release, 24 February 2003).

  2. https://socialmovements.trinity.duke.edu/groups/scholarism

  3. http://english.www.gov.cn/archive/white_paper/2014/08/23/content_281474982986578.htm

  4. https://carnegieendowment.org/2016/10/17/implications-of-sixth-hong-kong-legislative-election-for-relations-with-beijing-pub-64872

  5. https://www.legco.gov.hk/yr18-19/english/bills/b201903291.pdf

  6. “The idea of one country, two systems originated in 1979, when China offered to allow Taiwan to keep its economic and social systems, government, and even military in return for acknowledging that it was part of the People’s Republic. Taiwan rejected that proposal. [Then-Premier] Deng Xiaoping next used the idea to resolve an emergent crisis over Hong Kong. The biggest section of Hong Kong, the New Territories, was scheduled to revert to mainland rule in 1997, and real-estate investors feared they would lose everything in the reversion. Those concerns led to a historic confrontation between Deng and [British Prime Minister] Margaret Thatcher in December 1984 and the 1985 Sino-British Joint Declaration which promised to preserve the judicial system, legislative and executive autonomy, and all the key freedoms to which Hong Kong people had become accustomed for 50 years.” In Overholt, W. (2019). Hong Kong: The Rise and Fall of “One Country, Two Systems”, Boston: Harvard Kennedy School, p. 1.

  7. https://www.chinadaily.com.cn/cd/introduction.html, accessed on 03/25/2023.

  8. Following the recommendation of Wendlandt et al. (2018) and Gonen et al. (2020), we use 1000 nearest neighbors.

  9. Research on news values implies answering a fundamental question, “What is news?”, and journalism and communication-related disciplines have put significant effort into trying to answer it. See Galtung and Ruge (1965), Eilders (2006), and Welbers et al. (2016).

  10. https://www.britannica.com/event/Tiananmen-Square-incident

  11. carnegie-mec.org/2016/10/17/implications-of-sixth-hong-kong-legislative-election-for-relations-with-beijing-pub-64872

  12. TADA 2021, 11th Annual Conference in New Directions in Analyzing Text as Data. Panel on Longitudinal Studies of Language, with Philip Resnik as discussant. https://tada2021.org

  13. See pages 2–4.

  14. The China Daily, August 16, 2019.

  15. The South China Morning Post, June 7, 2019.

  16. The China Daily, September 23, 2019.

  17. The South China Morning Post, November 28, 2019.

  18. https://www.info.gov.hk/gia/general/202001/08/P2020010800638p.htm

  19. For the purpose of this research, the term keyword is used in the information retrieval rather than the corpus linguistic sense, meaning a term that is statistically characteristic in a text. See also Douglas Biber and Randi Reppen (Eds.). (2015). The Cambridge Handbook of English Corpus Linguistics. Cambridge: Cambridge University Press, pp. 90–105.

  20. https://www.npr.org/2019/07/01/737761290/looking-back-22-years-to-the-handover-of-hong-kong-from-britain-to-china

References

  • Aarøe, L., & Petersen, M. (2018). Cognitive biases and communication strength in social networks: The case of episodic frames. British Journal of Political Science, 50, 1–21.

  • Bhatia, A. (2015). Construction of discursive illusions in the ‘umbrella movement.’ Discourse & Society, 26(4), 407–427.

  • Baden, C., Pipal, C., Schoonvelde, M., & van der Velden, M. A. C. G. (2021). Three gaps in computational text analysis methods for social sciences: A research agenda. Communication Methods and Measures, 16, 1–18.

  • Barrault, L., Bojar, O., Costa-jussà, M. R., Federmann, C., Fishel, M., Graham, Y., Haddow, B., Huck, M., Koehn, P., Malmasi, S., Monz, C., Müller, M., Pal, S., Post, M., & Zampieri, M. (2019). Findings of the 2019 Conference on Machine Translation (WMT19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) (pp. 1–61). Florence, Italy: Association for Computational Linguistics.

  • Bednarek, M., Caple, H., & Huan, C. (2021). Computer-based analysis of news values: A case study on national day reporting. Journalism Studies, 22(6).

  • Bender, E. M. (2009). Linguistically naïve != language independent: Why NLP needs linguistic typology. In Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous? (pp. 26–32). Association for Computational Linguistics.

  • Bergsma, S., Post, M., & Yarowsky, D. (2012). Stylometric analysis of scientific articles. HLT-NAACL, 327–337.

  • Oliver, M. B., Raney, A., & Bryant, J. (2019). Media effects: Advances in theory and research. Routledge.

  • Biber, D., & Reppen, R. (Eds.). (2015). The Cambridge handbook of English corpus linguistics. Cambridge: Cambridge University Press.

  • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

  • Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning (ICML '06) (pp. 113–120). Association for Computing Machinery.

  • Boydstun, A. E., Gross, J. H., Resnik, P., & Smith, N. A. (2013). Identifying media frames and frame dynamics within and across policy issues. In New Directions in Analyzing Text as Data Workshop.

  • Boyle, M. P., McLeod, D. M., & Armstrong, C. L. (2012). Adherence to the protest paradigm: The influence of protest goals and tactics on news coverage in U.S. and international newspapers. International Journal of Press/Politics, 17(2), 127–144.

  • Brady, H. E. (2019). The challenge of big data and data science. Annual Review of Political Science, 22(1), 297–323.

  • Burnard, P. (1991). A method of analysing interview transcripts in qualitative research. Nurse Education Today, 11(6), 461–466.

  • Büyüköz, B., Hürriyetoğlu, A., & Özgür, A. (2020). Analyzing ELMo and DistilBERT on socio-political news classification. In Proceedings of the Workshop on Automated Extraction of Socio-political Events from News (pp. 9–18). European Language Resources Association (ELRA).

  • Casalino, G., Del Buono, N., & Mencar, C. (2016). Nonnegative matrix factorizations for intelligent data analysis. In Ganesh R. Naik (Ed.), Non-negative matrix factorization techniques: Advances in theory and applications (pp. 49–74). Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-662-48331-2_2

  • Catanzaro, M. (1988). Using qualitative analytical techniques. In N. Woods & M. Catanzaro (Eds.), Nursing research: Theory and practice (pp. 437–456). St Louis: Mosby Incorporated.

  • Chan, J. M., & Lee, C.-C. (1991). Mass media and political transition. Guilford Press.

  • Chan, J. M., & Lee, F. L. F. (2007). Media and large-scale demonstrations: The pro-democracy movement in post-handover Hong Kong. Asian Journal of Communication, 17(2), 215–228.

  • Chan, J. M., Lee, P. S. N., & Lee, C.-C. (1996). Hong Kong journalists in transition. Hong Kong: Hong Kong Institute of Asia Pacific Studies.

  • Chan, C.-H., Zeng, J., Wessler, H., Jungblut, M., Welbers, K., Bajjalieh, J. W., van Atteveldt, W., & Althaus, S. L. (2020). Reproducible extraction of cross-lingual topics (rectr). Communication Methods and Measures, 14(4), 285–305.

  • Chan, C. K. (2015). Contested news values and media performance during the umbrella movement. Chinese Journal of Communication, 8(4), 420–428.

  • Chan, J. M., & Lee, F. L. F. (2011). Media, social mobilization and mass protests in post-colonial Hong Kong: The power of a critical event (Media, culture and social change in Asia, Vol. 22). Routledge.

  • Cheng, J. Y. (2014). The emergence of radical politics in Hong Kong: Causes and impact. The China Review, 14(1), 199–232.

  • Cheung, A. S. Y. (2003). Hong Kong press coverage of China–Taiwan cross-straits tension. In Hong Kong in transition (pp. 219–234). Routledge.

  • Yeung, C. (2000). Hong Kong: A handover of freedom? In R. Rich & L. Williams (Eds.), Losing control: Freedom of the press in Asia (pp. 58–73). Canberra: Asia Pacific Press.

  • Baden, C., & Lecheler, S. (2012). Fleeting, fading, or far-reaching? A knowledge-based model of the persistence of framing effects. Communication Theory, 22(4), 359–382.

  • Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788–791.

  • Dehler-Holland, J., Schumacher, K., & Fichtner, W. (2021). Topic modeling uncovers shifts in media framing of the German renewable energy act. Patterns, 2(1), 100169.

  • McQuail, D., & Windahl, S. (1993). Communication models for the study of mass communications. Routledge.

  • Chong, D., & Druckman, J. N. (2007). Framing theory. Annual Review of Political Science, 10(1), 103–126.

  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Minneapolis, Minnesota: Association for Computational Linguistics.

  • Djankov, S., McLiesh, C., Nenova, T., & Shleifer, A. (2003). Who owns the media? Journal of Law and Economics, 46(2), 341–381.

  • Dore, G. M. D. (2022). Business as usual? The role of business elites in Hong Kong’s evolving political identity. Translocal Chinese: East Asian Perspectives, 16, 56–78.

  • Du, Y., Zhu, L., & Yang, F. (2018). A movement of varying faces: How “occupy central” was framed in the news in Hong Kong, Taiwan, mainland China, the UK, and the US. International Journal of Communication, 12, 12.

  • Eilders, C. (2006). News factors and news decisions: Theoretical and methodological advances in Germany. Communications, 31, 5–24.

  • Entman, R. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51–58.

  • Entman, R. M. (2007). Framing bias: Media in the distribution of power. Journal of Communication, 57(1), 163–173.

  • Felix, W. (2018). Cultural co-orientation revisited: The case of the South China Morning Post. Global Media and China, 3(1), 32–50.

  • Fernández, I., Igartua, J.-J., Moral, F., Palacios, E., Acosta, T., & Muñoz, D. (2013). Language use depending on news frame and immigrant origin. International Journal of Psychology, 48(5), 772–784. https://doi.org/10.1080/00207594.2012.723803

  • Field, A., Kliger, D., Wintner, S., Pan, J., Jurafsky, D., & Tsvetkov, Y. (2018). Framing and agenda-setting in Russian news: A computational analysis of intricate political strategies. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 3570–3580). Brussels, Belgium: Association for Computational Linguistics.

  • Lee, F. L. (2007). Strategic interaction, cultural co-orientation, and press freedom in Hong Kong. Asian Journal of Communication, 17(2), 134–147.

  • Lee, F. L. (2014). Triggering the protest paradigm: Examining factors affecting news coverage of protests. International Journal of Communication, 8, 22.

  • Lee, F. L. F., & Lin, A. M. Y. (2006). Newspaper editorial strategies and politics of self-censorship in Hong Kong. Discourse & Society, 17(3), 311–358.

  • Lee, F. L. F., Chan, J. M., & So, C. Y. K. (2005). Evaluation of media and understanding of politics: The role of education among Hong Kong citizens. Asian Journal of Communication, 15(1), 37–56.

  • Fung, A. Y. H., & Lee, C.-C. (1994). Hong Kong’s changing media ownership: Uncertainty and dilemma. Gazette, 53, 127–133.

  • Galtung, J., & Ruge, M. H. (1965). The structure of foreign news. Journal of Peace Research, 2(1), 64–91.

  • Gans, H. J. (1979). Deciding what’s news: Story suitability. Society, 16, 65–77.

  • Gans, H. J. (2004). Democracy and the news. Oxford University Press.

  • Gehlbach, S., & Sonin, K. (2014). Government control of the media. Journal of Public Economics, 118(3), 163–171.

  • Glasser, T. L. (1992). Objectivity precludes responsibility. In E. D. Cohen (Ed.), Philosophical issues in journalism (pp. 166–175). Oxford University Press.

  • Gonen, H., Jawahar, G., Seddah, D., & Goldberg, Y. (2021). Simple, interpretable and stable method for detecting words with usage change across corpora. arXiv preprint. https://doi.org/10.48550/arXiv.2112.14330

  • Graneheim, U. H., & Lundman, B. (2004). Qualitative content analysis in nursing research: Concepts, procedures and measures to achieve trustworthiness. Nurse Education Today, 24(2), 105–112.

  • Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297.

  • Hallin, D., & Mancini, P. (2004). Comparing media systems: Three models of media and politics. Cambridge University Press.

  • Hamilton, W. L., Leskovec, J., & Jurafsky, D. (2016). Diachronic word embeddings reveal statistical laws of semantic change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1489–1501). Berlin, Germany: Association for Computational Linguistics.

  • Hartig, F. (2019). Rethinking China’s global “propaganda” blitz. Global Media and Communication, 16(1), 3–18.

  • Hartig, F. (2017). China Daily - Beijing's global voice? In D. K. Thussu, H. de Burgh, & A. Shi (Eds.), China's media go global. Routledge.

  • Hilbert, M., Barnett, G., Blumenstock, J., Contractor, N., Diesner, J., Frey, S., González-Bailón, S., Lamberson, P. J., Pan, J., Peng, T.-Q., Shen, C., Smaldino, P. E., van Atteveldt, W., Waldherr, A., Zhang, J., Zhu, J. H., et al. (2019). Computational communication science: A methodological catalyzer for a maturing discipline. International Journal of Communication, 13, 1–23.

  • Hofmann, T. (1999). Probabilistic latent semantic analysis. In Proceedings of the XV Conference on Uncertainty in Artificial Intelligence (UAI1999).

  • Hoyle, A. M., Goel, P., Peskov, D., Hian-Cheong, A., Boyd-Graber, J. L., & Resnik, P. (2021). Is automated topic model evaluation broken? The incoherence of coherence. ArXiv, abs/2107.02173.

  • Hürriyetoğlu, A., Yörük, E., Yüret, D., Yoltar, G. C. B., Durus, F., Mutlu, O., & Akdemir, A. (2019). Overview of CLEF 2019 lab protest news: Extracting protests from news in a cross-context setting. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 425–432). Springer.

  • Ibrahim, Z., & Lam, J. (2020). Rebel City: Hong Kong’s year of water and fire. World Scientific. https://doi.org/10.1142/11777

  • Jacobi, C., van Atteveldt, W., & Welbers, K. (2016). Quantitative analysis of large amounts of journalistic texts using topic modelling. Digital Journalism, 4(1), 89–106.

  • Earl, J., Martin, A., McCarthy, J. D., & Soule, S. A. (2004). The use of newspaper data in the study of collective action. Annual Review of Sociology, 30(1), 65–80.

  • Agnone, J. (2007). Amplifying public opinion: The policy impact of the U.S. environmental movement. Social Forces, 85(4), 1593–1620.

  • King, B. G. (2011). The tactical disruptiveness of social movements: Sources of market and mediated disruption in corporate boycotts. Social Problems, 58(4), 491–517.

  • Lam, W.-M. (2004). Understanding the political culture of Hong Kong: The paradox of activism and depoliticization. Armonk and London.

  • Lau, T.-Y., & To, Y.-M. (2002). Walking a tight rope: Hong Kong’s media facing political and economic challenges since sovereignty transfer. In M. K. Chan & A. Y. So (Eds.), Crisis and transformation in China’s Hong Kong (p. 322). Hong Kong University Press.

  • Lau, J. H., Newman, D., & Baldwin, T. (2014). Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (pp. 530–539). Gothenburg, Sweden: Association for Computational Linguistics.

  • Wendlandt, L., Kummerfeld, J. K., & Mihalcea, R. (2018). Factors influencing the surprising instability of word embeddings. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 2092–2102). New Orleans, Louisiana: Association for Computational Linguistics.

  • Lawrence, S. V., & Martin, M. F. (2020). China’s National Security Law for Hong Kong: Issues for Congress. Congressional Research Service. https://crsreports.congress.gov. R46473.

  • Lee, F. L. F. (2006). Poll reporting and journalistic paradigm: A study of popularity poll coverage in newspaper. Asian Journal of Communication, 16(2), 132–151.

  • Lee, C.-C. (2000). The paradox of political economy: Media structure, press freedom, and regime change in Hong Kong. In Power, money, and media (pp. 288–336).

  • Li, W., & McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. In Proceedings of the 23rd International Conference on Machine Learning (ICML '06) (pp. 577–584). New York, NY, USA: Association for Computing Machinery.

  • Lucy, L., Demszky, D., Bromley, P., & Jurafsky, D. (2020). Content analysis of textbooks via natural language processing: Findings on gender, race, and ethnicity in Texas US history textbooks. AERA Open, 6(3), 233.

  • Stede, M., & Patz, R. (2021). The climate change debate and natural language processing. In Proceedings of the 1st Workshop on NLP for Positive Impact (pp. 8–18). Bangkok, Thailand (online): Association for Computational Linguistics.

  • McCombs, M. (2002). The agenda-setting role of the mass media in the shaping of public opinion. In Mass Media Economics 2002 Conference, London School of Economics.

  • McCallum, A. K. (2002). MALLET: A machine learning for language toolkit. http://mallet.cs.umass.edu (last checked 26 June 2012).

  • McCarthy, A. D., Dore, G. M. D., & Scharf, J. A. (2021). A mixed-methods analysis of western and Hong Kong–based reporting on the 2019–2020 protests. In Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (pp. 178–188). Punta Cana, Dominican Republic (online): Association for Computational Linguistics.

  • McCluskey, M., Stein, S., Boyle, M., & McLeod, D. (2009). Community structure and social protest: Influences on newspaper coverage. Mass Communication and Society, 12, 353–371.

  • McLeod, D. M., & Detenber, B. H. (1999). Framing effects of television news coverage of social protest. Journal of Communication, 49, 3–23.

  • Mimno, D., Wallach, H., Talley, E., Leenders, M., & McCallum, A. (2011). Optimizing semantic coherence in topic models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 262–272). Edinburgh, Scotland, UK: Association for Computational Linguistics.

  • Ng, M. K. (2020). The making of ‘violent’ Hong Kong: A centennial dream? A fight for democracy? A challenge to humanity? Planning Theory & Practice, 21(3), 483–494.

  • Ngok, K. (2007). Chinese education policy in the context of decentralization and marketization: Evolution and implications. Asia Pacific Education Review, 8, 142–157.

  • Nguyen, V.-A., Boyd-Graber, J., Resnik, P., & Miler, K. (2015). Tea party in the house: A hierarchical ideal point topic model and its application to Republican legislators in the 112th Congress. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1438–1448).

  • Lee, P. S. N., & Chu, L. (1998). Inherent dependence on power: The Hong Kong press in political transition. Media, Culture & Society, 20, 59–77.

  • DiMaggio, P., Nag, M., & Blei, D. (2013). Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of US government arts funding. Poetics, 41(6), 570–606.

  • Pires, T., Schlinger, E., & Garrette, D. (2019). How multilingual is multilingual BERT? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 4996–5001). Florence, Italy: Association for Computational Linguistics.

  • Price, V., Tewksbury, D., & Powers, E. (1997). Switching trains of thought: The impact of news frames on readers’ cognitive responses. Communication Research, 24(5), 481–506.

  • Quinn, K. M., Monroe, B. L., Colaresi, M., Crespin, M. H., & Radev, D. R. (2010). How to analyze political attention with minimal assumptions and costs. American Journal of Political Science, 54(1), 209–228.

  • Reber, U. (2019). Overcoming language barriers: Assessing the potential of machine translation and topic modeling for the comparative analysis of multilingual text corpora. Communication Methods and Measures, 13(2), 102–125.

  • Rudolph, M., & Blei, D. (2018). Dynamic embeddings for language evolution. In WWW 2018: The 2018 Web Conference, April 23–27, 2018. ACM.

  • Kiritchenko, S., & Mohammad, S. M. (2018). Examining gender and race bias in two hundred sentiment analysis systems. In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics (pp. 43–53). New Orleans, Louisiana: Association for Computational Linguistics.

  • Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613–620.

  • Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. CoRR, abs/1910.01108.

  • Schulz, A., Wettstein, M., & Wirth, W. (2014). All hostile media: Group membership, in-group identification and consonance of news reporting as moderator of the hostile media effect. Paper presented at the annual meeting of the International Communication Association (ICA).

  • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.

  • Shen, F. (2004). Chronic accessibility and individual cognitions: Examining the effects of message frames in political advertisements. Journal of Communication, 54(1), 123–137.

  • Shoemaker, P., & Reese, S. (1996). Mediating the message: Theories of influence on mass media content (2nd ed.). Longman.

  • Sinclair, K., & Ng, N. K.-C. (1997). Asia’s finest marches on: Policing Hong Kong from 1841 into the 21st century. Kevin Sinclair Associates Ltd.

  • Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., & Potts, C. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 1631–1642). Seattle, Washington, USA: Association for Computational Linguistics.

  • Solomon, W. S. (1996). Covering dissent: The media and the anti-Vietnam war movement. Journal of American History, 82(4), 1651–1651.

  • Sparks, C. (2015). Business as usual: The UK national daily press and the occupy central movement. Chinese Journal of Communication, 8(4), 429–446.

  • Stephens, K. K., & Rains, S. A. (2011). Information and communication technology sequences and message repetition in interpersonal interaction. Communication Research, 38(1), 101–122.

  • Thompson, S. (2004). The penny press: The origins of the modern news media, 1833–1861. Northport, AL: Vision Press.

  • Purbrick, M. (2019). A report of the 2019 Hong Kong protests. Asian Affairs, 50(4), 465–487.

  • Tsfati, Y., & Walter, N. (2019). The world of news and politics. In Media effects: Advances in theory and research.

  • Ylä-Anttila, T., Eranti, V., & Kukkonen, A. (2022). Topic modeling for frame analysis: A study of media debates on climate change in India and USA. Global Media and Communication, 18(1), 91–112.

  • Valkenburg, P. M., Krcmar, M., Peeters, A. L., & Marseille, N. M. (1999). Developing a scale to assess three styles of television mediation: “Instructive mediation”, “restrictive mediation”, and “social coviewing.” Journal of Broadcasting & Electronic Media, 43, 52–66.

  • Wang, K. J. Y. (2017). Mobilizing resources to the square: Hong Kong’s anti-moral and national education movement as precursor to the umbrella movement. International Journal of Cultural Studies, 20(2), 127–145.

  • Wijaya, D. T., & Yeniterzi, R. (2011). Understanding semantic change of words over centuries. In Proceedings of the 2011 International Workshop on DETecting and Exploiting Cultural diversiTy on the Social Web (DETECT ’11) (pp. 35–40). Association for Computing Machinery.

  • Wong, H. T., & Liu, S.-D. (2018). Cultural activism during the Hong Kong umbrella movement. Journal of Creative Communications, 13(2), 157–165.

  • van Atteveldt, W., & Peng, T.-Q. (2018). When communication meets computation: Opportunities, challenges, and pitfalls in computational communication science. Communication Methods and Measures, 12(2–3), 81–92.

  • Wueest, B., Rothenhäusler, K., & Hutter, S. (2013). Using computational linguistics to enhance protest event analysis. ENCoRe workshop ‘Tools and techniques for conflict event data collection’.

  • Xu, W., Liu, X., & Gong, Y. (2003). Document clustering based on non-negative matrix factorization. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 267–273).

  • Yu, M. (2015). Framing occupy central: A content analysis of Hong Kong, American and British newspaper coverage. University of South Florida, Graduate Theses and Dissertations.

  • Weaver, D. A., & Scacco, J. M. (2013). Revisiting the protest paradigm: The Tea Party as filtered through prime-time cable news. The International Journal of Press/Politics, 18, 61–84. https://doi.org/10.1177/1940161212462872

  • Welbers, K., van Atteveldt, W., Kleinnijenhuis, J., Ruigrok, N., & Schaper, J. (2016). News selection criteria in the digital age: Professional norms versus online audience metrics. Journalism, 17(8), 1037–1053.

Author information

Corresponding author

Correspondence to Giovanna Maria Dora Dore.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dore, G.M.D. A Natural Language Processing Analysis of Newspapers Coverage of Hong Kong Protests Between 1998 and 2020. Soc Indic Res 169, 143–166 (2023). https://doi.org/10.1007/s11205-023-03147-0

  • DOI: https://doi.org/10.1007/s11205-023-03147-0
