Abstract
To help guide a just transition to a sustainable society and bring local communities on board, researchers can identify events of public interest using data from community engagement activities and social media content. However, novel analytic methods are required to process and analyse data in unstructured formats (e.g., transcripts, text and images) and to extract information useful for decision-making. This paper proposes an analytics pipeline that combines latent Dirichlet allocation and hidden Markov models to automatically detect multiple latent changepoints in topics over time, without prior knowledge of their occurrence. Analysing social media content (i.e., tweets) related to Glasgow, we identified events that captured the interest of social media users, demonstrating the potential of our method to inform timely and relevant policy-making.
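As a rough illustration of the changepoint step only, the sketch below decodes a left-to-right hidden Markov model with the Viterbi algorithm on a toy series of topic weights. It is not the authors' implementation. The function name `viterbi_changepoints`, the toy data, and all parameter values are hypothetical. It also assumes that the per-period topic weights have already been obtained from a latent Dirichlet allocation fit and that the number of regimes and their emission means are known, whereas the paper's method detects changepoints without such prior knowledge.

```python
import math


def viterbi_changepoints(series, means, sigma=0.05, stay_prob=0.9):
    """Viterbi decoding for a left-to-right (changepoint) HMM.

    series    : one topic weight per time period (assumed LDA output).
    means     : assumed emission mean of each regime, in order of occurrence.
    sigma     : assumed common Gaussian emission standard deviation.
    stay_prob : assumed probability of remaining in the current regime.
    Returns (path, changepoints); a changepoint is an index whose decoded
    regime differs from that of the previous period.
    """
    n_states = len(means)
    neg_inf = float("-inf")
    log_stay = math.log(stay_prob)
    log_move = math.log(1.0 - stay_prob)

    def log_emit(x, k):
        # Gaussian log-density up to a constant shared by all regimes.
        return -0.5 * ((x - means[k]) / sigma) ** 2

    # The chain is forced to start in regime 0.
    score = [log_emit(series[0], 0)] + [neg_inf] * (n_states - 1)
    back = []
    for x in series[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            # The last regime is absorbing, so staying there costs nothing.
            stay = score[k] + (log_stay if k < n_states - 1 else 0.0)
            # A regime can only be entered from the one just before it.
            move = score[k - 1] + log_move if k > 0 else neg_inf
            if move > stay:
                new_score.append(move + log_emit(x, k))
                pointers.append(k - 1)
            else:
                new_score.append(stay + log_emit(x, k))
                pointers.append(k)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final regime to recover the full path.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    changepoints = [t for t in range(1, len(path)) if path[t] != path[t - 1]]
    return path, changepoints


# Hypothetical weekly weight of a single topic (toy numbers, not real data).
weekly_topic_weight = [0.10, 0.12, 0.09, 0.11, 0.45, 0.50, 0.47, 0.20, 0.22, 0.19]
path, changepoints = viterbi_changepoints(weekly_topic_weight, means=[0.1, 0.5, 0.2])
print(changepoints)  # [4, 7]
```

In this toy series the decoded regime changes at periods 4 and 7, where the topic weight jumps and then falls. A full pipeline would estimate the emission parameters and the number of regimes from the data instead of fixing them as done here.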
Acknowledgments
GALLANT is funded by the Natural Environment Research Council as part of the Changing the Environment Programme [grant number NE/W005042/1] (https://www.gla.ac.uk/research/az/sustainablesolutions/ourprojects/gallant/). Special thanks to Cris Hasan and to the data analytics and community engagement colleagues for their insights.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cao Pinna, L., Miller, C., Scott, M. (2024). Latent Dirichlet Allocation and Hidden Markov Models to Identify Public Perception of Sustainability in Social Media Data. In: Einbeck, J., Maeng, H., Ogundimu, E., Perrakis, K. (eds) Developments in Statistical Modelling. IWSM 2024. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-031-65723-8_3
DOI: https://doi.org/10.1007/978-3-031-65723-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-65722-1
Online ISBN: 978-3-031-65723-8
eBook Packages: Mathematics and Statistics (R0)