Abstract
Social media play a decisive role in communicating and spreading information during global events. In particular, real-time microblogging platforms such as Twitter have become prevalent. Researchers have used microblogging for a number of tasks, including the analysis of past events, prediction, and information retrieval. Nevertheless, little attention has been given to quantitative data extraction. In this paper, we address two questions: can we develop a mechanism to extract quantitative data from a collection of tweets, and can we use the salient findings to describe an event? To answer the first question, we introduce Raimond, a virtual text curator specialized in quantitative data extraction from Twitter. To address the second question, we apply our system to three events and evaluate its output using a crowdsourcing strategy. We demonstrate the effectiveness of our approach with a number of real-world examples.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sellam, T., Alonso, O. (2015). Raimond: Quantitative Data Extraction from Twitter to Describe Events. In: Cimiano, P., Frasincar, F., Houben, G.-J., Schwabe, D. (eds) Engineering the Web in the Big Data Era. ICWE 2015. Lecture Notes in Computer Science, vol 9114. Springer, Cham. https://doi.org/10.1007/978-3-319-19890-3_17
DOI: https://doi.org/10.1007/978-3-319-19890-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19889-7
Online ISBN: 978-3-319-19890-3
eBook Packages: Computer Science (R0)