Abstract
Social media has been used as a measurement tool for population health outcomes. Popular social media platforms are heterogeneous: they allow users to share multiple types of links and data. However, existing studies on social media-based population analysis largely focus on user-generated text, while other data sources remain under-explored. In this work, we examine two typical data sources from online heterogeneous social networks for understanding population health: text as the conventional sensor and images as a novel one. To make use of visual content, we propose two types of population health representations extracted from shared images: color histograms, a set of hand-crafted visual features, and features learned automatically by a deep convolutional neural network. For textual data, we adopt two well-known feature types: language style features and content-based features. To deal with the problem of weakly labeled data, we apply multi-instance learning to reduce probable classification bias. For evaluation, we benchmark the proposed methods on the task of predicting US county health outcomes using a large-scale dataset collected from Flickr. The experimental results suggest that social media visual content, in addition to textual data, is an informative indicator of population health outcomes. These experiments, along with the in-depth analysis, can serve as technical documentation for future research on population health analysis through the lens of heterogeneous social media.
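To make the hand-crafted visual representation and the weak-label setting more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each image is given as a non-empty list of RGB tuples, uses a hypothetical setting of 8 bins per channel, and stands in for multi-instance learning with simple mean pooling over a bag of images (for example, all images from one county). The function names `color_histogram` and `bag_representation` are illustrative only.

```python
def color_histogram(pixels, bins=8):
    """L1-normalized per-channel histogram of RGB pixels (values 0..255).

    Assumes `pixels` is a non-empty iterable of (r, g, b) tuples.
    """
    width = 256 // bins
    hist = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp to the last bin in case 256 is not divisible by `bins`.
            hist[channel * bins + min(value // width, bins - 1)] += 1
    total = sum(hist)
    return [count / total for count in hist]


def bag_representation(images, bins=8):
    """Mean-pool instance histograms into one vector per bag (e.g. a county).

    Mean pooling is one simple bag-level aggregation; it is an assumption
    here, not necessarily the technique used in the paper.
    """
    histograms = [color_histogram(img, bins) for img in images]
    return [sum(column) / len(column) for column in zip(*histograms)]


# Toy bag: two synthetic 16-pixel images, one pure red and one pure blue.
red = [(255, 0, 0)] * 16
blue = [(0, 0, 255)] * 16
county_vector = bag_representation([red, blue])
```

The resulting `county_vector` has one entry per channel and bin (24 here) and could be passed to any standard classifier together with the county-level label, which is the only label available in the weakly labeled setting.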
Data availability
The data that support the findings of this study are not openly available because they contain human data; they are available from the corresponding author upon reasonable request.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
This article is part of the topical collection “Advanced Computing Systems and Analytics for IoT-enabled, AI-powered Smart Society” guest edited by Lam-Son Lê and Maurizio Marchese.
About this article
Cite this article
Nguyen, H., Le, H. Predicting Community Health Through Heterogeneous Social Networks. SN COMPUT. SCI. 4, 227 (2023). https://doi.org/10.1007/s42979-023-01718-z
DOI: https://doi.org/10.1007/s42979-023-01718-z