Predicting Community Health Through Heterogeneous Social Networks

  • Original Research
  • Published in: SN Computer Science

Abstract

Social media has been used as a tool for measuring population health outcomes. Popular social media platforms are heterogeneous, allowing users to share multiple types of links and data. However, existing studies on social media-based population analysis largely focus on user-generated text, while other data sources remain under-explored. In this work, we examine two typical data sources from online heterogeneous social networks for understanding population health: text as a conventional sensor and images as a novel one. To make use of visual content, we propose two types of population health representations extracted from shared images: color histograms (a set of hand-crafted visual features) and automatic features learned by a deep convolutional neural network. For textual data, we adopt two well-known feature types: language style features and content-based features. To deal with weakly labeled data, we apply multi-instance learning to reduce probable classification bias. For evaluation, we benchmark the proposed methods on the task of predicting US county health outcomes using a large-scale dataset collected from Flickr. The experimental results suggest that social media visual content, in addition to textual data, is an informative indicator of population health outcomes. These experiments, along with in-depth analysis, will serve as technical documentation for future research on population health analysis through the lens of heterogeneous social media.
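To make the hand-crafted visual representation concrete, the following is a minimal sketch of a per-channel color histogram for one image, followed by a simple way to pool the images of one county (a "bag" in multi-instance terms) into a single feature vector. It is an illustration under stated assumptions, not the authors' implementation: the function names, the bin count, the RGB color space, and the mean-pooling step are all hypothetical choices, and the abstract does not specify which multi-instance learning algorithm was actually used.

```python
import numpy as np


def color_histogram(image, bins=8):
    """Return a per-channel color histogram for an RGB image.

    `image` is assumed to be an (H, W, 3) array with values in 0..255.
    Each channel is binned separately and normalized by the pixel count,
    so every channel's slice of the output sums to 1.
    """
    image = np.asarray(image)
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) RGB array")
    n_pixels = image.shape[0] * image.shape[1]
    channels = []
    for c in range(3):
        counts, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        channels.append(counts / n_pixels)
    return np.concatenate(channels)  # length 3 * bins


def bag_representation(images, bins=8):
    """Pool the histograms of all images in one bag (e.g. one county).

    Mean pooling treats every image as contributing equally to the bag
    label. This is an assumed baseline, not the method from the paper.
    """
    features = np.stack([color_histogram(img, bins) for img in images])
    return features.mean(axis=0)


if __name__ == "__main__":
    # Synthetic stand-ins for photos shared from a single county.
    rng = np.random.default_rng(0)
    county_images = [
        rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
        for _ in range(5)
    ]
    county_vector = bag_representation(county_images, bins=8)
    print(county_vector.shape)
```

The pooled vector could then be passed to any standard classifier with the county-level health outcome as the label. This reflects the weak-labeling setting described above, where labels exist for counties rather than for individual images.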

Data availability

The data that support the findings of this study are not openly available because they contain human data. They are available from the corresponding author upon reasonable request.

Author information

Corresponding author

Correspondence to Hung Nguyen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advanced Computing Systems and Analytics for IoT-enabled, AI-powered Smart Society” guest edited by Lam-Son Lê and Maurizio Marchese.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, H., Le, H. Predicting Community Health Through Heterogeneous Social Networks. SN Comput. Sci. 4, 227 (2023). https://doi.org/10.1007/s42979-023-01718-z

  • DOI: https://doi.org/10.1007/s42979-023-01718-z
