Survey on Social Networks Data Analysis

  • Conference paper
Innovations for Community Services (I4CS 2020)

Abstract

Social networks are among the most successful Web 2.0 applications: their users create and share over 2.5 quintillion bytes of data daily. This data can be mined for many kinds of information that serve a range of applications, and social networks have accordingly attracted considerable attention from researchers in different domains. This paper serves as an introduction to social network data analysis. We present recent and representative works in the field in an analytical fashion. We also highlight the most important applications and the methods used in structural data analysis. We then list the major tasks and approaches proposed for analysing the content added to social media.

Notes

  1. A measure to capture the vague notion of importance in a graph, used to identify the most significant vertices.

  2. https://www.flickr.com/

  3. https://www.facebook.com/

  4. https://en.wikipedia.org/wiki/Global_Positioning_System

  5. https://en.wikipedia.org/wiki/Geographic_information_system

Corresponding author

Correspondence to Soufien Jaffali.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Jaffali, S., Jamoussi, S., Khelifi, N., Hamadou, A.B. (2020). Survey on Social Networks Data Analysis. In: Rautaray, S., Eichler, G., Erfurth, C., Fahrnberger, G. (eds) Innovations for Community Services. I4CS 2020. Communications in Computer and Information Science, vol 1139. Springer, Cham. https://doi.org/10.1007/978-3-030-37484-6_6

  • DOI: https://doi.org/10.1007/978-3-030-37484-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37483-9

  • Online ISBN: 978-3-030-37484-6

  • eBook Packages: Computer Science, Computer Science (R0)
