Abstract
Predicting the evolution of viral processes on networks is an important problem with applications in biology, the social sciences, and the study of the Internet. Existing work uses mean-field analysis based on the degree distribution to predict viral spreading across networks of different types. However, degree distribution alone has been shown to fail at predicting the behavior of viruses on some real-world networks, and recent attempts have used assortativity to address this shortcoming. In this paper, we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution in combination with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. Using a data-driven approach that couples predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on the graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different topologies. Our results not only highlight the importance of graphlets but also identify a small collection of graphlets that may have the highest influence on the viral processes on a network.
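To make the central idea concrete, the following minimal sketch shows how two networks can share the same degree of every node yet differ in their graphlet counts. It is an illustration under stated assumptions, not the procedure used in the paper: the ring-lattice test graph, the helper names (`ring_lattice`, `rewire`, `three_node_graphlets`), and the choice of a double edge swap as the degree-preserving rewiring are all hypothetical, and only the two connected 3-node graphlets (open path and triangle) are counted. The swap shown here does not guarantee that the rewired graph stays connected.

```python
import random


def ring_lattice(n):
    """Hypothetical test graph: each node links to its 2 nearest neighbours per side."""
    edges = set()
    for i in range(n):
        for step in (1, 2):
            edges.add(tuple(sorted((i, (i + step) % n))))
    return sorted(edges)


def adjacency(edges):
    """Build a node -> set-of-neighbours map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degrees(edges):
    """Return the degree of every node as a dict."""
    return {node: len(nbrs) for node, nbrs in adjacency(edges).items()}


def rewire(edges, n_swaps, rng):
    """Attempt n_swaps double edge swaps: (a,b),(c,d) -> (a,d),(c,b).

    Swaps that would create a self-loop or a duplicate edge are skipped,
    so every node keeps its original degree.
    """
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        if a == d or c == b:
            continue
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def three_node_graphlets(edges):
    """Count induced connected 3-node graphlets: open paths and triangles."""
    adj = adjacency(edges)
    # Each triangle is seen once per edge, hence the division by 3.
    triangles = sum(len(adj[u] & adj[v]) for u, v in edges) // 3
    # Every pair of neighbours of a node forms a wedge, open or closed.
    wedges = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    return {"path": wedges - 3 * triangles, "triangle": triangles}


if __name__ == "__main__":
    original = ring_lattice(12)
    rewired = rewire(original, 200, random.Random(0))
    print("same degrees:", degrees(original) == degrees(rewired))
    print("original graphlets:", three_node_graphlets(original))
    print("rewired graphlets: ", three_node_graphlets(rewired))
```

Dividing each count by the total number of counted graphlets would give a (3-node) graphlet frequency distribution, which could then serve as the feature vector of a regression model; the paper's own feature set and simulation setup are not reproduced here.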
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00332-018-9465-y/MediaObjects/332_2018_9465_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00332-018-9465-y/MediaObjects/332_2018_9465_Fig2_HTML.jpg)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00332-018-9465-y/MediaObjects/332_2018_9465_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00332-018-9465-y/MediaObjects/332_2018_9465_Fig4_HTML.gif)
Notes
Disconnected graphlets have also been considered in several graphlet-based works, but in this work we only consider connected graphlets because connectedness is essential for the evolution of a viral process on a network.
Acknowledgements
This work was supported in part by NSF Grants SCC-1737585, SES-1343123, ATD-1737996, and ATD-1737925.
Additional information
Communicated by Mason A. Porter and Andrea L. Bertozzi.
Cite this article
Khorshidi, S., Al Hasan, M., Mohler, G. et al. The Role of Graphlets in Viral Processes on Networks. J Nonlinear Sci 30, 2309–2324 (2020). https://doi.org/10.1007/s00332-018-9465-y
DOI: https://doi.org/10.1007/s00332-018-9465-y