ONDOCS: Ordering Nodes to Detect Overlapping Community Structure

Data Mining for Social Network Data

Part of the book series: Annals of Information Systems ((AOIS,volume 12))

Abstract

Finding communities is an important task for the discovery of underlying structures in social networks. While existing approaches give interesting results, they typically neglect the fact that communities may overlap, with some hub nodes participating in multiple communities. Similarly, most methods cannot deal with outliers, which are nodes that belong to no germane communities. The definition of community is still vague and the criterion to locate hubs or outliers varies. Existing approaches usually require guidance in this regard, specified as input parameters, e.g., the number of communities in the network, without much intuition. Here we present a general community definition and a list of requirements for a community mining metric. We review advantages and disadvantages of existing metrics and propose our new metric to quantify the relation between nodes in a social network. We then use the new metric to build a visual data mining system, which first helps the user to achieve appropriate parameter selection by observing initial data visualizations, then detects overlapping community structure from the network while also excluding outliers. Experimental results verify the scalability and accuracy of our approach on real data networks and show its advantages over existing methods that also consider overlaps. An empirical evaluation of our metric demonstrates superior performance over previous measures.

Acknowledgments

Our work is supported by the Canadian Natural Sciences and Engineering Research Council (NSERC), by the Alberta Ingenuity Centre for Machine Learning (AICML), and by the Alberta Informatics Circle of Research Excellence (iCORE).

Author information

Corresponding author

Correspondence to Osmar R. Zaïane.

Copyright information

© 2010 Springer US

About this chapter

Cite this chapter

Chen, J., Zaïane, O.R., Sander, J., Goebel, R. (2010). ONDOCS: Ordering Nodes to Detect Overlapping Community Structure. In: Memon, N., Xu, J., Hicks, D., Chen, H. (eds) Data Mining for Social Network Data. Annals of Information Systems, vol 12. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6287-4_8
