Big Data Search and Mining

Radha Krishna, P.

doi:10.1007/978-81-322-2494-5_4

Part of the book series: Studies in Big Data (SBD, volume 11)

Abstract

Most enterprises are generating data at an unprecedented rate. At the same time, traditional consumers are becoming digital consumers as individuals widely adopt social media and networks. Because the volume of transactions on these sites is huge and growing rapidly, social networks have become a new target for several business applications. Big Data mining deals with tapping large amounts of complex data of widely varying types and delivering actionable insights at the right time. Search and mining applications over Big Data have led to the development of new kinds of technologies, platforms, and frameworks. This chapter introduces the notion of search and data mining in the Big Data context and the technologies supporting Big Data. We also present some data mining techniques that deal with the scalability and heterogeneity of large data. We further discuss clustering social networks using topology discovery and address the problem of evaluating and managing text-based sentiments from social network media. Finally, this chapter highlights some of the open source tools for Big Data mining.



Author information

Authors and Affiliations

Infosys Labs, Infosys Limited, Hyderabad, India
P. Radha Krishna


Corresponding author

Correspondence to P. Radha Krishna.

Editor information

Editors and Affiliations

School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Andhra Pradesh, India
Hrushikesha Mohanty
School of Computer Engineering, KIIT University, Bhubaneshwar, Odisha, India
Prachet Bhuyan
Teradata India Private Limited, Hyderabad, Andhra Pradesh, India
Deepak Chenthati

Exercises

1. Define (a) Big Data search and (b) Big Data mining.
2. What type of intermediate data does MapReduce store? Where does MapReduce store it?
3. Write Mapper and Reducer functions for the k-means clustering algorithm.
4. Give the algorithm for computing page rank.
5. List ideas to improve or tune existing text processing and mining approaches to support Big Data scale.
6. (a) Explain the significance of social networks and their role in the context of Big Data.
   (b) List the challenges of Big Data in supporting social network analytics and discuss approaches to handle them, with justification.
7. How are centrality measures and the structure of social networks useful in analyzing social networks?
8. What is community detection? Discuss various community detection approaches.
9. Explain the social network clustering algorithm (using topology discovery) that allows overlapping clusters.
10. Explain the concepts of active learning and concept drift. How are these concepts useful for Big Data search and mining?
11. What is sentiment mining? Describe, with examples, an approach for extracting sentiments from a given text.
12. Develop a suitable architecture for supporting real-time sentiment mining and discuss its components.
13. List open source tools for Big Data analytics and their characteristics.
14. Discuss various alternative mechanisms to MapReduce along with their merits.
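
As a possible starting point for Exercise 3, the sketch below shows one way a single k-means iteration could be phrased as a map step and a reduce step. It is not taken from the chapter: it runs in a single process rather than on a MapReduce cluster, and the function names (kmeans_map, kmeans_reduce, kmeans_iteration) and the tuple-based point representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_map(point: Point, centroids: List[Point]) -> Tuple[int, Tuple[Point, int]]:
    """Mapper (hypothetical name): emit (nearest centroid index, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, (point, 1)


def kmeans_reduce(values: Iterable[Tuple[Point, int]]) -> Point:
    """Reducer (hypothetical name): average the vectors emitted for one key.

    Each value is a (vector sum, count) pair, so the same function could
    also merge partial sums if a combiner were placed before it.
    """
    total: List[float] = []
    count = 0
    for vector, n in values:
        if not total:
            total = list(vector)
        else:
            total = [t + v for t, v in zip(total, vector)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points: List[Point], centroids: List[Point]) -> List[Point]:
    """Simulate one round in-process: map, group by key (shuffle), reduce."""
    groups: Dict[int, List[Tuple[Point, int]]] = defaultdict(list)
    for p in points:
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)
    # A centroid that attracted no points keeps its previous position.
    new_centroids = list(centroids)
    for key, values in groups.items():
        new_centroids[key] = kmeans_reduce(values)
    return new_centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    start = [(1.0, 1.0), (9.0, 9.0)]
    print(kmeans_iteration(data, start))
```

In a real job the iteration would be repeated, feeding the new centroids back to the mappers, until the centroids stop moving or an iteration limit is reached. How the centroids are distributed to the mappers depends on the framework and is left open here.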

About this chapter

Cite this chapter

Radha Krishna, P. (2015). Big Data Search and Mining. In: Mohanty, H., Bhuyan, P., Chenthati, D. (eds) Big Data. Studies in Big Data, vol 11. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2494-5_4

DOI: https://doi.org/10.1007/978-81-322-2494-5_4
Published: 28 June 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2493-8
Online ISBN: 978-81-322-2494-5
eBook Packages: Engineering, Engineering (R0)
