Abstract
Traditional technologies and data processing applications are inadequate for big data processing. Big data refers to very large, complex, and growing data sets drawn from multiple heterogeneous sources and formats. With the rapid expansion in networking, communication, storage, and data collection capability, big data science is growing quickly in every engineering and science domain. Data scientists face challenges across many tasks, such as data capture, classification, storage, sharing, transfer, analysis, search, visualization, and decision making. This paper discusses the need for big data analytics, the journey from raw data to meaningful decisions, and the tools and technologies that have emerged to process big data at different levels and derive meaningful decisions from it.
© 2016 Springer Science+Business Media Singapore
About this paper
Cite this paper
Pole, G., Gera, P. (2016). A Recent Study of Emerging Tools and Technologies Boosting Big Data Analytics. In: Saini, H., Sayal, R., Rawat, S. (eds) Innovations in Computer Science and Engineering. Advances in Intelligent Systems and Computing, vol 413. Springer, Singapore. https://doi.org/10.1007/978-981-10-0419-3_4
DOI: https://doi.org/10.1007/978-981-10-0419-3_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0417-9
Online ISBN: 978-981-10-0419-3
eBook Packages: Engineering (R0)