Accelerating Big Data Processing on Modern HPC Clusters

Lu, Xiaoyi; Wasi-ur-Rahman, Md.; Islam, Nusrat; Shankar, Dipti; Panda, Dhabaleswar K. (DK)

doi:10.1007/978-3-319-33742-5_5

Abstract

Modern HPC systems and their associated middleware (such as MPI and parallel file systems) have been exploiting advances in HPC technologies (multi-/many-core architectures, RDMA-enabled networking, and SSDs) for many years. However, Big Data processing and management middleware have not fully taken advantage of such technologies. These disparities are taking HPC and Big Data processing onto divergent trajectories. This chapter provides an overview of popular Big Data processing middleware, high-performance interconnects, and storage architectures, and discusses the challenges in accelerating Big Data processing middleware by leveraging emerging technologies on modern HPC clusters. It then presents case studies of advanced designs based on RDMA and heterogeneous storage architectures that were proposed to address these challenges for multiple components of Hadoop (HDFS and MapReduce) and Spark. The advanced designs presented in the case studies are publicly available as part of the High-Performance Big Data (HiBD) project, an overview of which is also provided in this chapter. All of this work aims to bring HPC and Big Data processing onto a convergent trajectory.

Acknowledgements

This research is supported in part by National Science Foundation grants #CNS-1419123, #IIS-1447804, and #ACI-1450440.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat Islam, Dipti Shankar & Dhabaleswar K. (DK) Panda

Corresponding author

Correspondence to Xiaoyi Lu.

Editor information

Editors and Affiliations

Texas Advanced Computing Center, Austin, Texas, USA
Ritu Arora

About this chapter

Cite this chapter

Lu, X., Wasi-ur-Rahman, M., Islam, N., Shankar, D., Panda, D.K. (2016). Accelerating Big Data Processing on Modern HPC Clusters. In: Arora, R. (eds) Conquering Big Data with High Performance Computing. Springer, Cham. https://doi.org/10.1007/978-3-319-33742-5_5

DOI: https://doi.org/10.1007/978-3-319-33742-5_5
Published: 17 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33740-1
Online ISBN: 978-3-319-33742-5
eBook Packages: Computer Science, Computer Science (R0)
