Abstract
Heterogeneous clusters equipped with accelerators are widely used in high-performance computing systems. In such systems, inter-node communication among accelerators becomes a bottleneck because data must be transferred between the accelerator and the host.
To eliminate this overhead, we have been developing a novel communication system that realizes direct communication among accelerators across computation nodes under the HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) project. We are also investigating a high-level parallel programming language and several practical application programs based on this concept, as well as studying enhancements to the Tightly Coupled Accelerators (TCA) architecture and developing a system software stack in the CREST project.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hanawa, T. et al. (2015). Towards Unification of Accelerated Computing and Interconnection For Extreme-Scale Computing. In: Sano, K., Soudris, D., Hübner, M., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham. https://doi.org/10.1007/978-3-319-16214-0_43
DOI: https://doi.org/10.1007/978-3-319-16214-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16213-3
Online ISBN: 978-3-319-16214-0
eBook Packages: Computer Science, Computer Science (R0)