Analysis of Relationship Between SIMD-Processing Features Used in NVIDIA GPUs and NEC SX-Aurora TSUBASA Vector Processors

Afanasyev, Ilya V.; Voevodin, Vadim V.; Voevodin, Vladimir V.; Komatsu, Kazuhiko; Kobayashi, Hiroaki

doi:10.1007/978-3-030-25636-4_10

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11657)

Included in the following conference series:

International Conference on Parallel Computing Technologies

Abstract

This paper presents a comprehensive analysis of the main SIMD-processing features and computational characteristics of three high-performance architectures: two NVIDIA GPU architectures (the Pascal and Volta generations) and the NEC SX-Aurora TSUBASA vector processor. Since both types of architecture rely strongly on SIMD processing, certain similarities in their data-processing principles can be found. However, although both NVIDIA GPUs and NEC SX-Aurora TSUBASA include vectorised data processing, their vectorisation features are implemented in completely different ways. These differences lead to several fundamental restrictions on the classes of algorithms that can be implemented efficiently on each platform. This paper investigates the possibility of porting various classes of programs and algorithms between the discussed architectures, with a focus on utilising all available vectorisation features. This problem cannot be approached without a detailed analysis of the similarities and differences between the SIMD-processing features of these architectures. The performed analysis allowed us to identify several important examples of typical applications and algorithms, including reduction operations, programs relying on frequent indirect memory accesses, and data transfers through the co-processor interconnect. Some of them demonstrated comparable efficiency on NVIDIA GPUs and NEC SX-Aurora TSUBASA vector processors, while others showed different efficiency. Moreover, the conducted analysis makes it easy to extend this set of examples in order to approach the problem of automated porting of programs between the reviewed architectures, which we consider an important direction of our future research.

Author information

Authors and Affiliations

Research Computing Center of Moscow State University, Moscow, 119234, Russia
Ilya V. Afanasyev, Vadim V. Voevodin & Vladimir V. Voevodin
Tohoku University, Sendai, Miyagi, 980-8579, Japan
Kazuhiko Komatsu & Hiroaki Kobayashi

Corresponding author

Correspondence to Ilya V. Afanasyev.

Editor information

Editors and Affiliations

Institute of Computational Mathematics and Mathematical Geophysics SB RAS, Novosibirsk State University, Novosibirsk State Technical University, Novosibirsk, Russia
Victor Malyshkin

About this paper

Cite this paper

Afanasyev, I.V., Voevodin, V.V., Voevodin, V.V., Komatsu, K., Kobayashi, H. (2019). Analysis of Relationship Between SIMD-Processing Features Used in NVIDIA GPUs and NEC SX-Aurora TSUBASA Vector Processors. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2019. Lecture Notes in Computer Science, vol 11657. Springer, Cham. https://doi.org/10.1007/978-3-030-25636-4_10

DOI: https://doi.org/10.1007/978-3-030-25636-4_10
Published: 17 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25635-7
Online ISBN: 978-3-030-25636-4
eBook Packages: Computer Science, Computer Science (R0)
