Accelerating Finite Element Assembly on a GPU

  • Conference paper
Advances in Engineering Design (FLAME 2022)

Abstract

This paper presents a parallel strategy for assembling finite element matrices on a graphics processing unit (GPU). Because GPU memory is limited, the proposed strategy does not store the elemental matrices in memory; instead, it computes them on the fly and writes the data directly into the global stiffness matrix, which reduces the memory requirement and avoids the overhead of a separate assembly step. The global stiffness matrix is stored in the compressed sparse row (CSR) format, which is commonly used by GPU-accelerated linear solver libraries. However, assembling elemental matrices directly into a sparse storage format requires prior knowledge of the locations of the nonzeros. The current work presents an efficient strategy to pre-compute the indices for assembly into the CSR format. The proposed strategy has been implemented on both CPU and GPU. The performance of the proposed finite element solver is measured by solving a large-scale three-dimensional (3D) elasticity problem involving up to 4.7 million degrees of freedom (DOFs). A comparison is made with the standard assembly implementation in the Eigen C++ library, which first stores the nonzero values as triplets and then assembles them into the CSR format. For the finest mesh, with 4.7 million DOFs, the proposed CPU-based assembly strategy achieves a 9.3× speedup over the Eigen library. Computing the indices for assembly into the CSR format takes 15.7 s on the CPU and 2.4 s on the GPU for 4.7 million DOFs. The computation of elemental matrices and their assembly, implemented on the GPU as a single compute kernel, is found to be up to 24.3× faster than the optimized CPU implementation. In terms of wall-clock time, the GPU-accelerated finite element solver achieves up to a 4× speedup over the CPU solver.
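The idea of pre-computing CSR indices and then assembling directly, without storing elemental matrices, can be illustrated with a small CPU-only sketch. This is not the authors' implementation: the names `CsrMatrix`, `buildPattern`, `precomputeIndices` and `assemble` are hypothetical, the element is assumed to be a 2-node bar with one DOF per node, and the elemental matrix is a fixed placeholder standing in for a real on-the-fly integration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-node element with one DOF per node (illustration only).
constexpr int kNodesPerElem = 2;
using Element = std::array<int, kNodesPerElem>;

struct CsrMatrix {
    std::vector<int> rowPtr;     // size nRows + 1
    std::vector<int> colInd;     // sorted column indices, row by row
    std::vector<double> values;  // nonzeros, same layout as colInd
};

// Step 1: derive the sparsity pattern from the element connectivity.
CsrMatrix buildPattern(int nNodes, const std::vector<Element>& elems) {
    std::vector<std::vector<int>> cols(nNodes);
    for (const Element& e : elems)
        for (int a : e)
            for (int b : e) cols[a].push_back(b);

    CsrMatrix A;
    A.rowPtr.assign(nNodes + 1, 0);
    for (int i = 0; i < nNodes; ++i) {
        std::sort(cols[i].begin(), cols[i].end());
        cols[i].erase(std::unique(cols[i].begin(), cols[i].end()), cols[i].end());
        A.rowPtr[i + 1] = A.rowPtr[i] + static_cast<int>(cols[i].size());
        A.colInd.insert(A.colInd.end(), cols[i].begin(), cols[i].end());
    }
    A.values.assign(A.colInd.size(), 0.0);
    return A;
}

// Step 2: for every elemental entry (a, b), pre-compute its offset in A.values.
std::vector<int> precomputeIndices(const CsrMatrix& A, const std::vector<Element>& elems) {
    std::vector<int> idx;
    idx.reserve(elems.size() * kNodesPerElem * kNodesPerElem);
    for (const Element& e : elems)
        for (int a : e)
            for (int b : e) {
                auto first = A.colInd.begin() + A.rowPtr[a];
                auto last = A.colInd.begin() + A.rowPtr[a + 1];
                auto it = std::lower_bound(first, last, b);  // binary search in row a
                assert(it != last && *it == b);
                idx.push_back(static_cast<int>(it - A.colInd.begin()));
            }
    return idx;
}

// Step 3: compute each elemental matrix on the fly and scatter it directly.
void assemble(CsrMatrix& A, const std::vector<Element>& elems, const std::vector<int>& idx) {
    std::fill(A.values.begin(), A.values.end(), 0.0);
    for (std::size_t e = 0; e < elems.size(); ++e) {
        // Placeholder: stiffness of a unit 1D bar element (assumption).
        const double ke[kNodesPerElem][kNodesPerElem] = {{1.0, -1.0}, {-1.0, 1.0}};
        for (int a = 0; a < kNodesPerElem; ++a)
            for (int b = 0; b < kNodesPerElem; ++b)
                A.values[idx[(e * kNodesPerElem + a) * kNodesPerElem + b]] += ke[a][b];
    }
}
```

Calling `buildPattern`, then `precomputeIndices`, then `assemble` on a chain of three such elements yields the familiar tridiagonal matrix. In a GPU version the loop over elements would run in parallel, so the `+=` in step 3 would need atomic additions or a race-free ordering such as element coloring; which of these the paper uses is not stated in the abstract.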

Acknowledgements

This work was supported by the Science and Engineering Research Board [IMP/2019/000276, SB/FTP/ETA-0008/2014].

Author information

Corresponding author

Correspondence to Deepak Sharma.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kiran, U., Gautam, S.S., Sharma, D. (2023). Accelerating Finite Element Assembly on a GPU. In: Sharma, R., Kannojiya, R., Garg, N., Gautam, S.S. (eds) Advances in Engineering Design. FLAME 2022. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-99-3033-3_4

  • DOI: https://doi.org/10.1007/978-981-99-3033-3_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-3032-6

  • Online ISBN: 978-981-99-3033-3

  • eBook Packages: Engineering, Engineering (R0)
