Log in

Improved parallel matrix multiplication using Strassen and Urdhvatiryagbhyam method

  • Regular Paper
  • Published:
CCF Transactions on High Performance Computing Aims and scope Submit manuscript

Abstract

The current milieu, encourages rapid growth of wireless communication, multimedia applications, robotics and graphics to have efficient utilization of resources with high throughput and low power digital signal processing (DSP) systems. In an aggregate DSP system ranging from audio/video signal processing to wireless sensor networks, floating point matrix multiplication is used in wide scale in most of the fundamental processing units. Hardware implementation of floating-point matrix multiplication demands a colossal number of arithmetic operations that alter speed and consuming more area and power. DSP systems essentially uses two techniques to reduce dynamic power consumption:—they are pipelining and parallel processing that needs high performance processing element with less area and low power in diverse scientific computing applications. However, number of adders and multipliers used in the design of floating-point unit also increases subsequently. The adders and multipliers are the most area, delay and power consuming data path elements in the processing unit. The arithmetic level reduction of delay, power and area in the processing element is performed by the selection of appropriate adders and multipliers. This article proposes a parallel multiplication architecture using Strassen and UrdhvaTiryagbhyam multiplier, which involves design of efficient parallel matrix multiplication with flexible implementation of FPGA (Field Programmable Gate Array) device to analyse the computation and area. The design incorporates scheduling of blocks, operations on processing elements, block size determination, parallelization and double buffering for storage of matrix elements.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

Data availability

The data supporting the findings of this study are available within the paper.

References

  • Amrutha, K., Ravi Kumar, M.N., Panduranga, H.T.: Implementation of dense matrix multiplication. In: Proceedings of 2nd ASAR International Conference, pp. 17–20 (2015)

  • Arish, S., Sharma, R.K.: Run time reconfigurable multi precision floating point matrix multiplier intellectual property core on FPGA. Circuits Syst. Signal Process. 36(3), 998–1026 (2016)

    Article  MATH  Google Scholar 

  • Cannon, L.E.: A cellular computer to implement the kalman filter algorithm. PhD dissertation. Montana State University (1969)

  • Chetan, S., Sourabh, K.S., Lekshmi, V., Sudhakar, S., Manikandan, J.: Design and evaluation of floating point matrix operations for FPGA based system design. Procedia Comput. Sci. 171, 959–968 (2020)

    Article  Google Scholar 

  • Choi, J.: A new parallel matrix multiplication algorithm on distributed-memory concurrent computers. Concurr. Pract. Exp. 10(8), 224–229 (1997)

    Google Scholar 

  • Choi, J., Dangarra, J.J., Pozo, R., Walker, D.W.: PUMMA: parallel universal matrix multiplication algorithms on distributed memory concurrent computers. Concurr. Pract. Exp. 6(7), 543–570 (1994)

    Article  Google Scholar 

  • Dou, Y., Vassiliadis, S., Kuzmanov, G.K., Gaydadjiev, G.N.: 64-bit floating point FPGA matrix multiplication. In: Proceeding of the ACM/SIGDA 13th International Symposium on Field Programmable Gate Arrays (FPGA). pp. 86–95 (2005)

  • Fox, G.C., Otto, S.W.: Matrix algorithms on a hypercube I: matrix multiplication. Parallel Comput. 4(1), 17–31 (1987)

    Article  MATH  Google Scholar 

  • Geijn, R.A.V., Watts, J.: SUMMA: scalable universal matrix multiplication algorithm. Concurr. Pract. Exp. 9(4), 255–274 (1998)

    Article  Google Scholar 

  • Jagadguru Swami Sri BharatiKrsnaTirthaji Maharaja.: Vedic mathematics or sixteen simple mathematical formulae from the Vedas. MotilalBanarsidass, Delhi (1985)

  • Kalaiselvi, A.: Multimedia security for image encryption using transformation matrix. Maejo Int. J. Sci. Technol. 1(3), 79–88 (2010)

    Google Scholar 

  • Kang, B.-H.: A review on image and video processing. Int. J. Multimed. Ubiquitous Eng. 2(2), 49–64 (2007)

    Google Scholar 

  • Khayyat, A., Manjikian, N.: Analysis of blocking and scheduling for FPGA based floating point matrix multiplication. Can. J. Electr. Comput. Eng. 37(2), 65–75 (2014)

    Article  Google Scholar 

  • Kumar, V.B.Y., Joshi, S., Patkar, S.B., Narayanan, H.: FPGA based high performance double precision matrix multiplication. Int. J. Parallel Prog. 38(3), 322–338 (2010)

    Article  MATH  Google Scholar 

  • Li, K.: Constant time boolean matrix multiplication on a linear array with a reconfigurable pipelined bus system. J. Supercomput. 11(4), 391–403 (1997)

    Article  Google Scholar 

  • Li, K., Pan, V.Y.: Parallel matrix multiplication on a linear array with a reconfigurable pipelined bus system. IEEE Trans. Comput. 50(5), 519–525 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  • Matam, K.K., Prasanna, V.K.: Energy efficient large scale matrix multiplication on FPGAs. In: Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig). pp. 1–8 (2013)

  • Matam, K.K., Le, H., Prasanna, V.K.: Evaluating energy efficiency of floating point matrix multiplication on FPGAs. In: Proceeding of the IEEE High Performance Extreme Computing Conference (HPEC). pp. 1–6 (2013)

  • Palacios, I., Medina, M., Moreno, J.: Matrix multiplication on digital signal processors and hierarchical memory systems. In: Baeza-Yates, R., Manber, U. (eds.) Computer Science, pp. 473–483. Springer, Boston, MA (1992)

    Chapter  Google Scholar 

  • Pan, V.: Complexity of parallel matrix computation. Theoret. Comput. Sci. 54, 65–85 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  • Pan, Y., Li, K., Zheng, S.Q.: Fast nearest neighbor algorithms on a linear array with a reconfigurable pipelined bus system. Parallel Algorithms Appl. 13(1), 1–25 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  • Pedram, A., Gei**, R.A., Gerstlauer, A.: Co-design tradeoffs for high-performance low power linear algebra architectures. IEEE Trans. Comput. 61(12), 1724–1736 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  • Prabhune, O., Sabale, P., Sonawane, D.N., Prabhune C.L.: Image Processing and Matrices. In: International conference on Data Management Analytics and Innovation (ICDMAI). pp. 166–171 (2017)

  • Qasim, S.M., Abbasi, S.A., Almashary, B.: FPGA-based design and realization of fixed and floating point matrix multipliers: a review. J. Active Passiv. Electron. Devices 5, 181–189 (2010)

    Google Scholar 

  • Sajish, C., Abhyankar, Y., Ghotgalkar, S., Venkates, K.A.: Floating point matrix multiplication on a reconfigurable computing system. In: Current Trends in High Performance Computing and its Applications, pp. 113–122. Springer, Berlin (2005)

    Chapter  Google Scholar 

  • Shen, H., Chen, J.: Efficient matrix multiplication on wireless sensor networks. In: Proc of 7th International Conference on Grid and Cooperative Computing: 331–341 (2008)

  • Silva, H.D., Gustafson, J.L., Wong, W.F.: Making Strassen matrix multiplication safe. In: Proceedings of the 25th International Conference on High Performance Computing, pp. 173–182 (2018)

  • Singh, K.N., Tarunkumar, H.: A review on various multipliers designs in VLSI. In: Annual IEEE India Conference (INDICON), pp. 1–4 (2015)

  • Sonawane, D.N., Sutaone, M.S., InayatMalek: Resource efficient 64-bit floating point matrix multiplication algorithm using FPGA. In: IEEE Region 10 Conference TENCON, pp. 1–5 (2009)

  • Stojcev, M.K., Milovanovic, I.Z., Radonjic, Z.C.: Some shifting methods for matrix multiplication. IEE Proc. E-Comput. Digital Tech. 132(1), 33–44 (1985)

    Article  Google Scholar 

  • Strassen, V.: Gaussian elimination is not optimal. Numerischemathematik 13(4), 354–356 (1969)

    MathSciNet  MATH  Google Scholar 

  • Thabet, K., Al-Ghuribi, S.: Matrix multiplication algorithms. Int. J. Comput. Sci. Netw. Secur. 12(2), 74–79 (2012)

    Google Scholar 

  • Tiwari, S., Singh, S., Meena, N.: FPGA design and implementation of matrix multiplication architecture by PPI-MO techniques. Int. J. Comput. Appl. 80(1), 19–22 (2013)

    Google Scholar 

  • Van De Geijn, R.A., Watts, J.: SUMMA: scalable universal matrix multiplication algorithm. Concurr.: Pract. Exp. 9(4), 255–274 (1998)

    Article  Google Scholar 

  • Zhang, T., Li, C.T., Qin, Y., Nie, M.: An optimized floating point matrix multiplication on FPGA. Inf. Technol. J. 12(9), 1832–1838 (2013)

    Article  Google Scholar 

  • Zhou, L., Prassana, V.K.: High performance designs for linear algebra operations on reconfigurable hardware. IEEE Trans. Comput. 57(8), 1057–1071 (2008)

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgements

This research has been funded by the research general direction at Universidad Santiago de Cali, Colombia under call no 01-2022. This research is collaborated with the authors in these institutions such as St. Xavier’s Catholic College of Engineering, Tamilnadu, India, Gems Educational Institutions, Sbte, Karunya Institute of Technology and Sciences, Coimbatore, India, and Al-nahrain university, al-nahrain nonrenewable energy research center Baghdad, Iraq.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Digvijay Pandey.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bessant, Y.R.A., Jency, J.G., Sagayam, K.M. et al. Improved parallel matrix multiplication using Strassen and Urdhvatiryagbhyam method. CCF Trans. HPC 5, 102–115 (2023). https://doi.org/10.1007/s42514-023-00149-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s42514-023-00149-9

Keywords

Navigation