Performance Analysis of Deep Learning Inference in Convolutional Neural Networks on Intel Cascade Lake CPUs

  • Conference paper
Mathematical Modeling and Supercomputer Technologies (MMST 2020)

Abstract

This paper compares the performance of deep convolutional neural network inference across widely used software libraries. Experiments are carried out on a high-end server with two Intel Xeon Platinum 8260L 2.4 GHz CPUs (48 cores in total). Performance is analyzed using the ResNet-50 and GoogleNet-v3 models. Inference is implemented with the following libraries: Intel Distribution of Caffe, TensorFlow, PyTorch, MXNet, OpenCV, and the Intel Distribution of OpenVINO toolkit. We compare total run time and the number of frames processed per second, and examine strong scaling efficiency on up to 48 CPU cores. The experiments show that OpenVINO provides the best performance and scales well up to 48 cores. We also observe that, relative to the latency mode, the throughput mode of OpenVINO accelerates inference by a factor ranging from 4.9x at an image batch size of 1 to 1.4x at a batch size of 32. We find that INT8 quantization in OpenVINO substantially improves inference performance while maintaining almost the same classification quality.
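The metrics named in the abstract (frames per second, speedup, and strong scaling efficiency) can be illustrated with a minimal sketch. The function names and all timing values below are hypothetical and chosen only for illustration; they are not measurements from the paper, and the paper's own benchmarking code may compute these quantities differently.

```python
# Minimal sketch of the metrics named in the abstract.
# All names and numbers are illustrative assumptions, not the paper's data.


def frames_per_second(num_images: int, total_time_s: float) -> float:
    """Throughput: images processed per second of total run time."""
    return num_images / total_time_s


def speedup(baseline_time_s: float, improved_time_s: float) -> float:
    """How many times faster the improved run is than the baseline."""
    return baseline_time_s / improved_time_s


def strong_scaling_efficiency(time_1_core_s: float,
                              time_p_cores_s: float,
                              p: int) -> float:
    """Strong scaling efficiency for a fixed problem size:
    speedup on p cores divided by p (1.0 means ideal scaling)."""
    return speedup(time_1_core_s, time_p_cores_s) / p


if __name__ == "__main__":
    # Hypothetical timings for a fixed set of 1000 images.
    num_images = 1000
    t_1_core = 96.0    # seconds on 1 core (assumed)
    t_48_cores = 2.5   # seconds on 48 cores (assumed)

    print("FPS on 48 cores:", frames_per_second(num_images, t_48_cores))
    print("Speedup:", speedup(t_1_core, t_48_cores))
    print("Efficiency:", strong_scaling_efficiency(t_1_core, t_48_cores, 48))
```

The same `speedup` ratio applies when comparing two execution modes at a fixed batch size: dividing the run time in one mode by the run time in the other gives a factor of the kind quoted as 4.9x or 1.4x.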

The paper is recommended for publication by the Program Committee of the international conference Mathematical Modelling and Supercomputing Technologies-2020.

Acknowledgements

I.M. and V.V. acknowledge the support of Russian Government Grant No. 0729-2020-0055. E.K., E.V., and V.K. acknowledge the support of Intel Corporation. The authors are grateful to N. Ageeva, Yu. Gorbachev, K. Korniakov, and Z. Matveev for valuable comments. The experiments were performed on the Intel Endeavor supercomputer at Intel and the Lobachevsky supercomputer at UNN.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Vasiliev, E.P., Kustikova, V.D., Volokitin, V.D., Kozinov, E.A., Meyerov, I.B. (2021). Performance Analysis of Deep Learning Inference in Convolutional Neural Networks on Intel Cascade Lake CPUs. In: Balandin, D., Barkalov, K., Gergel, V., Meyerov, I. (eds) Mathematical Modeling and Supercomputer Technologies. MMST 2020. Communications in Computer and Information Science, vol 1413. Springer, Cham. https://doi.org/10.1007/978-3-030-78759-2_29

  • DOI: https://doi.org/10.1007/978-3-030-78759-2_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78758-5

  • Online ISBN: 978-3-030-78759-2

  • eBook Packages: Computer Science, Computer Science (R0)