
Multi-GPU multi-display rendering of extremely large 3D environments

  • Original article
  • The Visual Computer

Abstract

In real-time rendering applications, mesh rendering quality is constrained by GPU memory capacity and display resolution. As models grow more complex and demand for higher display resolutions increases, people have started building low-cost commodity workstations with multiple GPUs. More GPU memory is then available across the GPUs, and a higher display resolution can be achieved by connecting each GPU to a display monitor, resulting in a large tiled display configuration. However, a multi-GPU workstation may not efficiently handle a complex model that cannot fit into GPU memory, due to (1) the unified configuration, which treats the GPUs as one hardware entity and requires the same data to be replicated in all GPUs, and (2) the lack of scalability to reduce, balance, and stream data dynamically between the CPU and GPUs as well as among the GPUs. In this work, we present a fine-grained parallel rendering approach that integrates a view-dependent LOD selection strategy with an inter-GPU load balancing method to ensure each GPU handles only the portion of data it rasterizes, without data replication. A new multi-GPU out-of-core method minimizes the amount of data transferred from the CPU to each GPU by taking advantage of frame-to-frame coherence. A comprehensive evaluation examines the efficiency and scalability of the execution components over extremely large scenes.
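The abstract does not spell out how frame-to-frame coherence reduces CPU-to-GPU traffic, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each GPU keeps track of the element IDs already resident in its memory, so that on the next frame only the elements that are newly needed have to be transferred. The function name `plan_upload` and the string IDs are hypothetical, chosen purely for illustration.

```python
def plan_upload(resident, needed):
    """Compare what one GPU already holds with what the next frame needs.

    resident: set of element IDs currently in this GPU's memory (assumed bookkeeping)
    needed:   set of element IDs selected for this GPU in the upcoming frame
    Returns (to_upload, to_evict) as two sets.
    """
    to_upload = needed - resident   # only data not already on the GPU is sent
    to_evict = resident - needed    # data no longer selected can be released
    return to_upload, to_evict


# Two consecutive frames for a single GPU; the selections overlap heavily,
# as is typical when the camera moves smoothly.
frame_1 = {"v0", "v1", "v2", "v3"}
frame_2 = {"v2", "v3", "v4"}

resident = set()
up1, ev1 = plan_upload(resident, frame_1)
resident = (resident - ev1) | up1   # first frame: everything is uploaded

up2, ev2 = plan_upload(resident, frame_2)
resident = (resident - ev2) | up2   # second frame: only the difference moves
```

In this toy run the second frame transfers a single element rather than the full selection, which is the kind of saving the out-of-core method in the abstract is aiming for. A real system would track memory ranges and perform asynchronous copies rather than manipulate Python sets.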

Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device used in this work to run our algorithms and produce the experimental results. The Power Plant model is provided courtesy of the University of North Carolina at Chapel Hill. We also thank the RIT MAGIC Center for their technical and logistic support for this research.

Author information

Corresponding author

Correspondence to Chao Peng.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest/competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 82067 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dong, Y., Peng, C. Multi-GPU multi-display rendering of extremely large 3D environments. Vis Comput 39, 6473–6489 (2023). https://doi.org/10.1007/s00371-022-02740-7

  • DOI: https://doi.org/10.1007/s00371-022-02740-7
