Abstract
Reconstruction of soft tissues from endoscopic stereo videos in robotic surgery is important for many applications, such as intra-operative navigation and image-guided robotic surgery automation. Previous works on this task mainly rely on SLAM-based approaches, which struggle to handle complex surgical scenes. Inspired by recent progress in neural rendering, we present a novel framework for deformable tissue reconstruction from binocular captures in robotic surgery under the single-viewpoint setting. Our framework adopts dynamic neural radiance fields to represent deformable surgical scenes in MLPs and optimizes shapes and deformations in a learning-based manner. Beyond non-rigid deformations, tool occlusion and poor 3D cues from a single viewpoint pose particular challenges in soft tissue reconstruction. To overcome these difficulties, we present a series of strategies: tool mask-guided ray casting, stereo depth-cueing ray marching, and stereo depth-supervised optimization. In experiments on DaVinci robotic surgery videos, our method significantly outperforms the current state-of-the-art reconstruction method in handling various complex non-rigid deformations. To the best of our knowledge, this is the first work to leverage neural rendering for surgical scene 3D reconstruction, and it demonstrates remarkable potential. Code is available at: https://github.com/med-air/EndoNeRF.
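The core mechanics the abstract describes can be illustrated with a minimal sketch: classic NeRF-style volume rendering along a ray (which yields both a color and an expected depth, enabling stereo depth supervision), combined with a tool mask that excludes occluded pixels from the loss. This is not the authors' implementation; all function names, the loss weighting `lam`, and the use of an L1 depth term are assumptions for illustration.

```python
import numpy as np

def render_ray(sigmas, colors, z_vals):
    """Volume rendering along one ray (as in NeRF): alpha-composite
    per-sample densities and colors, and also return the expected
    ray termination depth, which can be supervised by stereo depth."""
    # Distances between consecutive samples; the last interval is
    # conventionally set very large so the ray terminates.
    deltas = np.diff(z_vals, append=z_vals[-1] + 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)          # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)    # rendered pixel color
    depth = (weights * z_vals).sum()                 # expected depth
    return rgb, depth, weights

def masked_losses(pred_rgb, gt_rgb, pred_depth, gt_depth, tool_mask, lam=0.1):
    """Tool-mask-guided supervision: rays hitting the surgical tool
    (tool_mask == True) are excluded; the remaining rays receive a
    photometric MSE loss plus a weighted stereo-depth L1 loss."""
    keep = ~tool_mask
    photo = np.mean((pred_rgb[keep] - gt_rgb[keep]) ** 2)
    depth_l1 = np.mean(np.abs(pred_depth[keep] - gt_depth[keep]))
    return photo + lam * depth_l1
```

In the actual framework, `sigmas` and `colors` would come from an MLP queried at deformation-warped sample points, and stereo depth maps (estimated from the binocular captures) would supply `gt_depth` and guide where samples are placed along each ray.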
Acknowledgements
This work was supported in part by CUHK Shun Hing Institute of Advanced Engineering (project MMT-p5-20), in part by Shenzhen-HK Collaborative Development Zone, and in part by Multi-Scale Medical Robotics Centre InnoHK.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Y., Long, Y., Fan, S.H., Dou, Q. (2022). Neural Rendering for Stereo 3D Reconstruction of Deformable Tissues in Robotic Surgery. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_41
DOI: https://doi.org/10.1007/978-3-031-16449-1_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16448-4
Online ISBN: 978-3-031-16449-1
eBook Packages: Computer Science (R0)