
MEAN: An attention-based approach for 3D mesh shape classification

  • Original article
  • Published in The Visual Computer

Abstract

3D shape processing is a fundamental task in computer applications. In particular, 3D meshes provide a natural and detailed way to represent objects. However, their non-uniform and irregular data structure makes it difficult to apply deep learning techniques, and previous deep learning approaches for meshes focus mainly on local structural features, losing information in the process. In this paper, we propose a novel deep learning approach that makes full use of mesh data and exploits comprehensive features for more accurate classification. To apply the self-attention mechanism and learn global features of mesh edges, we introduce an attention-based architecture built around an edge attention module. For local feature learning, our model aggregates features from adjacent edges. We further refine the network by discarding pooling layers for efficiency, so that it captures comprehensive features from both local and global fields for better shape awareness. Moreover, we adopt a spatial position encoding module based on the spatial information of edges, which helps the model distinguish edges and fully exploit the mesh data. Extensive classification experiments demonstrate the effectiveness of our model, which outperforms prior methods on popular datasets.
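The abstract describes three ingredients: global self-attention over mesh-edge features, a spatial position encoding derived from edge geometry, and local aggregation from adjacent edges. The following is a minimal sketch of the first two ingredients only, not the paper's actual architecture: function names, feature sizes, and the use of edge midpoints as the spatial signal are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def edge_self_attention(edge_feats, edge_midpoints, d_model=16, seed=0):
    """Toy global self-attention over mesh-edge features.

    edge_feats:     (E, C) per-edge descriptors (e.g. dihedral angle, lengths).
    edge_midpoints: (E, 3) edge positions used as a simple spatial signal.
    A hypothetical simplification of an edge attention module with spatial
    position encoding; weights are random stand-ins for learned parameters.
    """
    rng = np.random.default_rng(seed)
    E, C = edge_feats.shape
    # Spatial position encoding: project 3D midpoints into feature space and
    # add them, so attention can tell spatially distinct edges apart.
    W_pos = rng.standard_normal((3, C))
    x = edge_feats + edge_midpoints @ W_pos
    # Standard single-head scaled dot-product self-attention.
    W_q, W_k, W_v = (rng.standard_normal((C, d_model)) for _ in range(3))
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    attn = softmax(q @ k.T / np.sqrt(d_model))  # (E, E): each edge attends to all edges
    return attn @ v                             # (E, d_model) globally mixed edge features

# Usage: 10 edges with 5-dimensional descriptors.
rng = np.random.default_rng(1)
out = edge_self_attention(rng.standard_normal((10, 5)), rng.standard_normal((10, 3)))
print(out.shape)  # (10, 16)
```

Because every edge attends to every other edge, the receptive field is global in a single layer; the paper's local branch (aggregation over adjacent edges) would complement this with neighborhood structure.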




Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 62072348). The numerical calculations in this paper were performed on the supercomputing system in the Supercomputing Center of Wuhan University.

Author information


Corresponding author

Correspondence to Fazhi He.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dai, J., Fan, R., Song, Y. et al. MEAN: An attention-based approach for 3D mesh shape classification. Vis Comput 40, 2987–3000 (2024). https://doi.org/10.1007/s00371-023-03003-9

