Metric Learning-Based Unsupervised Domain Adaptation for 3D Skeleton Hand Activities Categorization

  • Conference paper
Image Analysis and Processing – ICIAP 2022 (ICIAP 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13233)

Abstract

First-person hand activity recognition plays a significant role in computer vision, with various applications. Thanks to recent advances in depth sensors, several 3D skeleton-based hand activity recognition methods using supervised Deep Learning (DL) have been proposed and have proven effective when a large amount of labeled data is available. However, annotating such data remains difficult and costly, which motivates the use of unsupervised methods. In this paper, we propose a new approach based on unsupervised domain adaptation (UDA) for 3D skeleton hand activity clustering. It aims to exploit the knowledge derived from labeled samples of the source domain to categorize the unlabeled samples of the target domain. To this end, we introduce a novel metric learning-based loss function to learn a highly discriminative representation while preserving good activity recognition accuracy on the source domain. The learned representation is used as a low-level manifold to cluster unlabeled samples. In addition, to ensure the best clustering results, we propose a statistical and consensus-clustering-based strategy. The proposed approach is evaluated on the real-world FPHA dataset.
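The abstract does not specify how the consensus-clustering strategy is built, so the following is only a minimal sketch of one common consensus technique, co-association (evidence accumulation), and not the authors' method. The function names `co_association` and `consensus_labels`, the 0.5 threshold, and the toy partitions are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same samples")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Merge samples whose co-association exceeds `threshold` (hypothetical rule)."""
    coassoc = co_association(partitions)
    n = len(coassoc)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Three toy clustering runs over six samples; cluster ids are arbitrary per run.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
print(consensus_labels(runs))  # prints [0, 0, 0, 1, 1, 1]
```

Because the co-association matrix only records whether two samples share a cluster, the arbitrary cluster ids of individual runs do not need to be aligned before they are combined.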

Corresponding author

Correspondence to Yasser Boutaleb.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Boutaleb, Y., Soladié, C., Duong, N.-D., Kacete, A., Royan, J., Séguier, R. (2022). Metric Learning-Based Unsupervised Domain Adaptation for 3D Skeleton Hand Activities Categorization. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_6

  • DOI: https://doi.org/10.1007/978-3-031-06433-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06432-6

  • Online ISBN: 978-3-031-06433-3

  • eBook Packages: Computer Science, Computer Science (R0)
