Abstract
First-person hand activity recognition plays a significant role in computer vision, with a variety of applications. Thanks to recent advances in depth sensors, several 3D skeleton-based hand activity recognition methods using supervised Deep Learning (DL) have been proposed and have proven effective when a large amount of labeled data is available. However, annotating such data remains difficult and costly, which motivates the use of unsupervised methods. In this paper, we propose a new approach based on unsupervised domain adaptation (UDA) for 3D skeleton hand activity clustering. It aims to exploit the knowledge derived from labeled samples of the source domain to categorize the unlabeled samples of the target domain. To this end, we introduce a novel metric learning-based loss function to learn a highly discriminative representation while preserving good activity recognition accuracy on the source domain. The learned representation is used as a low-level manifold to cluster unlabeled samples. In addition, to ensure the best clustering results, we propose a statistical and consensus-clustering-based strategy. The proposed approach is evaluated on the real-world FPHA data set.
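To make the idea of consensus clustering concrete, the following is a minimal sketch and not the strategy used in the paper, which the abstract does not detail. It assumes a generic co-association scheme: several clustering runs over the same samples are combined by counting how often each pair of samples lands in the same cluster, and pairs that agree in more than a chosen fraction of runs are merged. The function names `co_association` and `consensus_labels` and the `threshold` parameter are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose entry (i, j) is the fraction of
    partitions in which samples i and j share a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Merge samples whose co-association exceeds `threshold`
    (an assumed cut-off) and return one label per sample."""
    n = len(partitions[0])
    co = co_association(partitions)
    parent = list(range(n))  # union-find forest over samples

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical clustering runs over six samples.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    [0, 0, 1, 1, 2, 2],  # a noisier run
]
print(consensus_labels(runs))
```

Because the co-association matrix only records whether two samples share a cluster, the result does not depend on how each individual run names its clusters, which is why the second run above can use swapped labels.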
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Boutaleb, Y., Soladié, C., Duong, Nd., Kacete, A., Royan, J., Seguier, R. (2022). Metric Learning-Based Unsupervised Domain Adaptation for 3D Skeleton Hand Activities Categorization. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13233. Springer, Cham. https://doi.org/10.1007/978-3-031-06433-3_6
DOI: https://doi.org/10.1007/978-3-031-06433-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06432-6
Online ISBN: 978-3-031-06433-3
eBook Packages: Computer Science, Computer Science (R0)