Abstract
Person re-identification (Re-ID) is an essential computer vision task that retrieves a person of interest across multiple non-overlapping cameras. In recent years, video-based person Re-ID has attracted growing research interest. Compared with image-based person Re-ID, it can draw on additional information from multiple frames, such as temporal information. However, video-based person Re-ID still faces challenges such as occlusion, multiple people, and target changes. To address these issues, we propose a network that integrates person attribute features and scene attribute features with the person feature to assist person Re-ID. In our method, the person attribute and scene attribute features are re-weighted, so that the person attribute feature can be fully exploited in problematic cases where the person feature itself is difficult to extract. Moreover, a strip pooling operation is applied to the person Re-ID network. Strip pooling extracts horizontal and vertical contextual information separately, which enlarges the receptive field and improves person Re-ID accuracy. Extensive experiments on the MARS and DukeMTMC-VID datasets show that the proposed method achieves results competitive with state-of-the-art methods.
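To make the strip pooling idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single-channel feature map stored as a list of rows. The function name `strip_pool` is hypothetical, and the learned convolutions that a trained strip pooling module would apply to each strip are omitted. Each row is averaged over its full width (horizontal context), each column is averaged over its full height (vertical context), and the two are combined into a gate that re-weights every position.

```python
import math


def strip_pool(feature_map):
    """Gate a 2-D feature map (list of rows) with horizontal and vertical strip context.

    Simplified illustration: a real module would pass each strip descriptor
    through learned 1-D convolutions before combining them (omitted here).
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Horizontal strips: average each row over its full width -> one value per row.
    row_context = [sum(row) / width for row in feature_map]

    # Vertical strips: average each column over its full height -> one value per column.
    col_context = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]

    # Expand both descriptors back to H x W, sum them, squash to (0, 1)
    # with a sigmoid, and use the result as a multiplicative gate.
    gated = []
    for i in range(height):
        gated_row = []
        for j in range(width):
            gate = 1.0 / (1.0 + math.exp(-(row_context[i] + col_context[j])))
            gated_row.append(feature_map[i][j] * gate)
        gated.append(gated_row)
    return row_context, col_context, gated


if __name__ == "__main__":
    fmap = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0]]
    rows, cols, out = strip_pool(fmap)
    print(rows)  # [2.0, 5.0]
    print(cols)  # [2.5, 3.5, 4.5]
```

Because every output position depends on its entire row and its entire column, a single strip pooling step already spans the full height and width of the map, which is the sense in which the receptive field is enlarged.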
Acknowledgements
This work was partly supported by the National Natural Science Foundation of China (No. 61876158) and the Fundamental Research Funds for the Central Universities (2682021ZTPY030).
Ethics declarations
Conflicts of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Gong, X., Luo, B. Video-based person re-identification with scene and person attributes. Multimed Tools Appl 83, 8117–8128 (2024). https://doi.org/10.1007/s11042-023-15719-w
DOI: https://doi.org/10.1007/s11042-023-15719-w