Abstract
In some real-world visual recognition tasks, instances are generated according to certain standards, and these standards should serve as references during recognition. In this paper, we propose a template-centric representation learning (TCRL) framework that uses these standards as templates during recognition. The TCRL framework aims to learn a feature space in which each instance lies close to its own template and far from the other templates. Within the TCRL framework, we propose a template-centric objective function and a template-centric LDA layer, which yield two concrete models, TDCNN and TDLDA. Experiments show that our method outperforms traditional classification methods. The code will be made public after acceptance.
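The template-centric idea stated in the abstract can be illustrated with a minimal sketch. This is not the paper's objective function, TDCNN, or TDLDA, whose definitions are not given here. It is a generic pull/push loss written under stated assumptions: features and templates are already embedded as plain vectors, distance is squared Euclidean, and the push term is a hinge with a hypothetical `margin` parameter. All function names are illustrative.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def template_centric_loss(features, labels, templates, margin=1.0):
    """Hypothetical pull/push loss (an assumption, not the paper's formula).

    features:  list of instance embeddings
    labels:    list of class indices, one per instance
    templates: list of template embeddings, one per class
    margin:    assumed hinge margin for the push term
    """
    total = 0.0
    for feature, label in zip(features, labels):
        own = sq_dist(feature, templates[label])
        # Pull term: draw the instance toward its own template.
        total += own
        # Push term: penalise any other template that is not at least
        # `margin` farther away than the instance's own template.
        for k, template in enumerate(templates):
            if k != label:
                total += max(0.0, margin + own - sq_dist(feature, template))
    return total / len(features)


def nearest_template(feature, templates):
    """Predict the class whose template is closest in the feature space."""
    return min(range(len(templates)), key=lambda k: sq_dist(feature, templates[k]))


# Toy data: two templates and one instance near each of them.
templates = [[0.0, 0.0], [10.0, 0.0]]
features = [[0.1, 0.0], [9.9, 0.0]]

loss_correct = template_centric_loss(features, [0, 1], templates)
loss_swapped = template_centric_loss(features, [1, 0], templates)
predictions = [nearest_template(f, templates) for f in features]
```

In this toy case the loss is small when each instance is labelled with the template it sits next to and much larger when the labels are swapped, which is the behaviour the abstract describes. Recognition then reduces to picking the nearest template.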
![Figure 1](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig1_HTML.png)
![Figure 2](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig2_HTML.png)
![Figure 3](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig3_HTML.png)
![Figure 4](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig4_HTML.png)
![Figure 5](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig5_HTML.png)
![Figure 6](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig6_HTML.png)
![Figure 7](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig7_HTML.png)
![Figure 8](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig8_HTML.png)
![Figure 9](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig9_HTML.png)
![Figure 10](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-024-19589-8/MediaObjects/11042_2024_19589_Fig10_HTML.png)
Data Availability
All data included in this study are available from the corresponding author upon request.
Acknowledgements
This work was supported in part by the Fundamental Research Funds for the Central Universities under Grant B230201025, and in part by the Key Research and Development Program of Changzhou (Social Development) under Grant CE20225042.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Chai, Z., Wang, L., Shi, H. et al. Template-centric deep linear discriminant analysis for visual representation. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19589-8
DOI: https://doi.org/10.1007/s11042-024-19589-8