Abstract
Text localization and detection in natural scene images have attracted significant research interest because of their inherent complexity and wide range of real-life applications. Over the last few decades, various methodologies have been developed for localizing and detecting text regions in wild scenes. Among them, techniques based on Maximally Stable Extremal Regions (MSER) have achieved remarkable success across a wide variety of text localization tasks over the last decade. MSER is a well-known blob detection method that has been applied, with some modifications, in many scene text studies. In this paper, we review and evaluate MSER methods combined either with traditional machine learning methods that use hand-crafted features or with deep learning methods that learn features automatically for scene text localization. The MSER variants described in this study include standard MSER, MSER with stroke width transform, eMSER, enhanced MSER, multi-level MSER, MSER with CNN features, component splitting with MSER tree, MSER with CNN and CRF, and CE-MSER. Finally, we compare and evaluate the performance of these MSER methods on five publicly available standard scene text datasets (ICDAR 2003, ICDAR 2013, ICDAR 2015, KAIST, and SVT) and provide insights into selecting an appropriate MSER method, along with the pros and cons of each.
Data Availability
In the present work, we have used five datasets: (a) SVT [76], (b) ICDAR2003 [47], (c) ICDAR2013 [31], (d) ICDAR2015 [32], and (e) KAIST [28]. The datasets are publicly available at https://github.com/HCIILAB/Scene-Text-Recognition and https://github.com/jianyuheng/myOCR.
References
Ajay B, Naveena C (2019) A mechanism for detection of text in images using dwt and mser. Integr Intell Comput Commun Secur 669–676. https://doi.org/10.1007/978-981-10-8797-4_68
Akoushideh A, Rasoulnejad SMF, Shahbahrami A (2022) Text localization in digital images using a hybrid method. Multimed Tools Appl 81(23):34047–34066. https://doi.org/10.1007/s11042-022-13179-2
Ali H (2022) Leveraging machine learning for less developed languages: progress on urdu text detection. arXiv:2209.14022
Awoke A, Tekeba M (2021) Ethiopic and latin multilingual text detection from images using hybrid techniques. Zede J 39(1):71–80
Baek Y, Lee B, Han D et al (2019) Character region awareness for text detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9365–9374. https://doi.org/10.1109/CVPR.2019.00959
Bartz C, Yang H, Meinel C (2018) See: towards semi-supervised end-to-end scene text recognition. In: Proceedings of the AAAI conference on artificial intelligence. https://doi.org/10.1609/aaai.v32i1.12242
Busta M, Neumann L, Matas J (2017) Deep textspotter: an end-to-end trainable scene text localization and recognition framework. In: Proceedings of the IEEE international conference on computer vision, pp 2204–2212. https://doi.org/10.1109/ICCV.2017.242
Chaitra Y, Dinesh R (2022) An impact of radon transforms and filtering techniques for text localization in natural scene text images. In: ICT with intelligent applications: proceedings of ICTIS 2021. Springer, vol 1, pp 563–573. https://doi.org/10.1007/978-981-16-4177-0_55
Chen H, Tsai SS, Schroth G et al (2011) Robust text detection in natural images with edge-enhanced maximally stable extremal regions. In: 2011 18th IEEE international conference on image processing. IEEE, pp 2609–2612. https://doi.org/10.1109/ICIP.2011.6116200
Chen X, Jin L, Zhu Y et al (2021) Text recognition in the wild: a survey. ACM, New York, vol 54, pp 1–35. https://doi.org/10.1145/3440756
Cho H, Sung M, Jun B (2016) Canny text detector: fast and robust scene text localization algorithm. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3566–3573
Choudhary S, Singh NK, Chichadwani S (2018) Text detection and recognition from scene images using mser and cnn. In: 2018 second international conference on advances in electronics, computers and communications (ICAECC). IEEE, pp 1–4. https://doi.org/10.1109/ICAECC.2018.8479419
Cormen TH, Leiserson CE, Rivest RL et al (2009) Introduction to algorithms. MIT Press. https://doi.org/10.2307/2583667
Das S, Chattopadhyay S, Prasad R et al (2022) Text region identification from natural scene images using semi-supervised mser method. In: Proceedings of 2nd international conference on mathematical modeling and computational science: ICMMCS 2021. Springer, pp 401–408. https://doi.org/10.1007/978-981-19-0182-9_40
Diaz-Escobar J, Kober V (2020) Natural scene text detection and segmentation using phase-based regions and character retrieval. Math Probl Eng 2020:1–17. https://doi.org/10.1155/2020/7067251
El Abbadi NK et al (2023) Scene text detection and recognition by using multi-level features extractions based on you only once version five (yolov5) and maximally stable extremal regions (msers) with optical character recognition (ocr). Al-Salam J Eng Technol 2(1):13–27. https://doi.org/10.55145/ajest.2023.01.01.002
Ghosh J, Talukdar AK, Sarma KK (2023) A light-weight natural scene text detection and recognition system. Multimed Tools Appl 1–33. https://doi.org/10.1007/s11042-023-15696-0
Goud DS, Vigneshwari M, Aparna P et al (2022) Text localization and recognition from natural scene images using ai. In: 2022 International conference on automation, computing and renewable systems (ICACRS). IEEE, pp 1153–1158. https://doi.org/10.1109/ICACRS55517.2022.10029220
Gupta N, Jalal AS (2019) A robust model for salient text detection in natural scene images using mser feature detector and grabcut. Multimed Tools Appl 8:10821–10835. https://doi.org/10.1007/s11042-018-6613-1
He K, Sun J, Tang X (2012) Guided image filtering. IEEE Trans Pattern Anal Mach Intell 35(6):1397–1409. https://doi.org/10.1109/TPAMI.2012.213
He M, Liao M, Yang Z et al (2021) Most: a multi-oriented scene text detector with localization refinement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8813–8822. https://doi.org/10.1109/CVPR46437.2021.00870
He T, Huang W, Qiao Y et al (2016) Text-attentional convolutional neural network for scene text detection. IEEE Trans Image Process 25(6):2529–2541. https://doi.org/10.1109/TIP.2016.2547588
He W, Zhang XY, Yin F et al (2017) Deep direct regression for multi-oriented scene text detection. In: Proceedings of the IEEE international conference on computer vision, pp 745–753. https://doi.org/10.1109/ICCV.2017.87
Huang W, Qiao Y, Tang X (2014) Robust scene text detection with convolution neural network induced mser trees. In: European conference on computer vision. Springer, pp 497–511. https://doi.org/10.1007/978-3-319-10593-2_33
Islam MR, Mondal C, Azam MK et al (2016) Text detection and recognition using enhanced mser detection and a novel ocr technique. In: 2016 5th international conference on informatics, electronics and vision (ICIEV). IEEE, pp 15–20. https://doi.org/10.1109/ICIEV.2016.7760054
Islam R, Islam R, Talukder KH (2020) An enhanced mser pruning algorithm for detection and localization of bangla texts from scene images. Int Arab J Inf Technol 17(3):375–385. https://doi.org/10.34028/iajit/17/3/11
Jiang Y, Zhu X, Wang X et al (2017) R2cnn: rotational region cnn for orientation robust scene text detection. arXiv:1706.09579
Jung J, Lee S, Cho MS et al (2011) Touch tt: scene text extractor using touchscreen interface. ETRI J 33(1):78–88. https://doi.org/10.4218/etrij.11.1510.0029
Jung K, Kim KI, Jain AK (2004) Text information extraction in images and video: a survey. Pattern Recognit 37(5):977–997. https://doi.org/10.1016/j.patcog.2003.10.012
Karaoglu S, Fernando B, Trémeau A (2010) A novel algorithm for text detection and localization in natural scene images. In: 2010 international conference on digital image computing: techniques and applications. IEEE, pp 635–642. https://doi.org/10.1109/DICTA.2010.115
Karatzas D, Shafait F, Uchida S et al (2013) Icdar 2013 robust reading competition. In: 2013 12th international conference on document analysis and recognition. IEEE, pp 1484–1493. https://doi.org/10.1109/ICDAR.2013.221
Karatzas D, Gomez-Bigorda L, Nicolaou A et al (2015) Icdar 2015 competition on robust reading. In: 2015 13th international conference on document analysis and recognition (ICDAR). IEEE, pp 1156–1160. https://doi.org/10.1109/ICDAR.2015.7333942
Khare V, Shivakumara P, Raveendran P et al (2016) A blind deconvolution model for scene text detection and recognition in video. Pattern Recognit 54:128–148. https://doi.org/10.1016/j.patcog.2016.01.008
L1kw1d (2013) Borndigitaltext. Accessed 18 June 2023
Larbi G (2023) Two-step text detection framework in natural scenes based on pseudo-zernike moments and cnn. Multimed Tools Appl 82(7):10595–10616. https://doi.org/10.1007/s11042-022-13690-6
Lee CY, Baek Y, Lee H (2019) Tedeval: a fair evaluation metric for scene text detectors. In: 2019 international conference on document analysis and recognition workshops (ICDARW). IEEE, pp 14–17. https://doi.org/10.1109/icdarw.2019.60125
Li R, Chen S, Zhao F et al (2023) Text detection model for historical documents using cnn and mser. J Database Manag (JDM) 34(1):1–23. https://doi.org/10.4018/jdm.322086
Li Y, Lu H (2012) Scene text detection via stroke width. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012). IEEE, pp 681–684
Li Y, Shen C, Jia W et al (2013) Leveraging surrounding context for scene text detection. In: 2013 IEEE international conference on image processing. IEEE, pp 2264–2268. https://doi.org/10.1109/ICIP.2013.6738467
Li Y, Jia W, Shen C et al (2014) Characterness: an indicator of text in the wild. IEEE Trans Image Process 23(4):1666–1677. https://doi.org/10.1109/TIP.2014.2302896
Liang J, Doermann D, Li H (2005) Camera-based analysis of text and documents: a survey. Int J Doc Anal Recognit (IJDAR) 7(2):84–104. https://doi.org/10.1007/s10032-004-0138-z
Liao M, Shi B, Bai X et al (2017) Textboxes: a fast text detector with a single deep neural network. In: Proceedings of the AAAI conference on artificial intelligence. https://doi.org/10.1609/aaai.v31i1.11196
Liao M, Shi B, Bai X (2018) Textboxes++: a single-shot oriented scene text detector. IEEE Transac Image Process 27(8):3676–3690. https://doi.org/10.1109/TIP.2018.2825107
Liu Y, Jin L (2017) Deep matching prior network: toward tighter multi-oriented text detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1962–1969. https://doi.org/10.1109/CVPR.2017.368
Liu Y, Jin L, Xie Z et al (2019) Tightness-aware evaluation protocol for scene text detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9612–9620. https://doi.org/10.1109/CVPR.2019.00984
Long S, He X, Yao C (2021) Scene text detection and recognition: the deep learning era. Int J Comput Vis 129(1):161–184. https://doi.org/10.1007/s11263-020-01369-0
Lucas SM, Panaretos A, Sosa L et al (2005) Icdar 2003 robust reading competitions: entries, results, and future directions. Int J Doc Anal Recognit (IJDAR) 7(2–3):105–122. https://doi.org/10.1007/s10032-004-0134-3
Lundgren A, Castro D, Lima E et al (2019) Octshufflemlt: a compact octave based neural network for end-to-end multilingual text detection and recognition. In: 2019 international conference on document analysis and recognition workshops (ICDARW). IEEE, pp 37–42. https://doi.org/10.1109/ICDARW.2019.30062
Lyu P, Liao M, Yao C et al (2018a) Mask textspotter: an end-to-end trainable neural network for spotting text with arbitrary shapes. In: Proceedings of the European conference on computer vision (ECCV), pp 67–83. https://doi.org/10.1109/TPAMI.2019.2937086
Lyu P, Yao C, Wu W et al (2018b) Multi-oriented scene text detection via corner localization and region segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7553–7563. https://doi.org/10.1109/CVPR.2018.00788
Ma J, Shao W, Ye H et al (2018) Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimed 20(11):3111–3122. https://doi.org/10.1109/TMM.2018.2818020
Mansouri S, Zrigui S, Zrigui M et al (2021) Text detection in Arabic news video based on mser and retinanet. In: 2021 IEEE/ACS 18th international conference on computer systems and applications (AICCSA). IEEE, pp 1–7. https://doi.org/10.1109/AICCSA53542.2021.9686930
Matas J, Chum O, Urban M et al (2004) Robust wide-baseline stereo from maximally stable extremal regions. Image Vis Comput 22(10):761–767. https://doi.org/10.1016/j.imavis.2004.02.006
Meetei LS, Singh TD, Bandyopadhyay S (2019) Extraction and identification of manipuri and mizo texts from scene and document images. In: International conference on pattern recognition and machine intelligence. Springer, pp 405–414
Naiemi F, Ghods V, Khalesi H (2020) Scene text detection using enhanced extremal region and convolutional neural network. Multimed Tools Appl 79:27137–27159. https://doi.org/10.1007/s11042-020-09318-2
Naiemi F, Ghods V, Khalesi H (2021) A novel pipeline framework for multi oriented scene text image detection and recognition. Expert Syst Appl 170:114549. https://doi.org/10.1016/j.eswa.2020.114549
Nayef N, Yin F, Bizid I et al (2017) Icdar2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR). IEEE, pp 1454–1459. https://doi.org/10.1109/ICDAR.2017.237
Nayef N, Patel Y, Busta M et al (2019) Icdar2019 robust reading challenge on multi-lingual scene text detection and recognition-rrc-mlt-2019. In: 2019 International conference on document analysis and recognition (ICDAR). IEEE, pp 1582–1587. https://doi.org/10.1109/ICDAR.2019.00254
Panda S, Ash S, Chakraborty N et al (2020) Parameter tuning in mser for text localization in multi-lingual camera-captured scene text images. In: Computational intelligence in pattern recognition: proceedings of CIPR 2019. Springer, pp 999–1009. https://doi.org/10.1007/978-981-13-9042-5_86
Qin L, Shivakumara P, Lu T et al (2016) Video scene text frames categorization for text detection and recognition. In: 2016 23rd international conference on pattern recognition (ICPR). IEEE, pp 3886–3891. https://doi.org/10.1109/ICPR.2016.7900241
Shahab A, Shafait F, Dengel A (2011) Icdar 2011 robust reading competition challenge 2: Reading text in scene images. In: 2011 international conference on document analysis and recognition. IEEE, pp 1491–1496
Shi C, Wang C, Xiao B et al (2014) End-to-end scene text recognition using tree-structured models. Pattern Recognit 47(9):2853–2866. https://doi.org/10.1016/j.patcog.2014.03.023
Shivakumara P, Phan TQ, Tan CL (2010) A laplacian approach to multi-oriented text detection in video. IEEE Trans Pattern Anal Mach Intell 33(2):412–419. https://doi.org/10.1109/TPAMI.2010.166
Shivakumara P, Sreedhar RP, Phan TQ et al (2012) Multioriented video scene text detection through bayesian classification and boundary growing. IEEE Trans Circ Syst Vid Technol 22(8):1227–1235. https://doi.org/10.1109/TCSVT.2012.2198129
Soni R, Kumar B, Chand S (2017) Text detection and localization in natural scene images using mser and fast guided filter. In: 2017 fourth international conference on image information processing (ICIIP). IEEE, pp 1–6. https://doi.org/10.1109/ICIIP.2017.8313739
Tabassum A, Dhondse SA (2015) Text detection using mser and stroke width transform. In: 2015 fifth international conference on communication systems and network technologies. IEEE, pp 568–571. https://doi.org/10.1109/CSNT.2015.154
Thilagavathy A, Chilambuchelvan A (2019) Fuzzy based edge enhanced text detection algorithm using mser. Clust Comput 22(5):11681–11687. https://doi.org/10.1007/s10586-017-1448-5
Tian S, Lu S, Su B et al (2014) Scene text segmentation with multi-level maximally stable extremal regions. In: 2014 22nd international conference on pattern recognition. IEEE, pp 2703–2708. https://doi.org/10.1109/ICPR.2014.467
Tian S, Pan Y, Huang C et al (2015) Text flow: a unified text detection system in natural scene images. In: Proceedings of the IEEE international conference on computer vision, pp 4651–4659. https://doi.org/10.1109/ICCV.2015.528
Tong G, Dong M, Sun X et al (2022) Natural scene text detection and recognition based on saturation-incorporated multi-channel mser. Knowl-Based Syst 250:109040. https://doi.org/10.1016/j.knosys.2022.109040
Turki H, Halima MB, Alimi AM (2017a) A hybrid method of natural scene text detection using msers masks in hsv space color. In: Ninth international conference on machine vision (ICMV 2016). International Society for Optics and Photonics, p 1034111. https://doi.org/10.1117/12.2268993
Turki H, Halima MB, Alimi AM (2017b) Text detection based on mser and cnn features. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR). IEEE, pp 949–954. https://doi.org/10.1109/ICDAR.2017.159
Vishnoitanuj (2020) Handwritten-text. Accessed 18 June 2023
Wan Y, Wang X, Lu D (2019) Research on key technology of Chinese text localization in natural scenes. In: Recent developments in intelligent computing, communication and devices: proceedings of ICCD 2017. Springer, pp 387–397. https://doi.org/10.1007/978-981-10-8944-2_45
Wang K, Babenko B, Belongie S (2011) End-to-end scene text recognition. In: 2011 international conference on computer vision. IEEE, pp 1457–1464. https://doi.org/10.1109/ICCV.2011.6126402
Wang T, Wu DJ, Coates A et al (2012) End-to-end text recognition with convolutional neural networks. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012). IEEE, pp 3304–3308
Wang X, Song Y, Zhang Y et al (2017) A hierarchical recursive method for text detection in natural scene images. Multimed Tools Appl 76:26201–26223. https://doi.org/10.1007/s11042-016-4099-2
Wang Y, Shi C, Xiao B et al (2018) Crf based text detection for natural scene images using convolutional neural network and context information. Neurocomputing 295:46–58. https://doi.org/10.1016/j.neucom.2017.12.058
Wang Y, Xie H, Zha ZJ et al (2020) Contournet: taking a further step toward accurate arbitrary-shaped scene text detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11753–11762. https://doi.org/10.1109/CVPR42600.2020.01177
Wolf C, Jolion JM (2006) Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int J Doc Anal Recognit (IJDAR) 8(4):280–296. https://doi.org/10.1007/s10032-006-0014-0
Xie E, Zang Y, Shao S et al (2019) Scene text detection with supervised pyramid context network. In: Proceedings of the AAAI conference on artificial intelligence, pp 9038–9045. https://doi.org/10.1609/aaai.v33i01.33019038
Yao C, Bai X, Liu W et al (2012) Detecting texts of arbitrary orientations in natural images. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, pp 1083–1090. https://doi.org/10.1109/CVPR.2012.6247787
Yao C, Bai X, Sang N et al (2016) Scene text detection via holistic, multi-channel prediction. arXiv:1606.09002
Ye Q, Doermann D (2014) Text detection and recognition in imagery: a survey. IEEE Trans Pattern Anal Mach Intell 37(7):1480–1500. https://doi.org/10.1109/TPAMI.2014.2366765
Yin XC, Zuo ZY, Tian S et al (2016) Text detection, tracking and recognition in video: a comprehensive survey. IEEE Trans Image Process 25(6):2752–2773. https://doi.org/10.1109/TIP.2016.2554321
Zhan F, Xue C, Lu S (2019) Ga-dan: geometry-aware domain adaptation network for scene text detection and recognition. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9105–9115. https://doi.org/10.1109/ICCV.2019.00920
Zhang SX, Zhu X, Hou JB et al (2020) Deep relational reasoning graph network for arbitrary shape text detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9699–9708. https://doi.org/10.1109/CVPR42600.2020.00972
Zhang X, Gao X, Tian C (2018) Text detection in natural scene images based on color prior guided mser. Neurocomputing 307:61–71. https://doi.org/10.1016/j.neucom.2018.03.070
Zhang Y, Huang Y, Zhao D et al (2021) A scene text detector based on deep feature merging. Multimed Tools Appl 80(19):29005–29016. https://doi.org/10.1007/s11042-021-11101-w
Zhang Z, Shen W, Yao C et al (2015) Symmetry-based text line detection in natural scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2558–2567. https://doi.org/10.1109/CVPR.2015.7298871
Zhou X, Yao C, Wen H et al (2017) East: an efficient and accurate scene text detector. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5551–5560. https://doi.org/10.1109/CVPR.2017.283
Zhu W, Lou J, Chen L et al (2017) Scene text detection via extremal region based double threshold convolutional network classification. PloS One 12(8):e0182227. https://doi.org/10.1371/journal.pone.0182227
Zhu Y, Yao C, Bai X (2016) Scene text detection and recognition: recent advances and future trends. Front Comput Sci 10(1):19–36. https://doi.org/10.1007/s11704-015-4488-0
Zuo LQ, Sun HM, Mao QC et al (2019) Natural scene text recognition based on encoder-decoder framework. IEEE Access 7:62616–62623. https://doi.org/10.1109/ACCESS.2019.2916616
Acknowledgements
This work is partially supported by SERB (DST), Govt. of India (Ref. no. EEQ/2018/000-963) and done at CMATER Laboratory, Dept. of CSE, Jadavpur University, Kolkata.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest. The authors also declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Dutta, K., Sarkhel, R., Kundu, M. et al. Natural scene text localization and detection using MSER and its variants: a comprehensive survey. Multimed Tools Appl 83, 55773–55810 (2024). https://doi.org/10.1007/s11042-023-17671-1
DOI: https://doi.org/10.1007/s11042-023-17671-1