Text localization in digital images using a hybrid method

Akoushideh, Alireza; Rasoulnejad, Sayed Mohammad Fallah; Shahbahrami, Asadollah

doi:10.1007/s11042-022-13179-2

Published: 21 April 2022

Volume 81, pages 34047–34066, (2022)

Multimedia Tools and Applications

Alireza Akoushideh ORCID: orcid.org/0000-0001-9958-4613¹,
Sayed Mohammad Fallah Rasoulnejad² &
Asadollah Shahbahrami³

Abstract

Text localization in digital images has many applications, such as routing in autonomous driving, postal services, container identification in ports, and robotic navigation in urban environments. Methods such as Stroke Width Transform (SWT), Local Adaptive Thresholding (LAT), and Maximally Stable Extremal Regions (MSER) have been widely studied. Each of these methods addresses specific challenges such as text orientation, font variation, size, and non-uniform illumination. Deep learning methods achieve high accuracy but are not optimal in terms of computation time and the amount of training data they require. Our research uses the SWT, LAT, and MSER methods as a pre-processing step, with some improvement techniques, to deal with these challenges simultaneously and improve the text localization result. The proposed approach uses Maximally Stable Extremal Regions, multilevel binarization, and non-maximum suppression. It achieves reasonable results compared with non-deep state-of-the-art algorithms and deep learning-based methods on the ICDAR 2013 dataset: recall, precision, and f-measure are 0.7442, 0.9116, and 0.8195, respectively.
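
The three scores reported in the abstract can be related with a short calculation. The sketch below assumes the f-measure is the usual harmonic mean of precision and recall; the abstract does not state the formula, and the ICDAR 2013 evaluation protocol may aggregate scores differently, so this is only a consistency check. The function name `f_measure` is illustrative and not taken from the article.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the ICDAR 2013 dataset.
recall = 0.7442
precision = 0.9116

f = f_measure(precision, recall)
print(f"f-measure from reported precision and recall: {f:.4f}")
```

Under this assumption the computed value falls within 0.001 of the reported 0.8195, so the small difference is consistent with rounding of the published precision and recall.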

Funding

The author(s) received no financial support for this article’s research, authorship, and publication.

Author information

Authors and Affiliations

Department of Electrical Engineering, Faculty of Shahid-Chamran, Technical and Vocational University (TVU), Guilan branch, Rasht, Iran
Alireza Akoushideh
Department of Computer Science, Lahijan Islamic Azad University, Lahijan, Iran
Sayed Mohammad Fallah Rasoulnejad
Department of Computer Engineering, Faculty of Engineering, University of Guilan, Rasht, Iran
Asadollah Shahbahrami

Corresponding author

Correspondence to Alireza Akoushideh.

Ethics declarations

Conflict of interest

The author(s) declared no potential conflicts of interest concerning this article’s research, authorship, and publication.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Akoushideh, A., Rasoulnejad, S.M.F. & Shahbahrami, A. Text localization in digital images using a hybrid method. Multimed Tools Appl 81, 34047–34066 (2022). https://doi.org/10.1007/s11042-022-13179-2

Received: 09 July 2021
Revised: 03 March 2022
Accepted: 03 April 2022
Published: 21 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11042-022-13179-2
