Text localization in digital images using a hybrid method

Multimedia Tools and Applications

Abstract

Text localization in digital images has many applications, such as routing in autonomous driving, postal services, container identification in ports, and robotic navigation in urban environments. Methods such as Stroke Width Transform (SWT), Local Adaptive Thresholding (LAT), and Maximally Stable Extremal Regions (MSER) have been widely studied. Each of these methods addresses specific challenges, such as text orientation, font variation, text size, and non-uniform illumination. Deep learning methods achieve high accuracy, but they are not optimal in terms of computation time and the amount of data required for training. Our research uses the SWT, LAT, and MSER methods as a pre-processing step, together with several improvement techniques, to deal with the above challenges simultaneously and improve the text localization result. The proposed approach uses Maximally Stable Extremal Regions, multilevel binarization, and non-maximum suppression. It achieves reasonable results compared with non-deep state-of-the-art algorithms and deep learning-based methods on the ICDAR 2013 dataset: recall, precision, and f-measure are 0.7442, 0.9116, and 0.8195, respectively.
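
As a rough illustration of two ideas named in the abstract, the sketch below shows a generic greedy non-maximum suppression over axis-aligned boxes and the f-measure computed as the harmonic mean of precision and recall. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, the candidate scores, and the IoU threshold of 0.5 are all assumptions made for this example, and the abstract does not state how the reported f-measure was derived.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2), with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the article
) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up candidate text boxes: the first two overlap heavily.
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(non_max_suppression(candidates, confidences))
    # Precision and recall as reported in the abstract.
    print(round(f_measure(0.9116, 0.7442), 4))
```

Under the harmonic-mean assumption, the reported precision and recall give an f-measure that agrees with the reported 0.8195 to within rounding of the inputs.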

Funding

The author(s) received no financial support for this article’s research, authorship, and publication.

Author information

Corresponding author

Correspondence to Alireza Akoushideh.

Ethics declarations

Conflict of interest

The author(s) declared no potential conflicts of interest concerning this article’s research, authorship, and publication.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Akoushideh, A., Rasoulnejad, S.M.F. & Shahbahrami, A. Text localization in digital images using a hybrid method. Multimed Tools Appl 81, 34047–34066 (2022). https://doi.org/10.1007/s11042-022-13179-2

  • DOI: https://doi.org/10.1007/s11042-022-13179-2
