Text recuperated using ontology with stable marriage optimization technique and text visualization using AR

Multimedia Tools and Applications

Abstract

Written text is a cornerstone of daily communication. However, letters are often obscured, blurred, erased, or obstructed, which can lead to misreading and unintended meanings. In this study, we present a three-phase solution to this problem. In the first phase, we apply a Deep Learning-based text detection and recognition method that uses the Text-Block technique to locate blocks of text. In the second phase, we combine a database with an ontology to reconstruct unclear words. In the final phase, we render the recovered word as a 3D object in Augmented Reality using the Vuforia engine. This visualization helps visually impaired individuals who would otherwise misread the word. To validate our approach, we compared our text detection and recognition methods against state-of-the-art techniques and obtained the highest precision among them. We also administered a questionnaire to a group of visually impaired participants, evaluating the solution on user experience, satisfaction, efficiency, and effectiveness. The survey results indicate that the proposed method is effective and of high quality.
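The abstract does not say how the stable marriage technique named in the title is applied to word reconstruction. As a minimal sketch only, the following assumes that each degraded word is paired with one vocabulary entry by Gale-Shapley matching, with preferences on both sides given by a similarity score. The function names, the use of `_` for unreadable characters, and the character-level score from Python's `difflib` (standing in for an ontology-based score) are all illustrative assumptions, not the authors' implementation.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for an ontology-based score."""
    return SequenceMatcher(None, a, b).ratio()


def stable_match(damaged: list[str], vocabulary: list[str]) -> dict[str, str]:
    """Gale-Shapley matching: damaged words propose to vocabulary entries.

    Assumes the damaged words are distinct. Returns a mapping from each
    matched damaged word to the vocabulary entry it ends up holding.
    """
    # Each damaged word ranks every vocabulary entry, most similar first.
    prefs = {d: sorted(vocabulary, key=lambda v: -similarity(d, v)) for d in damaged}
    next_choice = {d: 0 for d in damaged}  # index of the next entry to propose to
    holder: dict[str, str] = {}            # vocabulary entry -> damaged word it holds
    free = list(damaged)

    while free:
        d = free.pop()
        if next_choice[d] >= len(vocabulary):
            continue  # every candidate has rejected this word; it stays unmatched
        v = prefs[d][next_choice[d]]
        next_choice[d] += 1
        rival = holder.get(v)
        if rival is None:
            holder[v] = d                  # entry was free: accept the proposal
        elif similarity(d, v) > similarity(rival, v):
            holder[v] = d                  # entry prefers the new proposer
            free.append(rival)
        else:
            free.append(d)                 # rejected: try the next entry later
    return {d: v for v, d in holder.items()}


if __name__ == "__main__":
    # Hypothetical inputs: "_" marks a character the recognizer could not read.
    print(stable_match(["h_spital", "sch__l"], ["hospital", "school", "hotel"]))
```

With these hypothetical inputs, each degraded word is paired with its closest entry (`hospital` and `school`). One-to-one matching is a design choice of this sketch: it stops two degraded words in the same text block from both collapsing onto the same candidate.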

Acknowledgements

This research project was funded by the Deanship of Scientific Research, Princess Nourah bint Abdulrahman University, through the Program of Research Project Funding After Publication, grant No. (43-PRFA-P-50).

Author information

Contributions

All authors developed the system and wrote and reviewed the manuscript. Imene OUALI: Coding, Writing the original draft, Performing experiments, Analyzing the data, Software, Visualization, Data curation, Editing. Mohamed BEN HALIMA: Supervision, Methodology, Validation, Conception and design, Investigation, Formal analysis, and Drafting of the manuscript. Nesrine MASMOUDI: Review, Investigation, Resources, Funding, and QM commented on subsequent versions of the manuscript. Manel AYADI: Review, Investigation, Material preparation, Funding, and QM commented on subsequent versions of the manuscript. Latifa ALMUQREN: Review, Resources, Material preparation, Funding, and QM commented on subsequent versions of the manuscript. Ali WALI: Supervision, Validation, Investigation, Project administration, Conceptualization, Contributing to the study, and Designing the study. All authors have read and approved the final submitted manuscript.

Corresponding author

Correspondence to Imene Ouali.

Ethics declarations

Ethics approval

Not applicable

Research involving human and animal participants

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ouali, I., Ben Halima, M., Masmoudi, N. et al. Text recuperated using ontology with stable marriage optimization technique and text visualization using AR. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18795-8

  • DOI: https://doi.org/10.1007/s11042-024-18795-8
