A novel multi-task learning technique for offline handwritten short answer spotting and recognition

Multimedia Tools and Applications

Abstract

Off-line examinations are still used in many parts of the world because they are more economical to conduct than computer-based ones. Assessing these handwritten exam papers automatically is a complex challenge, since the highest possible accuracy is always desirable. Factors such as the attributes of the handwritten images, the large number of classes, ambiguous word boundaries in languages such as Arabic, and the significant intra-class variation in handwriting make word recognition and word spotting persistently difficult. To address these problems, this research proposes a novel joint learning technique for word spotting and word recognition in a multi-task learning setting. A multi-task convolutional neural network was employed to realise the proposed concept. Word spotting was treated as a regression task, and word recognition was the second task. A typical Faster R-CNN backbone was employed with a Region of Interest (RoI) pooling layer, followed by two consecutive fully connected layers for the word spotting and recognition tasks. The experimental results are encouraging and demonstrate that the proposed technique significantly improves the accuracy of short-answer assessment systems. As a result, it can be implemented in such systems to improve both their efficiency and accuracy.
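
The joint setting described in the abstract, with spotting as a regression task and recognition as a second task, can be illustrated with a minimal sketch of a combined training objective. The abstract does not state which loss functions or task weighting were used, so the sketch assumes the pairing common in Faster R-CNN style detectors: a smooth L1 loss on box coordinates and a softmax cross-entropy on word classes. The function names, the `spot_weight` parameter, and the example values are all hypothetical.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss averaged over the box coordinates (assumed spotting loss)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one word-class label (assumed recognition loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(box_pred, box_true, class_logits, class_true, spot_weight=1.0):
    """Weighted sum of the spotting (regression) and recognition losses."""
    spotting = smooth_l1(box_pred, box_true)
    recognition = cross_entropy(class_logits, class_true)
    return spot_weight * spotting + recognition


# Hypothetical outputs of the two heads for a single region of interest.
loss = joint_loss(
    box_pred=[0.10, 0.20, 0.55, 0.40],  # predicted (x1, y1, x2, y2), normalised
    box_true=[0.10, 0.25, 0.50, 0.40],  # ground-truth box
    class_logits=[2.0, 0.5, -1.0],      # scores over a 3-word vocabulary
    class_true=0,                       # index of the correct word
)
print(round(loss, 4))
```

Because both terms are computed from the same region features, minimising their sum trains the shared backbone on both tasks at once. In this sketch `spot_weight` controls how much the localisation error contributes relative to the recognition error.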

Data Availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Author information

Corresponding author

Correspondence to Abhijit Das.

Ethics declarations

Competing interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Das, A., Suwanwiwat, H. & Pal, U. A novel multi-task learning technique for offline handwritten short answer spotting and recognition. Multimed Tools Appl 83, 53441–53465 (2024). https://doi.org/10.1007/s11042-023-17606-w

  • DOI: https://doi.org/10.1007/s11042-023-17606-w
