Abstract
Offline examinations are still used in many parts of the world because they are more economical to conduct than computer-based ones. Assessing these handwritten exam papers automatically is a complex challenge, since the highest possible accuracy is always desirable. The attributes of handwritten images, the large number of classes, word-boundary problems in languages such as Arabic, and the significant intra-class variation in handwriting all contribute to the enduring difficulty of word recognition and word spotting. To address these problems, this research proposed a novel joint learning technique for word spotting and word recognition in a multi-task learning setting. A multi-task convolutional neural network was employed to realise the proposed concept. Word spotting was treated as a regression task, and word recognition was the second task. The typical Faster-RCNN backbone was employed with the Region of Interest (RoI) pooling layer, which was followed by two consecutive fully connected layers for the word spotting and word recognition tasks. The experimental results are encouraging and indicate that the proposed technique significantly improves the accuracy of short-answer assessment systems. The proposed technique can therefore be implemented in short-answer assessment systems to improve both their efficiency and accuracy.
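The joint objective described above can be illustrated with a minimal sketch. The abstract does not state the loss functions or their weighting, so everything here is an assumption: a smooth-L1 loss for the word spotting (box regression) task, a softmax cross-entropy loss for the word recognition task, and a hypothetical weight `lam` that balances the two, in the style commonly used with Faster R-CNN heads. The function names are illustrative and are not taken from the paper.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth-L1 loss over box coordinates (assumed spotting loss)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (assumed recognition loss)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def joint_loss(box_pred, box_true, word_logits, word_label, lam=1.0):
    """Weighted sum of the two task losses; `lam` is a hypothetical weight."""
    return cross_entropy(word_logits, word_label) + lam * smooth_l1(box_pred, box_true)


if __name__ == "__main__":
    # One region proposal: predicted vs. ground-truth box (x, y, w, h)
    # and class scores over a three-word vocabulary.
    loss = joint_loss(
        box_pred=[0.1, 0.2, 0.9, 1.1],
        box_true=[0.0, 0.0, 1.0, 1.0],
        word_logits=[2.0, 0.5, -1.0],
        word_label=0,
    )
    print(f"joint loss: {loss:.4f}")
```

Because both terms are computed from the same shared features in a multi-task network, minimising their sum lets the localisation and recognition tasks inform each other; how the two terms are actually weighted in the paper is not given in the abstract.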
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig9_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig10_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-023-17606-w/MediaObjects/11042_2023_17606_Fig11_HTML.png)
Data Availability
The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.
Ethics declarations
Competing interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Das, A., Suwanwiwat, H. & Pal, U. A novel multi-task learning technique for offline handwritten short answer spotting and recognition. Multimed Tools Appl 83, 53441–53465 (2024). https://doi.org/10.1007/s11042-023-17606-w
DOI: https://doi.org/10.1007/s11042-023-17606-w