A novel multi-task learning technique for offline handwritten short answer spotting and recognition

Das, Abhijit; Suwanwiwat, Hemmaphan; Pal, Umapada

doi:10.1007/s11042-023-17606-w

Published: 20 November 2023

Volume 83, pages 53441–53465, (2024)

Multimedia Tools and Applications

Abstract

Off-line examinations are still used in many parts of the world because they are more economical to conduct than computer-based ones. Assessing these handwritten exam papers automatically is a complex challenge, since the highest possible accuracy is always desirable. Factors such as the attributes of the handwritten images, the large number of classes, ambiguous word boundaries in languages such as Arabic, and the significant intra-class variation in handwriting keep word recognition and word spotting difficult. To address these problems, this research proposes a novel joint learning technique for word spotting and word recognition in a multi-task learning setting. A multi-task convolutional neural network was employed to realise the proposed concept. Word spotting was treated as a regression task, and word recognition was the second task. A typical Faster R-CNN backbone was employed with a Region of Interest (RoI) pooling layer, followed by two consecutive fully connected layers for the word spotting and recognition tasks. The experimental results are encouraging and indicate that the proposed technique significantly improves the accuracy of short-answer assessment systems. As a result, it can be implemented in such systems to improve both their efficiency and accuracy.
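The abstract does not state how the two tasks are combined during training. As a minimal sketch only, the block below assumes a Faster R-CNN-style objective: a smooth L1 loss on the predicted word box (spotting as regression) added to a softmax cross-entropy loss on the word class (recognition). The function names, the `spot_weight` parameter, and the example values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Smooth L1 loss averaged over the box coordinates (assumed regression loss)."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one region, using the log-sum-exp trick for stability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multitask_loss(
    box_pred: Sequence[float],
    box_true: Sequence[float],
    word_logits: Sequence[float],
    word_label: int,
    spot_weight: float = 1.0,  # hypothetical task-balancing weight
) -> float:
    """Weighted sum of the spotting (regression) and recognition (classification) losses."""
    return spot_weight * smooth_l1(box_pred, box_true) + cross_entropy(word_logits, word_label)


# Hypothetical region: predicted vs. ground-truth box (x, y, w, h) and three word classes.
loss = multitask_loss([0.1, 0.2, 0.9, 0.5], [0.0, 0.2, 1.0, 0.5], [2.0, 0.5, -1.0], 0)
```

Because both terms are computed from the same region, a shared backbone receives gradients from both tasks, which is the usual motivation for joint training in a multi-task setting.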

Data Availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Author information

Abhijit Das and Hemmaphan Suwanwiwat contributed equally to this work.

Authors and Affiliations

Department of Computer Sc., and Engg., Thapar University, Patiala, Punjab, India
Abhijit Das
Department of Computer Sc., and Information Sc., BITS Pilani Hyderabad Campus, Hyderabad, Telengana, India
Abhijit Das
Information Technology Academy, James Cook University, Cairns, Queensland, Australia
Hemmaphan Suwanwiwat
Computer Vision & Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Umapada Pal

Corresponding author

Correspondence to Abhijit Das.

Ethics declarations

Competing interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Das, A., Suwanwiwat, H. & Pal, U. A novel multi-task learning technique for offline handwritten short answer spotting and recognition. Multimed Tools Appl 83, 53441–53465 (2024). https://doi.org/10.1007/s11042-023-17606-w

Received: 21 June 2022
Revised: 01 September 2023
Accepted: 24 October 2023
Published: 20 November 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11042-023-17606-w
