Abstract
Recognition of ancient Devanagari manuscripts is attracting considerable attention from researchers. These manuscripts are rare and delicate documents, and they are being digitized so that the valuable information they contain can be used. Optical character recognition (OCR) is applied to recognize the digitized documents. This paper presents a system that improves the recognition of ancient Devanagari manuscripts using the AdaBoost and Bagging methodologies. Features are extracted using the discrete cosine transform (DCT) with zigzag scanning. Decision tree, Naïve Bayes and support vector machine (SVM) classifiers are used to recognize basic characters segmented from the manuscripts. The experiments use a dataset of 5484 pre-segmented characters from ancient Devanagari documents. A maximum recognition accuracy of 90.70% is achieved using DCT zigzag features and an SVM classifier with a radial basis function kernel (RBF-SVM). The AdaBoost and Bagging ensemble methods are then applied to the base classifiers to improve accuracy, and a maximum accuracy of 91.70% is achieved with adaptive boosting (AdaBoost) and RBF-SVM. The quality of the ensemble methods is assessed using precision, recall, F-measure, false acceptance rate, false rejection rate and root mean square error (RMSE).
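The DCT zigzag feature extraction named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the square block size and the number of retained coefficients `k` are all assumptions made for illustration, and the naive DCT below is written for clarity rather than speed. The idea is to compute a 2-D DCT of a character image, read the coefficients in zigzag order so that low-frequency terms come first, and keep the first `k` as the feature vector.

```python
import math

def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square list-of-lists image block."""
    n = len(block)

    def alpha(k):
        # Normalisation factor for the orthonormal DCT-II.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = cos((2i + 1) * k * pi / (2n))
    basis = [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
             for k in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += block[x][y] * basis[u][x] * basis[v][y]
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def zigzag_indices(n):
    """(row, col) pairs of an n x n matrix in JPEG-style zigzag order."""
    # Cells on one anti-diagonal share row + col; the scan direction
    # alternates between consecutive anti-diagonals.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def dct_zigzag_features(block, k):
    """First k DCT coefficients of `block` in zigzag order (k is an assumed parameter)."""
    coeffs = dct2(block)
    return [coeffs[r][c] for r, c in zigzag_indices(len(block))[:k]]

if __name__ == "__main__":
    # Hypothetical 4 x 4 "image"; a real character image would be larger.
    image = [[0, 0, 1, 1],
             [0, 1, 1, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0]]
    print(dct_zigzag_features(image, 6))
```

A feature vector built this way could then be passed to any of the base classifiers mentioned above, alone or wrapped in a boosting or bagging ensemble. How many coefficients to retain is a tuning choice that this sketch does not settle.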
Ethics declarations
During our research, the lack of a public dataset was a major obstacle, and as a result we have no benchmark against which to compare our algorithm with others. Because a public dataset may help other researchers working on similar projects, we have decided to share the raw data from our experimental work.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Narang, S.R., Jindal, M.K. & Kumar, M. Devanagari ancient character recognition using DCT features with adaptive boosting and bootstrap aggregating. Soft Comput 23, 13603–13614 (2019). https://doi.org/10.1007/s00500-019-03897-5
DOI: https://doi.org/10.1007/s00500-019-03897-5