Devanagari ancient character recognition using DCT features with adaptive boosting and bootstrap aggregating

  • Methodologies and Application
  • Soft Computing

Abstract

Recognition of Devanagari ancient manuscripts is attracting considerable attention from researchers. These manuscripts are rare and fragile, and they are being digitized so that the information they contain can be preserved and used. Optical character recognition (OCR) is applied to recognize the digitized documents. This paper presents a system that improves the recognition of Devanagari ancient manuscripts using the AdaBoost and Bagging ensemble methods. Features are extracted using the discrete cosine transform (DCT) with zigzag scanning. Decision tree, Naïve Bayes and support vector machine (SVM) classifiers are used to recognize basic characters segmented from the manuscripts. The experiments use a dataset of 5484 pre-segmented characters from Devanagari ancient documents. A maximum recognition accuracy of 90.70% is achieved using DCT zigzag features and an RBF-SVM classifier. AdaBoost and Bagging are then applied to the base classifiers to improve accuracy, and a maximum accuracy of 91.70% is achieved by adaptive boosting (AdaBoost) with RBF-SVM. The quality of the ensemble methods is assessed using precision, recall, F-measure, false acceptance rate, false rejection rate and root mean square error (RMSE).
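To make the feature extraction step concrete, the following is a minimal sketch of DCT zigzag features in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the character image size, the DCT normalization, or the number of coefficients retained, so the orthonormal DCT-II, the JPEG-style scan order, the parameter `k`, and the function names `dct_2d`, `zigzag_indices` and `dct_zigzag_features` are all hypothetical choices.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square N x N block (list of lists).

    Assumption: the abstract does not specify the DCT variant; the
    orthonormal type-II transform is used here for illustration.
    """
    n = len(block)

    def alpha(k):
        # Scaling factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = cos((2i + 1) * k * pi / (2n))
    basis = [
        [math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
        for k in range(n)
    ]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += block[x][y] * basis[u][x] * basis[v][y]
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def zigzag_indices(n):
    """Return (row, col) pairs of an N x N grid in JPEG-style zigzag order."""
    order = []
    for s in range(2 * n - 1):
        # All cells on the anti-diagonal row + col == s, rows ascending.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals are walked with the row index decreasing.
        if s % 2 == 0:
            diag.reverse()
        order.extend(diag)
    return order


def dct_zigzag_features(image, k):
    """Keep the first k DCT coefficients in zigzag order (k is hypothetical)."""
    coeffs = dct_2d(image)
    return [coeffs[r][c] for r, c in zigzag_indices(len(image))[:k]]


if __name__ == "__main__":
    # Toy 4 x 4 "character" image; real inputs would be normalized glyphs.
    toy = [
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [1, 0, 0, 1],
    ]
    print(dct_zigzag_features(toy, 6))
```

The zigzag scan orders coefficients roughly from low to high spatial frequency, so truncating to the first `k` values keeps the coarse shape of the character while discarding fine detail. The resulting vectors could then be passed to any base classifier or ensemble wrapper.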

Author information

Corresponding author

Correspondence to Munish Kumar.

Ethics declarations

During our research, the lack of a public dataset was a serious obstacle, and we had no benchmark against which to compare our algorithm with others. Because a public dataset may help other researchers working on similar projects, we have decided to share the raw data used in our experimental work.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Narang, S.R., Jindal, M.K. & Kumar, M. Devanagari ancient character recognition using DCT features with adaptive boosting and bootstrap aggregating. Soft Comput 23, 13603–13614 (2019). https://doi.org/10.1007/s00500-019-03897-5

  • DOI: https://doi.org/10.1007/s00500-019-03897-5
