Abstract
Named entity recognition (NER) is the process of recognizing and classifying important information (entities) in text. Proper nouns, such as the names of people, organizations, and locations, are examples of entities. NER is an important module in applications such as human resources, customer support, search engines, content classification, and academia. In this work, we consider NER for two low-resource Indian languages, Hindi and Marathi. Transformer-based models have been widely used for NER tasks. We consider different variations of BERT, such as base BERT, RoBERTa, and ALBERT, and benchmark them on publicly available Hindi and Marathi NER datasets. We provide an exhaustive comparison of monolingual and multilingual transformer-based models and establish simple baselines that are currently missing from the literature. We show that the monolingual MahaRoBERTa model performs best for Marathi NER, whereas the multilingual XLM-RoBERTa performs best for Hindi NER. We also perform cross-language evaluation and present mixed observations.
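To make the task concrete, the following minimal sketch shows how token-level tags can be grouped into typed entities. It assumes a BIO tagging scheme with hypothetical PER and LOC labels; the abstract does not state which scheme or label set the datasets use, and the sentence and the function name `bio_to_spans` are illustrative only.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, entity type) pairs.

    Assumption: tags look like "B-PER", "I-PER", or "O". This scheme is
    chosen for illustration and is not taken from the paper.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O" or an I- tag that does not match the open entity:
            # close the open entity and skip this token.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical example sentence and tags.
tokens = ["Sachin", "Tendulkar", "was", "born", "in", "Mumbai"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

In a model-based pipeline the tags would come from a token classifier's predictions, and a grouping step of this kind would turn them into the entities that are compared against gold annotations.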
Notes
- Multicase BERT: https://huggingface.co/bert-base-multilingual-cased
- Indic BERT: https://huggingface.co/ai4bharat/indic-bert
- Xlm-roberta: https://huggingface.co/xlm-roberta-base
- Roberta-Marathi: https://huggingface.co/flax-community/roberta-base-mr
- Roberta-Hindi: https://huggingface.co/flax-community/roberta-hindi
- indic-transformers-hi-roberta: https://huggingface.co/neuralspace-reverie/indic-transformers-hi-roberta
- MahaBERT: https://huggingface.co/l3cube-pune/marathi-bert
- MahaRoBERTa: https://huggingface.co/l3cube-pune/marathi-roberta
- MahaAlBERT: https://huggingface.co/l3cube-pune/marathi-albert-v2
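The checkpoints listed in these notes can be loaded for token classification with the Hugging Face `transformers` library. The sketch below is illustrative, not the authors' training code: the helper `load_for_ner` and the `MODEL_IDS` mapping are hypothetical names, the hub identifiers are read off the URLs above, and the number of labels depends on the dataset and must be supplied by the caller.

```python
# Hub identifiers taken from the note URLs above (path after huggingface.co/).
# Assumption: the checkpoints are still published under these names.
MODEL_IDS = {
    "Multicase BERT": "bert-base-multilingual-cased",
    "Indic BERT": "ai4bharat/indic-bert",
    "Xlm-roberta": "xlm-roberta-base",
    "Roberta-Marathi": "flax-community/roberta-base-mr",
    "Roberta-Hindi": "flax-community/roberta-hindi",
    "indic-transformers-hi-roberta": "neuralspace-reverie/indic-transformers-hi-roberta",
    "MahaBERT": "l3cube-pune/marathi-bert",
    "MahaRoBERTa": "l3cube-pune/marathi-roberta",
    "MahaAlBERT": "l3cube-pune/marathi-albert-v2",
}


def load_for_ner(name: str, num_labels: int):
    """Return (tokenizer, model) for one of the listed checkpoints.

    The model gets a freshly initialised token-classification head with
    `num_labels` outputs, so it must be fine-tuned before it is useful.
    Calling this function requires `transformers` and network access.
    """
    # Imported lazily so the mapping above is usable without the library.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    model_id = MODEL_IDS[name]
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model
```

For example, `load_for_ner("MahaRoBERTa", num_labels=7)` would fetch the Marathi RoBERTa checkpoint with a seven-way tagging head; the value 7 is a placeholder, not the label count used in the paper.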
Acknowledgements
This work was done under the L3Cube Pune mentorship program. We would like to express our gratitude towards our mentors at L3Cube for their continuous support and encouragement.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Litake, O., Sabane, M., Patil, P., Ranade, A., Joshi, R. (2023). Mono Versus Multilingual BERT: A Case Study in Hindi and Marathi Named Entity Recognition. In: Gunjan, V.K., Zurada, J.M. (eds) Proceedings of 3rd International Conference on Recent Trends in Machine Learning, IoT, Smart Cities and Applications. Lecture Notes in Networks and Systems, vol 540. Springer, Singapore. https://doi.org/10.1007/978-981-19-6088-8_56
DOI: https://doi.org/10.1007/978-981-19-6088-8_56
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6087-1
Online ISBN: 978-981-19-6088-8
eBook Packages: Engineering, Engineering (R0)