Abstract
This paper investigates the effectiveness of classical active learning approaches for document segmentation, with the aim of reducing the size of the training set. A modified approach to selecting document images for labeling and subsequent model training is presented. The results of active learning are compared with those of transfer learning on fully labeled data. The paper also examines how the domain of the dataset used to initialize a model for transfer learning affects the model's subsequent fine-tuning.
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by Yu. Kornienko
About this article
Cite this article
Kiranov, D.M., Ryndin, M.A. & Kozlov, I.S. Active Learning and Transfer Learning for Document Segmentation. Program Comput Soft 49, 566–573 (2023). https://doi.org/10.1134/S0361768823070046
DOI: https://doi.org/10.1134/S0361768823070046