Adaptive active learning through k-nearest neighbor optimized local density clustering


Abstract

Active learning iteratively constructs a refined training set so that an effective classifier can be trained with as few labeled instances as possible; in domains where labeling is expensive, it plays an important and irreplaceable role. The main challenge of active learning is to correctly identify the critical samples. A current mainstream approach is to mine the latent data structure by clustering and then identify key instances within the clusters. However, existing methods adopt deterministic selection strategies in which the number of key samples depends only on the number of samples to be classified, ignoring the internal structural information of the clusters. Our analysis and experiments show that such deterministic selection wastes a substantial number of labels, a problem that urgently needs to be addressed in active learning. To this end, we propose an adaptive active learning algorithm based on density clustering (AAKC). First, we introduce k-nearest neighbor information to redefine the local density of an instance, so that the new density clearly expresses the local structure around each sample. Second, we develop an adaptive key-instance selection strategy based on this k-nearest neighbor density, which adaptively selects the necessary number of queries according to the structural information of the clusters to be classified, thereby avoiding label waste. Experimental comparisons with other algorithms show that our algorithm achieves better classification accuracy with fewer labels and has excellent stability.
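
As a concrete illustration of the two steps sketched above, here is a minimal Python/NumPy sketch. The Gaussian-of-mean-squared-distance density and the density-mass stopping rule are illustrative stand-ins rather than the paper's exact formulas, and the names knn_local_density, adaptive_queries, and the ratio parameter are hypothetical:

    import numpy as np

    def knn_local_density(X, k=5):
        # One common k-nearest-neighbor density (an assumption; the
        # paper's exact definition may differ): the smaller the mean
        # squared distance to the k nearest neighbors, the denser the point.
        diff = X[:, None, :] - X[None, :, :]
        dist = np.sqrt((diff ** 2).sum(-1))           # pairwise Euclidean distances
        knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]  # drop the zero self-distance
        return np.exp(-(knn_dist ** 2).mean(axis=1))

    def adaptive_queries(X, labels, k=5, ratio=0.5):
        # Illustrative adaptive selection: within each cluster, query the
        # densest instances first and stop once the queried points carry a
        # `ratio` share of the cluster's total density mass, so compact
        # clusters need few queries and diffuse ones need more.
        rho = knn_local_density(X, k)
        queries = []
        for c in np.unique(labels):
            idx = np.where(labels == c)[0]
            order = idx[np.argsort(-rho[idx])]        # densest first
            total, acc = rho[idx].sum(), 0.0
            for i in order:
                queries.append(i)
                acc += rho[i]
                if acc >= ratio * total:
                    break
        return np.array(queries)

    # Toy usage: two blobs, crudely "clustered" by thresholding one axis.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(5, 1, (30, 2))])
    labels = (X[:, 0] > 2.5).astype(int)
    print(adaptive_queries(X, labels))

The point of the stopping rule is that the number of queries per cluster is driven by the cluster's internal density structure rather than fixed in advance, which is the adaptive behavior the abstract describes.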



Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grant 61972001, in part by the General Projects of the Anhui Natural Science Foundation under Grants 1908085MF188 and 2108085MF212, and in part by the Key Projects of the Natural Science Foundation of Anhui Province Colleges and Universities under Grant KJ2020A0041.

Author information

Corresponding author

Correspondence to Xia Ji.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ji, X., Ye, W., Li, X. et al. Adaptive active learning through k-nearest neighbor optimized local density clustering. Appl Intell 53, 14892–14902 (2023). https://doi.org/10.1007/s10489-022-04169-w
