
A feature weighted K-nearest neighbor algorithm based on association rules

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

K-nearest neighbors (kNN) is a popular machine learning algorithm because of its clarity, simplicity, and efficacy. Nevertheless, kNN has several drawbacks: it ignores class distribution, feature relevance, neighbor contribution, and the number of samples in each class. In particular, some features may matter more than others when classifying a data point, and increasing their weight in the distance computation can make kNN more accurate. Researchers have proposed various feature-weighting schemes, such as correlation-based feature selection, mutual information, and chi-square feature selection. This paper presents a new feature-weighting technique based on association rules and information gain. The proposed approach performs well compared to other similar methods.
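The weighting idea described above can be sketched as follows. This is a minimal illustration, not the authors' exact method: here the feature weights come from information gain alone (the association-rule component of the paper is not reproduced), and the function names, the weighted Euclidean distance, and the toy data are assumptions for the sketch.

```python
# Sketch: feature-weighted kNN where each feature's weight is its
# information gain with respect to the class label. Discrete features
# are assumed for the information-gain computation.
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy H(y) of a list of class labels
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(X, y, f):
    # IG(f) = H(y) - sum_v p(f = v) * H(y | f = v), for discrete feature f
    n = len(y)
    gain = entropy(y)
    for v in set(row[f] for row in X):
        subset = [y[i] for i, row in enumerate(X) if row[f] == v]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

def weighted_knn_predict(X_train, y_train, x, weights, k=3):
    # Weighted Euclidean distance: informative features pull neighbors closer
    def dist(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))
    neighbors = sorted(range(len(X_train)), key=lambda i: dist(X_train[i], x))[:k]
    votes = Counter(y_train[i] for i in neighbors)
    return votes.most_common(1)[0][0]

# Toy example: feature 0 determines the class, feature 1 is noise,
# so the learned weights suppress feature 1 in the distance.
X = [[0, 1], [0, 0], [1, 1], [1, 0]]
y = [0, 0, 1, 1]
weights = [information_gain(X, y, f) for f in range(2)]
```

In the paper's method the weights are further informed by mined association rules; with the information-gain weights alone, the noisy second feature receives weight 0 here and no longer distorts the neighborhood.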


(Algorithm 1 and Figs. 1–5 appear in the full article.)


Data availability

The datasets used during the current study are freely available in the UCI repository (Lichman, 2013) and Kaggle (www.kaggle.com/datasets).

Code availability

The code is available upon reasonable request.

References

  • Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data, pp 207–216

  • Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large data bases (VLDB), Santiago, Chile, pp 487–499

  • Aguilera J, González LC, Montes-y-Gómez M et al (2018) A new weighted k-nearest neighbor algorithm based on Newton's gravitational force. In: Iberoamerican congress on pattern recognition, Springer, pp 305–313

  • Almomany A, Ayyad WR, Jarrah A (2022) Optimized implementation of an improved kNN classification algorithm using Intel FPGA platform: COVID-19 case study. J King Saud Univ Comput Inf Sci 34(6):3815–3827

  • AlSukker A, Khushaba R, Al-Ani A (2010) Optimizing the k-NN metric weights using differential evolution. In: 2010 international conference on multimedia computing and information technology (MCIT), IEEE, pp 89–92

  • Altman NS (1992) An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat 46(3):175–185

  • Asuncion A, Newman D (2007) UCI machine learning repository

  • Bhattacharya G, Ghosh K, Chowdhury AS (2017) Granger causality driven AHP for feature weighted kNN. Pattern Recogn 66:425–436

  • Biswas N, Chakraborty S, Mullick SS et al (2018) A parameter independent fuzzy weighted k-nearest neighbor classifier. Pattern Recogn Lett 101:80–87

  • Chakravarthy SS, Bharanidharan N, Rajaguru H (2023) Deep learning-based metaheuristic weighted k-nearest neighbor algorithm for the severity classification of breast cancer. IRBM 44(3):100749

  • Chen Y, Hao Y (2017) A feature weighted support vector machine and k-nearest neighbor algorithm for stock market indices prediction. Expert Syst Appl 80:340–355

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297

  • Davis JV, Kulis B, Jain P et al (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on machine learning, pp 209–216

  • Derrac J, García S, Herrera F (2014) Fuzzy nearest neighbor algorithms: taxonomy, experimental analysis and prospects. Inf Sci 260:98–119

  • Duda RO, Hart PE, Stork DG (2001) Pattern classification. Wiley, Hoboken

  • Fahad LG, Tahir SF (2021) Activity recognition in a smart home using local feature weighting and variants of nearest-neighbors classifiers. J Ambient Intell Hum Comput 12:2355–2364

  • Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378

  • Ganaie M, Tanveer M, Alzheimer's Disease Neuroimaging Initiative et al (2022) KNN weighted reduced universum twin SVM for class imbalance learning. Knowl Based Syst 245:108578

  • García-Laencina PJ, Sancho-Gómez JL, Figueiras-Vidal AR et al (2009) K nearest neighbours with mutual information for simultaneous classification and missing data imputation. Neurocomputing 72(7–9):1483–1493

  • Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. ACM SIGMOD Record 29(2):1–12

  • Han J, Kamber M (2006) Data mining: concepts and techniques. Morgan Kaufmann

  • Hssina B, Merbouha A, Ezzikouri H et al (2014) A comparative study of decision tree ID3 and C4.5. Int J Adv Comput Sci Appl 4(2):13–19

  • Huang GB, Zhou H, Ding X et al (2011) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 42(2):513–529

  • Huang J, Wei Y, Yi J et al (2018) An improved kNN based on class contribution and feature weighting. In: 2018 10th international conference on measuring technology and mechatronics automation (ICMTMA), IEEE, pp 313–316

  • Jiao L, Geng X, Pan Q (2019) BP-kNN: k-nearest neighbor classifier with pairwise distance metrics and belief function theory. IEEE Access 7:48935–48947

  • Karabulut B, Arslan G, Ünver HM (2019) A weighted similarity measure for k-nearest neighbors algorithm. Celal Bayar Univ J Sci 15(4):393–400

  • Kononenko I, Šimec E, Robnik-Šikonja M (1997) Overcoming the myopia of inductive learning algorithms with ReliefF. Appl Intell 7:39–55

  • Kuok CM, Fu A, Wong MH (1998) Mining fuzzy association rules in databases. ACM SIGMOD Record 27(1):41–46

  • Li D, Gu M, Liu S et al (2022) Continual learning classification method with the weighted k-nearest neighbor rule for time-varying data space based on the artificial immune system. Knowl Based Syst 240:108145

  • Liu M, Vemuri BC (2012) A robust and efficient doubly regularized metric learning approach. In: Computer vision – ECCV 2012: 12th European conference on computer vision, Florence, Italy, Springer, pp 646–659

  • Lu S, Yue Y, Liu X et al (2022) A novel unbalanced weighted kNN based on SVM method for pipeline defect detection using eddy current measurements. Meas Sci Technol 34(1):014001

  • Mendel JM, John RB (2002) Type-2 fuzzy sets made simple. IEEE Trans Fuzzy Syst 10(2):117–127

  • Nagaraj P, Saiteja K, Ram KK et al (2022) University recommender system based on student profile using feature weighted algorithm and kNN. In: 2022 international conference on sustainable computing and data communication systems (ICSCDS), IEEE, pp 479–484

  • Rodríguez-Fdez I, Canosa A, Mucientes M et al (2015) STAC: a web platform for the comparison of algorithms using statistical tests. In: 2015 IEEE international conference on fuzzy systems (FUZZ-IEEE), IEEE, pp 1–8

  • Scherf M, Brauer W (1997) Feature selection by means of a feature weighting approach. Citeseer

  • Su MY (2011) Real-time anomaly detection systems for denial-of-service attacks by weighted k-nearest-neighbor classifiers. Expert Syst Appl 38(4):3492–3498

  • Sun L, Zhang J, Ding W et al (2022) Feature reduction for imbalanced data classification using similarity-based feature clustering with adaptive weighted k-nearest neighbors. Inf Sci 593:591–613

  • Tang B, He H (2015) ENN: extended nearest neighbor method for pattern recognition. IEEE Comput Intell Mag 10(3):52–60

  • Tsang IW, Cheung PM, Kwok JT (2005) Kernel relevant component analysis for distance metric learning. In: Proceedings of the 2005 IEEE international joint conference on neural networks, IEEE, pp 954–959

  • Weinberger KQ, Saul LK (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10(2)

  • Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34

  • Witten IH, Frank E, Hall MA et al (2016) Data mining: practical machine learning tools and techniques. Morgan Kaufmann

  • Xie P, Xing E (2014) Large scale distributed distance metric learning. arXiv preprint arXiv:1412.5949

  • Yang W, Wang Z, Sun C (2015) A collaborative representation based projections method for feature extraction. Pattern Recogn 48(1):20–27

  • Yue G, Qu Y, Deng A et al (2023) Neuro-weighted multi-functional nearest-neighbour classification. Expert Syst 40(5):e13125

  • Zhang C, Liu C, Zhang X et al (2017) An up-to-date comparison of state-of-the-art classification algorithms. Expert Syst Appl 82:128–150

  • Zhang H, Wang Z, Xia W et al (2022) Weighted adaptive kNN algorithm with historical information fusion for fingerprint positioning. IEEE Wirel Commun Lett 11(5):1002–1006

  • Zhang X, Xiao H, Gao R et al (2022) K-nearest neighbors rule combining prototype selection and local feature weighting for classification. Knowl Based Syst 243:108451


Author information

Authors and Affiliations

Authors

Contributions

The authors confirm their contributions to the paper as follows: study conception and design: YM, MEf, KAB; data collection: YM, YB; analysis and interpretation of results: MEf, RF; draft manuscript preparation: YM, KAB, YB. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Youness Manzali.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Consent to participate

All authors consent to participate.

Consent for publication

All authors consent to publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Manzali, Y., Barry, K.A., Flouchi, R. et al. A feature weighted K-nearest neighbor algorithm based on association rules. J Ambient Intell Human Comput 15, 2995–3008 (2024). https://doi.org/10.1007/s12652-024-04793-z

