A novel filter feature selection algorithm based on relief

Applied Intelligence

Abstract

The Relief algorithm is a feature selection algorithm that uses the nearest neighbor to weight attributes. However, Relief only considers the correlation between features, which leads to a low classification accuracy on noisy datasets whose interaction effect is weak. To overcome the weaknesses of Relief, a novel feature selection algorithm, named Multidirectional Relief (MRelief), is proposed. The MRelief algorithm includes four improvements. First, the multidirectional neighbor search method, which finds all neighbors within a distance threshold from different orientations, is included to obtain regularly distributed neighbors. Therefore, the weights provided by MRelief are more accurate than those provided by Relief. Second, a novel objective function that incorporates the instances’ force coefficients is introduced to reduce the influence of noise. Thus, the new objective function improves the classification accuracy of MRelief. Third, subset generation is introduced to the MRelief algorithm and combined with the maximum Pearson maximum distance (MPMD) to generate a promising candidate subset for feature selection. Finally, a novel multiclass margin definition is proposed and introduced to the MRelief algorithm to handle multiclass data. As demonstrated by extensive experiments on eleven UCI datasets and eleven real-world gene expression benchmarking datasets, MRelief is significantly better than other algorithms including LPLIR, ReliefF, LLH-Relief, MultiSURF, MSLIR-NN, MRMR, MPMD and STIR in our study.
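For orientation, the baseline that the abstract builds on can be sketched in a few lines. The block below is a minimal illustration of the classic two-class Relief weighting scheme (nearest hit and nearest miss), not of MRelief itself: the multidirectional neighbor search, force coefficients, MPMD-based subset generation, and multiclass margin are not reproduced here because the abstract does not give their definitions. The function name `relief_weights`, its parameters, and the toy data are assumptions made for this example only.

```python
import random


def relief_weights(X, y, n_iterations=None, seed=0):
    """Classic two-class Relief sketch (baseline only, NOT MRelief).

    X is a list of numeric feature rows, y a list of class labels.
    Returns one weight per feature; larger means more relevant.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        column = [row[f] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f, a, b):
        # Normalised difference of feature f between instances a and b.
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        # Manhattan distance over normalised features.
        return sum(diff(f, a, b) for f in range(d))

    m = n_iterations or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbor available
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        # Reward features that differ on the nearest miss and
        # penalise features that differ on the nearest hit.
        for f in range(d):
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / m
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
          [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y_demo = [0, 0, 0, 1, 1, 1]
print(relief_weights(X_demo, y_demo))
```

On this toy data the informative feature receives the larger weight. Because each update uses a single nearest hit and a single nearest miss, one mislabeled or noisy neighbor can distort the weights, which is the kind of sensitivity the abstract says MRelief is designed to reduce.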

Acknowledgments

This research was supported by the Department of Science and Technology of Jilin Province (No. 20190303135SF).

Author information

Corresponding author

Correspondence to Jiahao Fan.

Cite this article

Cui, X., Li, Y., Fan, J. et al. A novel filter feature selection algorithm based on relief. Appl Intell 52, 5063–5081 (2022). https://doi.org/10.1007/s10489-021-02659-x
