Log in

Nearest neighbors methods for support vector machines

  • Published:
Annals of Operations Research Aims and scope Submit manuscript

    We’re sorry, something doesn't seem to be working properly.

    Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.

Abstract

A key issue in the practical applicability of the support vector machine methodology is the identification of the support vectors in very large data sets, a problem to which a great deal of attention has been given in the literature. In the present article we propose methods based on sampling and nearest neighbors, that allow for an efficient implementation of an approximate solution to the classification problem and, at least in some problems, will help in identifying a significant fraction of the support vectors in large data sets at low cost. The performance of the proposed method is evaluated in different examples and some of its theoretical properties are discussed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

Notes

  1. \(\hat{d}\) represents the median of the squared euclidean distance between data from different classes in the feature space, as suggested by Wu and Wang (2009).

  2. For more information on the LIBSVM library and the datasets go to www.csie.ntu.edu.tw/~cjlin/libsvm/

  3. For more information on the e1071 package go to cran.r-project.org/web/packages/e1071/.

References

  • Amato, G., Rabitti, F., Savino, P., & Zezula, P. (2003). Region proximity in metric spaces and its use for approximate similarity search. ACM Transactions on Information Systems, 2(21), 192–227.

    Article  Google Scholar 

  • Ben-Hur, A., Ong, C.-S., Sonnenburg, S., Scholkopf, B., & Ratsch, G. (2008). Support vector machines and kernels for computational biology. PLoS Computer Biology, 4(10). http://svmcompbio.tuebingen.mpg.de/.

  • Brito, M. R., Chavez, E. L., Quiroz, A. J., & Yukich, J. E. (1997). Connectivity of the mutual k-nearest neighbor graph in outlier detection and clustering. Statistics and Probability Letters, 35, 33–42.

    Article  Google Scholar 

  • Brito, M. R., Quiroz, A. J., & Yukich, J. E. (2002). Graph theoretic procedures for dimension identification. Journal of Multivariate Analysis, 81, 67–84.

    Article  Google Scholar 

  • Brito, M. R., Quiroz, A. J., & Yukich, J. E. (2013). Intrinsic dimension identification via graph-theoretic methods. Journal of Multivariate Analysis, 116, 263–277.

    Article  Google Scholar 

  • Byvatov, E., & Schneider, G. (2003). Support vector machine applications in bioinformatics. Applied Bioinformatics, 2(2), 67–77.

    Google Scholar 

  • Cao, B., Zhan, D., & Wu. X. (2009). Application of svm in financial research. In CSO international joint conference on computational sciences and optimization, Vol. 2, pp. 507–511.

  • Chavez, E., Navarro, G., Baez-Yates, R., & Marroquin, J. L. (2001). ACM Computing Surveys, 3(33), 273–321.

    Article  Google Scholar 

  • Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20, 1–25.

    Google Scholar 

  • Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods. Cambridge: Cambridge University Press.

    Book  Google Scholar 

  • Fan, R.-E., Chen, P.-H., & Lin, C.-J. (2005). Working set selection using second order information for training support vector machines. Journal of Machine Learning Research, 6, 1889–1918.

    Google Scholar 

  • Ferris, M. C., & Munson, T. S. (2002). Interior point methods for massive support vector machines. SIAM Journal on Optimization, 13(3), 783–804.

    Article  Google Scholar 

  • Friedman, J. H., Baskett, F., & Shustek, J. (1975). An algorithm for finding nearest neighbors. IEEE Transactions on Computers, C–24(10), 1000–1006.

    Article  Google Scholar 

  • Friedman, J. H., Bentley, J. L., & Finkel, R. (1978). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3(3), 209–226.

    Article  Google Scholar 

  • Gonzalez-Lima, M. D., Hager, W. W., & Zhang, H. (2011). An affine-scaling interior-point method for continuous knapsack constraints with application to support vector machines. SIAM Journal on Optimization, 21(1), 361–390.

    Article  Google Scholar 

  • Janson, S. (2002). On concentration of probability. Contemporary Combinatorics, 10, 289–301.

    Google Scholar 

  • Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In C. Nédellec & C. Rouveirol (Eds.), Proceedings of the 10th European conference on machine learning (ECML-98), pp. 137–142. Springer, Berlin.

  • Jung, J. H., Leary, D. P. O., & Tits, A. L. (2008). Adaptive constraint reduction for training support vector machines. Electronic Transactions on Numerical Analysis, 31, 156–177.

    Google Scholar 

  • Lin, C.-J. (2001). On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks, 12, 1288–1298.

    Article  Google Scholar 

  • Navarro, G. (2002). Searching in metric spaces by spatial approximation. The VLDB Journal, 11(1), 28–46.

    Article  Google Scholar 

  • Noble, W. S. (2004). Support vector machine applications in computational biology. In B. Schoelkopf, K. Tsuda, & J.-P. Vert (Eds.), Kernel methods in computational biology (pp. 71–92). Cambridge: MIT Press.

  • Osuna, E. E., Freund, R., & Girosi, F. (1997a). Support vector machines: Training and applications. Technical report A.I. memo no. 1602, CBCL paper no. 144, Massachusetts Institute of Technology, Cambridge, MA, USA.

  • Osuna, E. E., Freund, R., & Girosi, F. (1997b). Training support vector vector machines: An application to face detection. In IEEE conference on computer vision and pattern recognition, pp. 130–136.

  • Patella, M., & Ciaccia, P. (2009). Approximate similarity search: A multi-faceted problem. Journal of Discrete Algorithms, 1(7), 36–48.

    Article  Google Scholar 

  • Platt, J. (1998). Fast training of support vector machines using sequential minimal optimization. In B. Scholkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods, support vector learning (pp. 41–65). Cambridge, MA: MIT Press.

    Google Scholar 

  • Schilling, M. F. (1986). Mutual and shared neighbor probabilities: Finite and infinite dimensional results. Advances in Applied Probability, 18, 388–405.

    Article  Google Scholar 

  • Serafini, T., Zanghirati, G., & Zanni, L. (2003). Gradient projection methods for quadratic programs and applications in training support vector machines. Optimization Methods and Software, 20, 353–378.

    Article  Google Scholar 

  • Serafini, T., & Zanni, L. (2005). On the working set selection in gradient-based descomposition techniques for support vector machines. Optimization Methods and Software, 20, 586–593.

    Google Scholar 

  • Uribe, R., Navarro, G., Barrientos, R. J., Marín, M. (2006). An index data structure for searching in metric space databases. In Proceeding of international conference on computational science 2006 (ICC 2006). Lecture notes in computer science (Vol. 3991, pp. 611–617). Springer.

  • Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer.

    Book  Google Scholar 

  • Woodsend, K., & Gondzio, J. (2011). Exploiting separability in large scale linear support vector machine training. Computational Optimization and Applications, 49, 241–269.

    Article  Google Scholar 

  • Wu, K.-P., & Wang, S.-D. (2009). Choosing the kernel parameters for support vector machines by the inter-cluster distance in the feature space. Pattern Recognition, 42, 710–717.

    Article  Google Scholar 

  • Zanni, L. (2006). An improved gradient projection-based decomposition techniques for support vector machines. Computational Management Science, 3, 131–145.

    Article  Google Scholar 

  • Zarruk, D. (2012) Resolución del problema de Máquinas de Vectores de Soporte mediante búsqueda de k-vecinos más cercanos. Undergraduate mathematics thesis, Departamento de Matemáticas, Universidad de los Andes, Bogotá, Colombia.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to M. D. González-Lima.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Camelo, S.A., González-Lima, M.D. & Quiroz, A.J. Nearest neighbors methods for support vector machines. Ann Oper Res 235, 85–101 (2015). https://doi.org/10.1007/s10479-015-1956-8

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10479-015-1956-8

Keywords

Navigation