Search Results
-
k Nearest Neighbors
K Nearest Neighbors (kNN) is a powerful and intuitive data mining model for classification and regression tasks. As an instance-based or memory-based...
-
A Novel Classification Algorithm Based on the Synergy Between Dynamic Clustering with Adaptive Distances and K-Nearest Neighbors
This paper introduces a novel supervised classification method based on dynamic clustering (DC) and K-nearest neighbor (KNN) learning algorithms,...
-
Nearest Neighbors
We discuss k-nearest neighbor (kNN) classification and regression. We introduce several distance and similarity metrics and explain how to resolve...
-
The local linear functional kNN estimator of the conditional expectile: uniform consistency in number of neighbors
The main purpose of the present paper is to investigate the problem of the nonparametric estimation of the expectile regression in which the response...
-
SNN-PDM: An Improved Probability Density Machine Algorithm Based on Shared Nearest Neighbors Clustering Technique
Probability density machine (PDM) is a novel algorithm which was proposed recently for addressing the class imbalance learning (CIL) problem. PDM can...
-
Nearest Neighbors of Multivariate Runs
We investigate the joint distributions of the number of nearest neighbor contacts between different objects in the context of runs-related statistics...
-
Uniform consistency in number of neighbors of the kNN estimator of the conditional quantile model
We are interested in the efficiency of the nonparametric estimation of the conditional quantile when the response variable is a scalar given a...
-
Uniform consistency and uniform in number of neighbors consistency for nonparametric regression estimates and conditional U-statistics involving functional data
U-statistics represent a fundamental class of statistics arising from modeling quantities of interest defined by multi-subject responses. U-statistics...
-
A Comparison of Full Information Maximum Likelihood and Machine Learning Missing Data Analytical Methods in Growth Curve Modeling
Missing data are inevitable in longitudinal studies. Traditional methods, such as the full information maximum likelihood (FIML), are commonly used...
-
Discriminant Analysis, Nearest Neighbor, and Support Vector Machine
This chapter covers three related machine learning techniques: discriminant analysis (DA), support vector machine (SVM), and...
-
A power-controlled reliability assessment for multi-class probabilistic classifiers
In multi-class classification, the output of a probabilistic classifier is a probability distribution of the classes. In this work, we focus on a...
-
Localization processes for functional data analysis
We propose an alternative to k-nearest neighbors for functional data whereby the approximating neighboring curves are piecewise functions built from...
-
Natural-neighborhood based, label-specific undersampling for imbalanced, multi-label data
This work presents a novel undersampling scheme to tackle the imbalance problem in multi-label datasets. We use the principles of the natural nearest...
-
A topological data analysis based classifier
Topological Data Analysis (TDA) is an emerging field that aims to discover a dataset’s underlying topological information. TDA tools have been...
-
Nearest neighbors estimation for long memory functional data
In this paper, we consider the asymptotic properties of the nearest neighbors estimation for long memory functional data. Under some regularity...
-
Prediction in non-sampled areas under spatial small area models
In this article we study the prediction problem in small geographic areas in the situation where the survey data does not cover a substantial...
-
Density Peak Clustering Using Grey Wolf Optimization Approach
Density peak clustering (DPC) finds the center of the cluster as the point with high density and a large distance from the center of the other...
-
Estimating the prevalence of anemia rates among children under five in Peruvian districts with a small sample size
In this paper we attempt to answer the following question: “Is it possible to obtain reliable estimates for the prevalence of anemia rates in...
-
The Third Competition on Spatial Statistics for Large Datasets
Given the computational challenges involved in calculating the maximum likelihood estimates for large spatial datasets, there has been significant...