Search Results
-
Clustering Via Decision Tree Construction
Clustering is an exploratory data analysis task. It aims to find the intrinsic structure of data by organizing data objects into similarity groups or... -
A New Theoretical Framework for K-Means-Type Clustering
One of the fundamental clustering problems is to assign n points into k clusters based on the minimal sum-of-squares (MSSC), which is known to be... -
The Mathematics of Learning: Dealing with Data
Learning is key to developing systems tailored to a broad range of data analysis and information extraction tasks. We outline the mathematical... -
Web Page Classification
This chapter describes systems that automatically classify web pages into meaningful categories. It first defines two types of web page... -
Sequential Pattern Mining by Pattern-Growth: Principles and Extensions
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may... -
Incremental Mining on Association Rules
The discovery of association rules has been known to be useful in selective marketing, decision analysis, and business management. An important... -
A Feature/Attribute Theory for Association Mining and Constructing the Complete Feature Set
A correct selection of features (attributes) is vital in data mining. For this aim, the complete set of features is constructed. Here are some... -
Web Mining – Concepts, Applications and Research Directions
From its very beginning, the potential of extracting valuable knowledge from the Web has been quite evident. Web mining, i.e. the application of data... -
Mining Association Rules from Tabular Data Guided by Maximal Frequent Itemsets
We propose the use of maximal frequent itemsets (MFIs) to derive association rules from tabular datasets. We first present an efficient method to... -
Privacy-Preserving Data Mining
The growth of data mining has raised concerns among privacy advocates. Some of this is based on a misunderstanding of what data mining does. The... -
Logical Regression Analysis: From Mathematical Formulas to Linguistic Rules
Data mining means the discovery of knowledge from (a large amount of) data, and so data mining should provide not only predictions but also knowledge... -
kNN Join for Dynamic High-Dimensional Data: A Parallel Approach
The k nearest neighbor (kNN) join operation is a fundamental task that combines two high-dimensional databases, enabling data points in the User... -
Multi-level Storage Optimization for Intermediate Data in AI Model Training
As Transformer-based large models become the mainstream of AI training, the development of hardware devices (e.g., GPUs) cannot keep up with the... -
Take a Close Look at the Optimization of Deep Kernels for Non-parametric Two-Sample Tests
The maximum mean discrepancy (MMD) test with deep kernel is a powerful method to distinguish whether two samples are drawn from the same... -
Balanced Hop-Constrained Path Enumeration in Signed Directed Graphs
Hop-constrained path enumeration, which aims to output all the paths from two distinct vertices within the given hops, is one of the fundamental... -
Probabilistic Reverse Top-k Query on Probabilistic Data
Reverse top-k queries have received much attention from research communities. The result of reverse top-k queries is a set of objects, which had the... -
Bayesian Network-Based Multi-objective Estimation of Distribution Algorithm for Feature Selection Tailored to Regression Problems
Feature selection is an essential pre-processing step in Machine Learning for improving the performance of models, reducing the time of predictions,... -
Applying Genetic Algorithms to Validate a Conjecture in Graph Theory: The Minimum Dominating Set Problem
This paper presents a case study where the interdisciplinary approach between artificial intelligence, specifically genetic algorithms, and discrete... -
Multiresolution Controller Based on Window Function Networks for a Quanser Helicopter
To improve neural network (NN) performance, new activation functions, such as ReLU, GELU, and SELU, to name a few, have been proposed. Windows-based... -
Smart Noise Detection for Statistical Disclosure Attacks
While anonymization systems like mix networks can provide privacy to their users by, e.g., hiding their communication relationships, several traffic...