Search Results - Springer

Chapter

Introduction

In recent years, with the development of the information age, the amount of data has grown dramatically. At the same time, dirty data already exist in various types of databases. Due to the negative imp...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter

Dirty Data Impacts on Regression Models

Due to the negative influence of dirty data on the accuracy of regression models, the relation between data quality and model results can be used in the selection of proper regression models and dir...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Book

Dirty Data Processing for Machine Learning

Zhixin Qi, Hongzhi Wang, Zejiao Dong (2024)

Chapter and Conference Paper

ANSWER: Automatic Index Selector for Knowledge Graphs

Efficient access to knowledge graphs is identified as the basic premise to make full use of knowledge graphs. Since the query processing efficiency is mainly affected by index configuration, it is necessary to...

Zhixin Qi, Haoran Zhang, Hongzhi Wang, Zemin Chao in Web and Big Data (2024)

Chapter

Density-Based Clustering for Incomplete Data

In the real world, missing values exist in many data sets and cause data incompleteness. However, traditional missing value imputation methods are not suitable for density-based clustering and affect the accur...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter

Cost-Sensitive Decision Tree Induction on Dirty Data

With the rapid growth of data in our society, dirty data are increasingly common. In the process of cost-sensitive decision tree induction, dirty data in training data sets have negative impacts on the selection...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter

Impacts of Dirty Data on Classification and Clustering Models

Since dirty data have negative influence on the accuracy of machine learning models, the relation between data quality and model results could be used in the selection of the proper model and data cleaning str...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter

Incomplete Data Classification with View-Based Decision Tree

Missing values negatively influence data analyses and decrease the accuracy of machine learning models. Since traditional classification methods can only be applied to complete data sets, this c...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter

Feature Selection on Inconsistent Data

With the explosive growth of data size, inconsistent data appear more frequently. Because of inconsistent data detection and repair in data preprocessing, feature selection approaches lack efficiency. To...

Zhixin Qi, Hongzhi Wang, Zejiao Dong in Dirty Data Processing for Machine Learning (2024)

Chapter and Conference Paper

Multi-SQL: An Automatic Multi-model Data Management System

Nowadays, data in applications become diverse and large in scale. In order to meet the increasing demand for multi-model data management, multi-model databases have evolved into huge systems with many knobs. H...

Yu Yan, Hongzhi Wang, Yutong Wang, Zhixin Qi, Jian Ma, Chang Liu… in Web and Big Data (2023)

Chapter and Conference Paper

Dirty-Data Impacts on Regression Models: An Experimental Evaluation

Data quality issues have attracted widespread attention due to the negative impacts of dirty data on regression model results. The relationship between data quality and the accuracy of results could be applie...

Zhixin Qi, Hongzhi Wang in Database Systems for Advanced Applications (2021)

Article

TAILOR: time-aware facility location recommendation based on massive trajectories

In traditional facility location recommendations, the objective is to select the best locations which maximize the coverage or convenience of users. However, since users’ behavioral habits are often influenced...

Zhixin Qi, Hongzhi Wang, Tao He, Chunnan Wang… in Knowledge and Information Systems (2020)

Article

A survey of query result diversification

Nowadays, in information systems such as web search engines and databases, diversity is becoming increasingly essential and getting more and more attention for improving users’ satisfaction. In this sense, que...

Kaiping Zheng, Hongzhi Wang, Zhixin Qi, Jianzhong Li… in Knowledge and Information Systems (2017)

Chapter and Conference Paper

Capture Missing Values with Inference on Knowledge Base

Data imputation is a basic step for data cleaning. Traditional data imputation approaches lack accuracy in the absence of knowledge. Involving a knowledge base in imputation could overcome this shortcomin...

Zhixin Qi, Hongzhi Wang, Fanshan Meng… in Database Systems for Advanced Applications (2017)

14 Result(s)
