Search Results
-
On histogram-based regression and classification with incomplete data
We consider the problem of nonparametric regression with possibly incomplete covariate vectors. The proposed estimators, which are based on histogram...
-
Data Preparation
This chapter focuses on data preparation, a crucial step in the analytics process to ensure that the data used for modeling is of the highest...
-
Statistical Learning of Large-Scale Genetic Data: How to Run a Genome-Wide Association Study of Gene-Expression Data Using the 1000 Genomes Project Data
Teaching statistics through engaging applications to contemporary large-scale datasets is essential to attracting students to the field. To this end,...
-
Using Density and Fuzzy Clustering for Data Cleaning and Segmental Description of Livestock Data
The cluster algorithms density-based clustering with noise and fuzzy c-means were used to edit and group a large, noisy data set from a livestock...
-
Foundations I: Introductory Data Analysis with R
In disciplinary research—from Anthropology to Zoology (and every discipline in between!)—studies produce data from multiple variables. Most research...
-
Dimension reduction and visualization of multiple time series data: a symbolic data analysis approach
Exploratory analysis and visualization of multiple time series data are essential for discovering the underlying dynamics of a series before...
-
Automatic data-based bin width selection for rose diagram
A rose diagram is a representation that circularly organizes data with the bin width as the central angle. This diagram is widely used to display and...
-
New models for symbolic data analysis
Symbolic data analysis (SDA) is an emerging area of statistics concerned with understanding and modelling data that takes distributional form (i.e. symbols...
-
Data Display
This chapter presents the basic concepts of plotting in Python. It also provides help in turning Python plots into good-looking figures for...
-
Change point detection in text data
The analysis of text data using artificial intelligence and statistical methods has become increasingly important in recent years. One application is...
-
Description of Data and Essential Probability Models
This chapter portrays how to make sense of gathered data before performing formal statistical inference. The topics covered are types of data, how to...
-
Summarization of Data and Theoretical Distributions
An observation or experiment with two or more possible outcomes, where the occurrence of any outcome is determined by chance, is referred to as a...
-
Practical Implementation of Machine Learning Techniques and Data Analytics Using R
In this digital era all E-commerce activities are based on the modern recommendation systems where a company wants to analyse the buying pattern of...
-
Parametric and Non-parametric Bayesian Imputation for Right Censored Survival Data
A common feature of much survival data is censoring due to incompletely observed lifetimes. Survival analysis methods have been designed to take...
-
Fiducial-Based Statistical Intervals for Zero-Inflated Gamma Data
In practice, it is not uncommon to observe count data that possess excessive zeros (i.e., zero inflation) relative to the assumed discrete...
-
Statistics and Data Analysis in an ANOVA Model
In Chap. 1 we reviewed a set of fundamental statistical concepts and tools and used them to summarize the...
-
Estimating Sample Skewness from Sample Data Summaries and Associated Evaluation of Normality
We propose a method to estimate a sample skewness from the given summary statistics and give explicit formulas for the most common scenarios....
-
On regression and classification with possibly missing response variables in the data
This paper considers the problem of kernel regression and classification with possibly unobservable response variables in the data, where the...
-
Zero-inflated Poisson-Akash distribution for count data with excessive zeros
Over-dispersed models are often used whenever the variation is more than what in point of fact is anticipated by a model. One of the reasons behind...
-
Investigating Variable Selection Techniques Under Missing Data: A Simulation Study
Variable selection is one of the most pervasive problems researchers face, especially with the increased ease in data collection arising from online...