Abstract
Machine Learning (ML) is a branch of Artificial Intelligence (AI) within Computer Science in which programmable machines imitate human learning behavior with the help of statistical methods and data. The Healthcare industry is one of the largest and busiest sectors in the world, functioning with an extensive amount of manual moderation at every stage. Most clinical documents concerning patient care are hand-written by experts; only selected reports are machine-generated. This process raises the chances of misdiagnosis, thereby posing a risk to a patient's life. Recent efforts to automate manual operations have made extensive use of ML. This paper surveys the applicability of ML approaches in automating medical systems and discusses most of the optimized statistical ML frameworks that encourage better service delivery in clinical settings. The widespread adoption of Deep Learning (DL) and ML techniques as the underlying systems for a variety of wellness applications is accompanied by challenges and a myriad of security concerns. This work tries to recognize a variety of vulnerabilities occurring in medical procurement, acknowledging the concerns over predictive performance from a privacy point of view. Finally, it provides possible risk-limiting measures and directions for addressing open challenges in the future.
1 Introduction
In this era of technology and advancements, ML/DL systems have transformed industries such as governance, manufacturing, and transportation. Over the past couple of years, the use of intelligent systems has increased manifold in various domains, including our daily life. One such realm is healthcare [1, 2], which had earlier been impervious to large-scale technological disruption. The Healthcare industry across the globe has evolved extensively with the advent of machine intelligence. Nasr et al. [3] explore current state-of-the-art smart healthcare systems, highlighting significant topics such as wearable and smartphone devices for fitness monitoring, ML for illness prediction, and assistive frameworks, including social robots designed for assisted living environments. Bharadwaj et al. [4] discuss applications of ML algorithms integrated with the Healthcare Internet of Things (H-IoT) in terms of their advantages, scope, and potential future prospects. ML/DL techniques have delivered exceptional results in diverse tasks such as brain tumor segmentation [5], saliva sample classification of COPD patients [6], chronic neurological disorder assistance [7], anomaly recognition in the artificial pancreas [8], clinical image reconstruction [9], and cancerous cell classification, to name a few. It is expected that in the coming years intelligent software systems will take over much of the human labor put in by radiologists and physicists in examining medical documents. ML will transform conventional medical practice and research. Healthcare has emerged as an active application area for ML/DL models, which achieve human-level performance in various pathological tasks [10]. Some investigations reported that intelligent models outperformed clinical experts in certain respects. Esteva et al. [11] demonstrate the classification of skin lesions with a single CNN, evaluated against 21 board-certified dermatologists on biopsy-proven clinical images of the deadliest skin cancer. The findings show that AI can classify skin cancer with a degree of accuracy equivalent to dermatologists. Rajpurkar et al. [16].
Dhief et al. [17] presented an extensive review of IoT frameworks and state-of-the-art techniques used in healthcare and voice pathology surveillance systems, whereas Alhussein et al. [18] investigated a voice abnormality detection system using DL on mobile healthcare frameworks. Researchers and physicians are reviewing numerous approaches to apply DL methods in Intensive Care Units (ICUs) and critical care [19,20,21]. Similarly, Ganainy et al. [22] proposed a real-time bedside decision-support system for the clinical context that predicts the current status of Mean Arterial Pressure (MAP) values using new ML structures. Many intelligent applications that use customer records have delivered disappointing results at some point because of an excessive focus on metrics [23,24,25]. Moreover, privacy is often compromised when data is transmitted or analyzed to build a predictive system [26,27,28]. This paper attempts to acknowledge the diverse techniques of ML and their application in the Healthcare ecosystem. The main contributions of this paper are summarized next [29,30,31,32].
- This paper shares a concise statistical background of ML Algorithms while discussing multiple ML models, their application in clinical aspects, along with certain hindrances, and any possible solutions to tackle those shortcomings.
- This paper outlines various challenges related to medical analysis using ML and DL techniques.
- This paper analyses and lists different heterogeneous sources contributing to healthcare data and the flaws associated.
- This paper describes the applications of ML in healthcare for medical prognosis, computer-aided detection, diagnosis, and treatment. Further, the associated drawbacks are outlined as well.
- This paper lists different types of vulnerabilities in the ML pipeline and their sources. Further, the work highlights various techniques to avoid information breaches and preserve the privacy of data for clinical users.
The remainder of the paper is organized as follows. Section 2 presents the various ML algorithms, their applications, and their mathematical background. Section 3 presents the different applications of ML in healthcare systems and describes the present scenario, in which intelligent systems are used to automate routine tasks. Section 4 examines the probable vulnerabilities that are encountered during the preparation of ML models in the healthcare pipeline. Section 5 presents a study of the privacy challenges raised by the involvement of AI systems and various approaches to preserving privacy. Finally, Sect. 6 presents future prospects and areas that require further research, followed by the conclusion in Sect. 7.
2 Background of ML Algorithms
The majority of developing countries have invested their time and money in advanced technical prospects that in some way or other prove to be cost-effective in the long run. Development is often associated with the advent of automated machinery and mechanical systems as we move towards a data-centric world. Managing and using data effectively at an industrial scale is an irksome task when done by humans; this is where ML/DL-based intelligent systems gain their importance. ML algorithms are developed specifically to support models that solve problems in different domains (e.g., Healthcare, Fintech, Industrial, etc.) [33]. Okay et al. [34] demonstrate that applying Interpretable Machine Learning (IML) models to sophisticated and difficult-to-interpret ML approaches provides thorough interpretability while preserving accuracy, which matters when crucial medical decisions are at stake. Ileberi et al. [35] implement an ML-based framework using the Synthetic Minority Over-sampling Technique (SMOTE) for credit card fraud detection and report that it outperforms other prevailing methodologies. Ahsan et al. [36] propose a prognostics framework based on statistics-driven ML modeling for forecasting qualification test results of electronic components, allowing a decrease in qualification test cost and time. Hari et al. [37] offer a supervised ML method built by modeling the behavior of Gallium Nitride (GaN) power electronic devices for reliably forecasting the switching voltage and current waveforms of these devices. Seng et al. [38] concentrate on how computer vision (CV) and ML practices may be applied to existing vinification activities and vineyard management to obtain industry-relevant outcomes. Rehman et al. [39] provide an ML technique for the localization of brain tumor cells using the texton-map image on FLAIR scans of Magnetic Resonance Images (MRI). Singh et al. [40] offer an ensemble-based classification technique that combines AI, fog computing, and smart health to create a reliable platform for the early identification of COVID-19 infection. Similarly, Vyas et al. [41] offer an ML model powered by a multimodal method for assessing a patient's willingness to recommend the hospital, which plays an important part in designing actions based on patient choice. Some ML algorithms and their purposes are discussed in the forthcoming sections. A summary of the different ML algorithms discussed in this chapter is depicted in Fig. 1.
2.1 Regression Models
Regression analysis is a statistical modeling method that aims to define a relationship (linear or polynomial) between a dependent variable and one or more independent variables [42]. This predictive modeling technique can be used for forecasting, time-series modeling, predictive analysis, etc. Common regression methodologies include Linear, Polynomial, Logistic, Multivariate, Ridge, and Bayesian Linear Regression. Some of these are discussed next.
2.1.1 Linear Regression
Linear regression models have shaped the statistical view of supervised learning for quantitative response prediction, relating the independent variables (input vector) to the dependent variable (output). The relationship is represented by a linear function (the regression function). In the ML arena, linear regression models stand out for their simplicity while retaining considerable interest and ease of interpretability. Velez et al. [43] presented a straightforward definition of interpretability in ML as “the ability to explain or to present in understandable terms to a human”. Linear regression seeks a linear function f that describes the relationship between an input vector x of dimension d and a real-valued output y (i.e., f(x)) as
\[f\left(x\right)={\alpha }_{0}+{\alpha }^{\top }x\]
where \({\alpha }_{0}\in R\) is the intercept of the function and \(\alpha \in {R}^{d}\) is the coefficient vector corresponding to the individual input variables. To calculate the regression coefficients \({\alpha }_{0}\) and \(\alpha\), a training set (\(A\), \(B\)) is required, where \(A\in {R}^{k\times d}\) denotes the k training inputs \({a}_{1}, {a}_{2}, \dots , {a}_{k}\) and \(B\in {R}^{k}\) denotes the k training outputs, each \({a}_{i}\in {R}^{d}\) being paired with the real-valued output \({B}_{i}\). The prime objective is to reduce the empirical risk, with \({\alpha }_{j}\) quantifying the relation between predictor \({A}_{j}\) and the response, for each \(j=1,2,3,\dots ,d\). Loss functions measure the deviation of the model's predictions from the actual outputs. The least-squares estimate is one of the most widely used loss functions for regression models and has minimal variance among all unbiased linear estimates. Fitting a regression model by minimizing the Residual Sum of Squares (RSS) between the predicted outputs and the labels is expressed as [44]
\[\underset{{\alpha }_{0},\alpha }{\mathrm{min}}\;\sum_{i=1}^{k}{\left({B}_{i}-{\alpha }_{0}-{\alpha }^{\top }{a}_{i}\right)}^{2}\]
Certain downsides include high variance: a model may fit the training set well yet overfit to noisy or otherwise unrepresentative training data, reducing prediction accuracy on new data. Alternative approaches help here. Linear Dimension Reduction (LDR) generates a low-dimensional linear mapping of the original high-dimensional or noisy data that maintains some characteristic of interest, denoises or compresses the data, and extracts important feature spaces, among other benefits. Forward selection or backward elimination further helps to avoid overfitting and improve robustness. The processing and manipulation of data often introduce noise, which degrades the model's performance [45]. The link between regularization and robustness to noise is represented as:
\[\underset{\alpha }{\mathrm{min}}\;\underset{\Delta \in u}{\mathrm{max}}\;g\left(B-\left(A+\Delta \right)\alpha \right)\]
In this regard, the noise is expected to vary within an uncertainty set \(u\subseteq {R}^{k\times d}\), and the learner inherits the robust behavior, where \(g\) is a convex function that measures the residuals [46]. Regression models can sometimes lose interpretability when the number of features is large relative to the amount of data; to overcome this shortcoming and multicollinearity, various feature selection strategies are applied.
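The least-squares fit described in this subsection can be sketched for the single-feature case as follows. This is a minimal illustration, not the formulation of any cited work; the function names and the toy data are assumptions introduced here.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (alpha0, alpha1) minimising the RSS for y ~ alpha0 + alpha1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one predictor.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha1 = sxy / sxx
    alpha0 = mean_y - alpha1 * mean_x
    return alpha0, alpha1


def residual_sum_of_squares(xs, ys, alpha0, alpha1):
    """Sum of squared differences between labels and predictions."""
    return sum((y - (alpha0 + alpha1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
alpha0, alpha1 = fit_simple_linear_regression(xs, ys)
rss = residual_sum_of_squares(xs, ys, alpha0, alpha1)
```

With noise-free data the fitted intercept and slope recover the generating line and the RSS is zero; with noisy data the same code returns the coefficients that make the RSS as small as possible.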
2.1.2 Shrinkage Models
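To produce a model that predicts more reliably, the values of the regression coefficients are shrunk with the help of regularization methods, also known as shrinkage methods, at the cost of introducing some bias into the model. The principal intention behind shrinkage methods is to penalize the regression coefficients in the loss function, pulling them towards a central point such as the mean. Common shrinkage methods include Ridge Regression, which penalizes the norm-2 of the regression coefficients
\[\underset{{\alpha }_{0},\alpha }{\mathrm{min}}\;\sum_{i=1}^{k}{\left({B}_{i}-{\alpha }_{0}-{\alpha }^{\top }{a}_{i}\right)}^{2}+\partial \parallel \alpha {\parallel }_{2}^{2}\]
where \(\partial \ge 0\) controls the shrinkage magnitude, and lasso regression, which penalizes norm-1 and minimizes the quantity
\[\underset{{\alpha }_{0},\alpha }{\mathrm{min}}\;\sum_{i=1}^{k}{\left({B}_{i}-{\alpha }_{0}-{\alpha }^{\top }{a}_{i}\right)}^{2}+\partial \parallel \alpha {\parallel }_{1}\]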
Least Absolute Shrinkage and Selection Operator (Lasso) Regression is an extension of linear regression supplemented by shrinkage. The lasso approach favors models with fewer parameters and is well suited to models with high degrees of multicollinearity or to automating parts of model selection. Lasso models are more interpretable than ridge regression models because a large \(\partial\) forces some of the estimated coefficients to be exactly zero. The estimation accuracy of subset selection is driven solely by the noise present in the input dataset; to reduce the effect of this noise and to avoid numerical issues, the Tikhonov regularization term (\(\frac{1}{2\wedge } \parallel \alpha {\parallel }_{2}^{2}\)) with weight \(\wedge > 0\) is introduced along with the cutting plane approach [47].
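The difference between the two penalties can be illustrated with a single-feature model without intercept, where both problems have closed-form solutions. This is a sketch under that simplifying assumption; the function names and data are hypothetical, and `penalty` plays the role of the shrinkage weight.

```python
def ridge_coefficient(xs, ys, penalty):
    """Minimiser of sum((y - a*x)^2) + penalty * a^2 for one feature."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + penalty)


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coefficient(xs, ys, penalty):
    """Minimiser of sum((y - a*x)^2) + penalty * |a| for one feature."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(sxy, penalty / 2.0) / sxx


# Hypothetical data from y = 2x; the unpenalised coefficient is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
ridge_small = ridge_coefficient(xs, ys, 14.0)    # shrunk, but non-zero
lasso_small = lasso_coefficient(xs, ys, 28.0)    # shrunk, but non-zero
lasso_large = lasso_coefficient(xs, ys, 100.0)   # forced to exactly zero
```

Ridge only scales the coefficient down, whereas a sufficiently large lasso penalty sets it to exactly zero, which is the property that makes lasso useful for feature selection.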
2.1.3 Regression Models Beyond Linearity
Linear models extend naturally to complex non-linear terms, which can capture composite relationships between the predictors and the response. The family of non-linear regression models includes step functions, exponential models, local regression, smoothing and regression splines, and polynomial regression. Alternatively, Generalized Additive Models (GAMs) [48] maintain the additivity of the original predictors \({x}_{1}, \dots , {x}_{n}\), and the relation between every feature and the response y is expressed using nonlinear functions \({g}_{j}\left({x}_{j}\right)\) such as
\[y={\alpha }_{0}+\sum_{j=1}^{n}{g}_{j}\left({x}_{j}\right)\]
GAMs preserve a level of predictor interpretability comparable to linear models while increasing the flexibility and accuracy of prediction, in a way similar to non-parametric models such as boosting and random forests. Interactions between predictors can be expressed through terms of the form \({x}_{i}\times {x}_{j}\). GAMs are reported to be less effective in scenarios where observations exceed predictors. Piecewise affine forms are suitable models when the underlying function is separable, discontinuous, or too irregular to be captured by complex nonlinear expressions [49, 50].
2.2 Classification
Classification refers to the segregation or mapping of unlabelled data items into specific categories based on a training dataset (\(A, B\)) in which every input \({a}_{i}\) has a predefined class label \({B}_{i}\). Classification covers binary and multiclass approaches, including logistic regression, Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), and decision tree mechanisms [51].
2.2.1 Logistic Regression
In critical domains, a functional relationship \(y=g\left(x\right)\) between \(y\) and \(x\) may be absent. In this situation, the relation between \(y\) and \(x\) has to be described more generally by a probability distribution \(E\left(y\mid x\right)\), assuming that the training data consists of independent samples from \(E\). Here the label \(y\) is assumed to be binary, i.e., \(y\in \left\{0,1\right\}\), and the best class membership decision is to choose the label \(y\) that maximizes \(E\left(y\mid x\right)\). Logistic regression models the probability of belonging to one of the two classes of the dataset by [52]
\[E\left(y=1\mid x\right)=\frac{1}{1+{e}^{-\left({\alpha }_{0}+{\alpha }^{\top }x\right)}}\]
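The decision boundary between the two classes is the hyperplane described by \({\alpha }_{0}+{\alpha }^{\top }x=0\). The parameters \({\alpha }_{0}\) and \(\alpha\) are obtained by the maximum-likelihood estimation method
\[\underset{{\alpha }_{0},\alpha }{\mathrm{max}}\;\sum_{i=1}^{k}\left[{y}_{i}\,\mathrm{log}\,{p}_{i}+\left(1-{y}_{i}\right)\mathrm{log}\left(1-{p}_{i}\right)\right],\quad {p}_{i}=\frac{1}{1+{e}^{-\left({\alpha }_{0}+{\alpha }^{\top }{x}_{i}\right)}}\]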
To arrive at a globally optimal solution, two families of methods come into play. First-order methods such as gradient descent locate a minimum of a differentiable function by taking repeated steps opposite to the gradient at the current point, i.e., in the steepest descent direction. Second-order methods such as Newton's method fit, at each iteration, a parabola to the graph of a twice-differentiable function at a trial value p and then move to the minimum or maximum of that parabola (its stationary point). Further tuning of logistic regression models can be achieved by variable selection to avoid overfitting, using forward selection to add variables or backward elimination to withdraw variables based on the statistical relevance of the coefficients.
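The first-order approach can be sketched as plain gradient descent on the negative log-likelihood of a one-feature logistic model. The learning rate, step count, and toy data are assumptions chosen for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Gradient descent on the average negative log-likelihood."""
    alpha0, alpha1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = 0.0
        grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(alpha0 + alpha1 * x) - y  # predicted minus label
            grad0 += error
            grad1 += error * x
        # Step against the gradient (steepest descent direction).
        alpha0 -= learning_rate * grad0 / n
        alpha1 -= learning_rate * grad1 / n
    return alpha0, alpha1


# Hypothetical, non-separable toy data: larger x makes label 1 more likely.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
alpha0, alpha1 = fit_logistic(xs, ys)
p_low = sigmoid(alpha0 + alpha1 * 0.0)
p_high = sigmoid(alpha0 + alpha1 * 5.0)
```

Because the negative log-likelihood of logistic regression is convex, the minimum reached this way is global; Newton's method would typically need far fewer iterations at the cost of computing second derivatives.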
2.2.2 Decision Trees
Classification is often associated with a non-parametric model, Decision Trees (DT), which reach a conclusive decision on any hypothetical or real-world instance using decision rules expressed as a tree data structure. The model's prediction for each segment of the training data is based on a statistical indicator (such as the mean, median, or mode). DTs are good for large datasets with few dimensions and can handle both numerical and categorical values. Entropy is calculated for each candidate node from the class proportions, represented as \(H\left(s\right)= - {\sum }_{i=1}^{j} {p}_{i}\,log\,{p}_{i}\), where ‘H’ represents the entropy of the sample set ‘s’ and ‘\({p}_{i}\)’ is the proportion of elements of class ‘i’ in the data. Similarly, the Gini Impurity, given as \(Gini=1- {\sum }_{i=1}^{j} {\left({p}_{i}\right)}^{2}\), evaluates the impurity of each candidate node, so the split with the least impurity can be picked easily. The Information Gain (IG), which quantifies the quality of a split, is the entropy before the split minus the weighted entropy of the resulting child nodes \({s}_{c}\), represented as
\[IG=H\left(s\right)-\sum_{c}\frac{\left|{s}_{c}\right|}{\left|s\right|}H\left({s}_{c}\right)\]
simplifying it to \(IG\left(s, a\right)\). This can be estimated as
\[IG\left(s,a\right)=H\left(s\right)-H\left(s\mid a\right)\]
where ‘H(s)’ is the entropy of the data and ‘H(s|a)’ is the entropy for the data given the variable ‘a’. To avoid overfitting of data, pruning along with other techniques such as Smit and Konin are taken into consideration. Pruning of a tree is an essential measure to ensure unbiased decisions, represented as
\[{R}_{\alpha }\left(T\right)=R\left(T\right)+\alpha \left|T\right|\]
where ‘R(T)’ is the total misclassification rate of the terminal nodes, ‘|T|’ is the number of terminal nodes, and ‘\({R}_{\alpha }\left(T\right)\)’ is the cost complexity measure. Various recursive procedures split the training dataset into successively smaller segments. Since recursive procedures are greedy by nature, they sometimes fail to settle at the global optimum, which motivates alternatives such as heuristic approaches based on mathematical programming paradigms (i.e., linear optimization) and dynamic programming. Consider an example of a simple classification tree, where the tree determines the health status and need for exercise of elderly people based on their activities. Figure 2 represents the decision process. Okaty et al. [53] propose a stratum-based DT model for precise localization of anatomical landmarks in clinical image analysis. Liang et al. [54] provide an effective and privacy-preserving DT classification strategy for health monitoring systems (PPDT). They turn a DT classifier into a boolean trajectory, then encode it with symmetric key encryption. Zhu et al. [55] present a Multi-ringed (MR) Forest framework based on DTs for the reduction of false positives in pulmonary nodule detection. Algorithms that generate decision trees from the supplied data include Classification and Regression Tree (CART), Iterative Dichotomiser 3 (ID3), C4.5, etc.
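The impurity measures used to choose a split can be sketched as follows. The helper names and the toy labels are assumptions for illustration; base-2 logarithms are used, so entropy is measured in bits.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of the class distribution in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with two balanced classes and a split that separates them.
parent = ["healthy", "healthy", "at_risk", "at_risk"]
children = [["healthy", "healthy"], ["at_risk", "at_risk"]]
parent_entropy = entropy(parent)
parent_gini = gini_impurity(parent)
gain = information_gain(parent, children)
```

A perfectly separating split removes all uncertainty, so its information gain equals the parent entropy; a greedy tree builder evaluates every candidate split this way and keeps the one with the highest gain (or lowest impurity).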
2.2.3 SVM
Among supervised machine learning algorithms in the statistical learning category, SVMs receive particular attention from an optimization perspective. SVMs aim to identify a hyperplane with maximum margin separating two classes. Given a training set \(\left(A, B\right)\) with \(k\) training inputs, where \(A\in {R}^{k\times d}\) and \(B\in {\left\{-1, 1\right\}}^{k}\) is the binary response variable, SVM identifies the separating hyperplane as \({w}^{\top }a+\gamma =0\). Here \(w\) represents the vector of coefficients for the input variables and \(\gamma\) is the intercept of the separating hyperplane [56].
2.2.3.1 Hard margin SVM
Hard margin SVM is the simplest version of SVMs and proceeds with the assumption that a hyperplane exists which perfectly separates the data into two classes without misclassification. This optimization problem is categorized as a linearly constrained convex quadratic problem. Training this model identifies a hyperplane that separates the data while keeping the distance from the hyperplane to the closest data point at its maximum. The distance of a data point \({a}_{i}\) to the hyperplane is given by
\[\frac{\left|{w}^{\top }{a}_{i}+\gamma \right|}{{\| w\| }_{2}}\]
where \({\| w\| }_{2}\) expresses the norm-2. Therefore, the data points with labels \({B}_{i}= -1\) are on one side of the hyperplane such that \({w}^{\top }{a}_{i}+\gamma \le -1\), while the data points with labels \({B}_{i}= 1\) are on the other side, \({w}^{\top }{a}_{i}+\gamma \ge 1\). To find the hyperplane, the following optimization problem has to be solved,
\[\underset{w,\gamma }{\mathrm{min}}\;\frac{1}{2}{\| w\| }_{2}^{2}\]
s.t. \({B}_{i}\left({w}^{\top }{a}_{i}+\gamma \right)\ge 1\;\forall i=1,\dots , k\), \(w \in {R}^{d}\), \(\gamma \in R\), which is recognized as a convex quadratic problem. In practice, data is often not linearly separable, so forcing separability by a linear hyperplane trades off accuracy and limits the practicability of this version of SVM; this is where soft-margin SVMs outperform hard-margin SVMs.
2.2.3.2 Soft margin SVM
The convex quadratic problem becomes infeasible when the data is not linearly separable. An alternative is to minimize the average error instead. To limit the data points lying on the wrong side of the hyperplane, a slack variable \({\xi }_{i}\ge 0\) is introduced in the constraints and penalized in the objective function as a proxy for the misclassification error. The soft-margin optimization problem is stated as
\[\underset{w,\gamma ,\xi }{\mathrm{min}}\;\frac{1}{2}{\| w\| }_{2}^{2}+P\sum_{i=1}^{k}{\xi }_{i}\quad s.t.\;{B}_{i}\left({w}^{\top }{a}_{i}+\gamma \right)\ge 1-{\xi }_{i}\;\forall i=1,\dots ,k\]
where \(w\in {R}^{d}\), \(\gamma \in R\), \({\xi }_{i}\ge 0\), and \(P>0\) is a hyperparameter weighting the penalty on the slack variables. Another alternative is to penalize the error terms \({\xi }_{i}\) in the objective function with the squared hinge loss \({\sum }_{i=1}^{k}{\xi }_{i}^{2}\) instead of the hinge loss \({\sum }_{i=1}^{k}{\xi }_{i}\), as a variant of soft-margin SVM. Replacing norm-2 with norm-1 leads to a linear optimization problem, though at the cost of a higher misclassification rate.
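For a fixed candidate hyperplane, the slack of each point and the value of the soft-margin objective can be evaluated directly. The sketch below does only that evaluation (it is not a solver); the function names, the `penalty` weight, and the data are assumptions for illustration.

```python
def slack(w, gamma, point, label):
    """Hinge slack: how far a point falls short of the margin (0 if satisfied)."""
    score = sum(wj * aj for wj, aj in zip(w, point)) + gamma
    return max(0.0, 1.0 - label * score)


def soft_margin_objective(w, gamma, points, labels, penalty):
    """0.5 * ||w||^2 plus penalty times the total slack."""
    regulariser = 0.5 * sum(wj * wj for wj in w)
    total_slack = sum(slack(w, gamma, a, b) for a, b in zip(points, labels))
    return regulariser + penalty * total_slack


# Hypothetical candidate hyperplane x1 = 0 and four labelled points.
w = [1.0, 0.0]
gamma = 0.0
points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
labels = [1, -1, 1, -1]
slacks = [slack(w, gamma, a, b) for a, b in zip(points, labels)]
objective = soft_margin_objective(w, gamma, points, labels, penalty=1.0)
```

The first two points lie outside the margin and incur no slack, the third lies inside the margin on the correct side, and the fourth is misclassified; a quadratic programming solver would search over the hyperplane parameters to minimize this objective.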
2.2.3.3 Sparse SVM
Various approaches have been proposed to deal with sparsity (feature selection in the classification model) in SVMs, among which the 1-norm and the elastic net (both 1-norm and 2-norm) are common. The elastic net tunes the bias towards one of the norms using a hyperparameter [57]. The number of features selected can be modeled in the soft-margin optimization problem by using binary variables \({\rm Z}\in {\left\{0, 1\right\}}^{d}\), where \({\rm Z}_{j}=1\) indicates that feature \(j\) is selected and \({\rm Z}_{j}=0\) otherwise. Adding a constraint that restricts the number of features to a desired value \(r\) results in a mixed-integer quadratic problem
\[\underset{w,\gamma ,\xi ,{\rm Z}}{\mathrm{min}}\;\frac{1}{2}{\| w\| }_{2}^{2}+P\sum_{i=1}^{k}{\xi }_{i}\]
s.t. \({B}_{i}\left({w}^{\top }{a}_{i}+\gamma \right)\ge 1-{\xi }_{i}\;\forall i=1,\dots , k\), \(w\in {R}^{d}\), \(\gamma \in R\), \({\xi }_{i}\ge 0\), \({\sum }_{j=1}^{d} {\rm Z}_{j}=r\), where \(P>0\) weights the penalty on the slack variables.
2.2.4 SVR
Support Vector Regression (SVR) is a supervised machine learning technique designed to handle regression problems. Regression analysis comes in handy when observing the relationship between one or more predictor variables and a dependent variable, since it can balance the complexity of the model against the prediction error [58]. SVR is an extension of the classic SVM, which was introduced for binary classification; its core idea is to find a linear function \(f\left(a\right)={w}^{\top }a+\gamma\) that approximates the training set (\(A, B\)), where \(B\in {R}^{k}\), within a tolerance \(\varepsilon\) [59]. SVR has shown strong performance on high-dimensional regression problems. SVR uses an approach similar to SVM, performing regression using hyperplanes defined by a few support vectors, and can also handle non-linear regression competently [60]. However, a linear function within the tolerance might not always exist, so slack variables \({\xi }_{i}^{-}\ge 0\) and \({\xi }_{i}^{+}\ge 0\) expressing deviations beyond the tolerance are introduced and minimized in the same way as in soft-margin SVMs. The optimization problem is stated as
\[\underset{w,\gamma ,{\xi }^{+},{\xi }^{-}}{\mathrm{min}}\;\frac{1}{2}{\| w\| }_{2}^{2}+P\sum_{i=1}^{k}\left({\xi }_{i}^{+}+{\xi }_{i}^{-}\right)\quad s.t.\;{B}_{i}-{w}^{\top }{a}_{i}-\gamma \le \varepsilon +{\xi }_{i}^{+},\;{w}^{\top }{a}_{i}+\gamma -{B}_{i}\le \varepsilon +{\xi }_{i}^{-},\;{\xi }_{i}^{+},{\xi }_{i}^{-}\ge 0\]
Tuning the hyperparameter (P) adjusts the weight on deviations from the tolerance \(\varepsilon\). This deviation is measured by the \(\varepsilon\)-insensitive loss function \({\left|\xi \right|}_{\varepsilon }\), given by
\[{\left|\xi \right|}_{\varepsilon }=\left\{\begin{array}{ll}0& \text{if }\left|\xi \right|\le \varepsilon \\ \left|\xi \right|-\varepsilon & \text{otherwise}\end{array}\right.\]
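The \(\varepsilon\)-insensitive loss can be written in a few lines; the function name and the sample residuals below are assumptions used only to show the behavior inside and outside the tolerance band.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Zero inside the tolerance band, linear in |residual| outside it."""
    return max(0.0, abs(residual) - epsilon)


# Hypothetical residuals (label minus prediction) with tolerance 0.1.
inside_band = epsilon_insensitive_loss(0.05, 0.1)
above_band = epsilon_insensitive_loss(0.5, 0.1)
below_band = epsilon_insensitive_loss(-0.5, 0.1)
```

Residuals smaller than the tolerance cost nothing, which is why only the points on or outside the band (the support vectors) influence the fitted function.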
2.3 Clustering
Clustering is a widely used class of unsupervised learning that focuses on grouping a set of objects into smaller clusters of similar items. This common statistical data analysis technique finds application in pattern recognition, bioinformatics, data compression, image analysis, and information retrieval. Healthcare sectors collect massive amounts of data from various healthcare service providers, including patient information, medical tests, and treatment specifics. Because of the intricacy of the data obtained, analyzing it to make decisions about a patient's health state is difficult, and healthcare practitioners currently use numerous strategies, clustering among them, for this purpose. Clustering divides huge datasets into smaller groups based on related properties [61] and is usually used to find commonalities between data points. Given an input \(A\in {R}^{k\times d}\), which includes k unlabelled observations \({a}_{1}, {a}_{2}, \dots , {a}_{k}\) with \({a}_{i}\in {R}^{d}\), clustering aims to obtain \(K\) subsets of \(A\), i.e., individual clusters, which are homogeneous as well as well separated. The number of clusters acts as a tuning parameter that needs to be fixed before the clusters are examined. The degree of separation and homogeneity can be modeled with different criteria, which gives rise to several types of clustering algorithms such as K-means Clustering, Capacitated Clustering, Hierarchical Clustering, etc.
2.3.1 K-Means
K-means clustering, or minimum sum-of-squares clustering, is a vector quantization method that aims to partition the \(k\) data observations into \({\rm K}\) disjoint clusters, with each observation assigned to the cluster with the nearest mean. The number of clusters is decided by close examination of the elbow curve, by similarity indicators such as the Calinski-Harabasz index or silhouette values, or via mathematical programming approaches [62]. With binary variables \({\varphi }_{ij}\), equal to 1 if observation \(i\) belongs to cluster \(j\) and 0 otherwise, and the centroid \({\mu }_{j} \in {R}^{d }\) of each cluster \(j\), the problem of minimizing the within-cluster variance is stated as a nonlinear problem [63]
\[\mathrm{min}\;\sum_{i=1}^{k}\sum_{j=1}^{\rm K}{\varphi }_{ij}{\| {a}_{i}-{\mu }_{j}\| }_{2}^{2}\]
\(s.t.\, {\sum }_{j=1}^{\rm K} {\varphi }_{ij}=1,\forall i=1,2,\dots ,k\), and \({\varphi }_{ij}\in \left\{0,1\right\}\), \({\mu }_{j} \in {R}^{d}\), \(\forall j=1,2,\dots ,{\rm K}\). Introducing the variable \({\delta }_{ij}\), which denotes the distance of observation \(i\) from centroid \(j\), the following formulation with a linear objective is obtained, where \({M}_{j}\) is a sufficiently large constant
\[\mathrm{min}\;\sum_{i=1}^{k}\sum_{j=1}^{\rm K}{\delta }_{ij}\quad s.t.\;{\delta }_{ij}\ge {\| {a}_{i}-{\mu }_{j}\| }_{2}^{2}-{M}_{j}\left(1-{\varphi }_{ij}\right),\;{\delta }_{ij}\ge 0\]
\(s.t. \;{\sum }_{j=1}^{\rm K}{\varphi }_{ij}=1, \forall i=1,2,\dots ,k\) and \(\forall j=1,2,\dots , {\rm K}.\) Apart from the above-mentioned methods, several alternatives such as a heuristic approach based on the gradient method, a bundle approach, and a column generation approach are in practice. Figure 3 shows the clusters together with their K-means centroids, each distinctly separated.
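In practice the K-means objective is usually minimized heuristically rather than by solving the mathematical program exactly. The sketch below uses the standard alternating heuristic (Lloyd's algorithm), which is a swapped-in technique and not one of the formulations above; the initial centroids, iteration count, and toy points are assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D observations forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
initial = [(0.0, 0.0), (10.0, 10.0)]
centroids, clusters = k_means(points, initial)
```

Each iteration can only decrease the within-cluster sum of squares, but the result depends on the initial centroids and may be a local rather than global optimum.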
2.3.2 Capacitated Clustering
The Capacitated Centred Clustering Problem (CCCP) aims to form a set of clusters with limited capacity, where homogeneity is measured by the similarity of observations to the cluster's mean. Considering a group of candidate clusters \(1, 2, \dots , {\rm K},\) CCCP can be mathematically represented as
\[\mathrm{min}\;\sum_{i=1}^{k}\sum_{j=1}^{\rm K}{\beta }_{ij}{\varphi }_{ij}\]
\(s.t.\,\) \({\sum }_{j=1}^{\rm K} {\varphi }_{ij}=1, {\sum }_{j=1}^{\rm K} {\upsilon }_{j}\le {\rm K},\) \({\varphi }_{ij}\le {\upsilon }_{j}, {\sum }_{i=1}^{k} {q}_{i}{\varphi }_{ij}\le {Q}_{j}\), where \({\rm K}\) is the upper bound on the number of clusters, \({\beta }_{ij}\) represents the measure of dissimilarity between cluster \(j\) and observation \(i\), \({Q}_{j}\) is the capacity of cluster \(j\), and \({q}_{i}\) is the weight of observation \(i\). Variable \({\varphi }_{ij}\) denotes the assignment of \(i\) to \(j\), and variable \({\upsilon }_{j}\) is equal to 1 when cluster \(j\) is used. If \({\beta }_{ij}\) is a distance and the clusters are homogeneous, then the formulation also models the well-known facility location problem [64].
2.4 Linear Dimension Reduction
Linear dimensionality reduction methods have long been developed in statistics and applied fields and have become an indispensable tool for analysing high-dimensional and noisy data. These methods improve the model's interpretability by producing a low-dimensional linear mapping of the original high-dimensional data that preserves features of interest in the output sample [65].
2.4.1 Principal Components
Principal component analysis (PCA) aims to minimize the sum of squared residual errors between the original high-dimensional data and the projected data points. PCA is usually assessed in terms of explained variance, which refers to the amount of information retained from the original feature set \({a}_{1}, {a}_{2}, \dots , {a}_{d}.\) PCA was formulated originally as
\[\underset{{\phi }^{1}}{\mathrm{min}}\;\sum_{i=1}^{k}{\| {a}_{i}-\left({\phi }^{{1}^{\top }}{a}_{i}\right){\phi }^{1}\| }_{2}^{2}\]
\(s.t.\, {\sum }_{j=1}^{d} {\left({\phi }_{j}^{1}\right)}^{2}=1,\) where \({\phi }^{1}\in {R}^{d}\) is a unit vector. The problem above is sensitive to the presence of outliers. The original formulation was later shown to be equivalent to the "maximizing variance" derivation, which for the \({h}^{th}\) principal component is given as
\[\underset{{\phi }^{h}}{\mathrm{max}}\;{\phi }^{{h}^{\top }}S{\phi }^{h}\]
\(s.t.\, {\sum }_{j=1}^{d} {\left({\phi }_{j}^{h}\right)}^{2}=1\) and \({\phi }^{{h}^{\top }}S{\phi }^{\iota }=0\;\forall \iota =1, 2, \dots , h-1\), where \(S\) is the sample covariance matrix. PCA finds application in various data analytics problems that benefit from dimensionality reduction. For linear regression models, there exists Principal Component Regression (PCR), a two-stage procedure that inherits the properties of PCA, with the advantage of using fewer predictors and reduced prediction time on the same dataset. Among all the positive outcomes of PCA, the main known drawback is interpretability.
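The first principal component, i.e., the unit direction of maximum variance, can be approximated with power iteration on the sample covariance matrix. This is a minimal sketch under the assumptions that the data is not constant and that the arbitrary starting vector is not orthogonal to the leading direction; all names and the toy data are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance (divided by n) of the mean-centered rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
            for i in range(d)]


def first_principal_component(rows, iterations=100):
    """Power iteration: repeatedly apply S and renormalise to unit length."""
    s = covariance_matrix(rows)
    d = len(s)
    v = [1.0] + [0.0] * (d - 1)  # assumed start, not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(s[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical 2-D data lying exactly on the line y = x.
rows = [(-2.0, -2.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 2.0)]
component = first_principal_component(rows)
```

For this data all variance lies along the diagonal, so the recovered direction has equal components; later components would be found the same way after removing the variance already explained.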
3 Problems in Healthcare Sector
A shift toward a data-driven socioeconomic approach to health is taking place. It is driven by the increased volume, velocity, and diversity of data obtained from the public and private sectors across healthcare and the natural sciences. Over the last five years, there has been remarkable advancement in informatics technologies and computational intelligence for use in health and biomedical sciences. However, the full potential of data to address the breadth and extent of human health problems has yet to be realized. The properties of health data present intrinsic limitations to the effective implementation of typical data mining and ML technologies. Aside from their volume ('Big Data'), health data are difficult to manage because of their complexity, heterogeneity, dynamic nature, and unpredictability. Finally, practical obstacles in applying new and current standards across different health providers and research organizations have hindered data management and the interpretability of the results. Oliveira et al. [125].
5.1.2 Environmental and Instrumental Noise
Digital data collection and regulation is seldom free of environmental and instrumental disturbances. Slight agitation during certain diagnostic procedures, such as multishot MRI where extensive supervision is required, can introduce undesirable noise into the acquired data, thereby increasing the risk of misdiagnosis.
5.2 Vulnerabilities Due to Data Annotation
ML/DL applications require extensive model training to achieve good predictive performance. In medical applications, most models are trained on clinically produced images, and every sample must be annotated. This tedious labeling task should ideally be performed by clinical experts, who can prepare domain-enriched datasets, or by automated algorithms [126]. Professionals tend to avoid secondary tasks such as data labeling because it consumes much of their valuable time, so trainee staff with little domain expertise are employed instead. This leads to problems such as coarse labels, misclassification, and class imbalance. Several vulnerabilities due to data annotation are described below.
5.2.1 Ambiguous Ground Truth
Finlayson et al. [127] presented a study highlighting the ambiguity of the ground truth in medical datasets. Even well-defined diagnostic tasks are disputed among clinical experts. In addition, mishandling and malicious attacks by some users make diagnosis, and hence treatment, difficult even under expert supervision.
5.2.2 Improper Annotation
Proper annotation of data samples is critical for life-saving healthcare applications. ML/DL mechanisms are deployed for automated image labeling, which can lead to problems such as coarse-grained labels and mislabelling [128]. These problems may undermine the predictive capabilities of healthcare systems, as discussed next.
5.2.3 Efficiency Challenges
Efficacy is the prime factor in monitoring the performance of an ML/DL-based system. Particular challenges that affect data quality, and consequently performance, are limited and imbalanced datasets, class imbalance and bias, and sparsity. Newly identified diseases have little available history, and this limitation degrades a model's performance in predicting their outcomes. Class imbalance is a common problem in supervised ML/DL models that arises from a non-uniform distribution of data among the classes. Data sparsity refers to missing values in the input data that arise from skipped or unreported samples. All of these problems significantly affect the functioning of ML/DL techniques.
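Both class imbalance and sparsity can be quantified before training. The Python sketch below is illustrative only: it computes inverse-frequency class weights (a common heuristic, assumed here and not prescribed by the text) and the fraction of missing values per feature. The function names and the toy arrays are hypothetical.

```python
import numpy as np

def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

def missing_fraction(X):
    """Fraction of missing (NaN) entries in each feature column."""
    return np.isnan(X).mean(axis=0)

# Toy labels: 90 negative cases and 10 positive cases.
labels = np.array([0] * 90 + [1] * 10)
# Toy feature matrix with unreported measurements encoded as NaN.
X = np.array([[1.0, np.nan],
              [2.0, 0.5],
              [np.nan, np.nan],
              [4.0, 1.5]])

weights = class_weights(labels)
sparsity = missing_fraction(X)
```

A large weight for the rare class or a high missing fraction for a feature is a signal that resampling, reweighting, or imputation may be needed before the model's reported performance can be trusted.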
5.3 Vulnerabilities in Model Training
Vulnerabilities in ML/DL model training include improper training, model poisoning, privacy infringement, and incomplete data rendering. Improper training means feeding the model inappropriate parameters (such as the number of epochs or the test/training ratio), which leaves it prone to incorrect inferences. ML/DL models are also exposed to cyber-attacks, such as adversarial, Trojan, and backdoor attacks, that breach the integrity of the underlying system [134].
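To make the notion of an adversarial attack concrete, the following Python sketch applies a one-step sign-gradient perturbation (in the style of the fast gradient sign method) to a small logistic-regression model. This is one well-known attack chosen for illustration and is not claimed to be the attack studied in [134]; the weights, the input, and the step size `eps` are made-up values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sign_gradient_perturb(x, y, w, b, eps):
    """Shift each input feature by eps in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input x is (sigmoid(w.x + b) - y) * w.
    """
    grad_x = (sigmoid(w @ x + b) - y) * w
    return x + eps * np.sign(grad_x)

# Hypothetical trained model and a correctly classified positive sample.
w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([0.4, 0.1, 0.2])
y = 1

p_clean = sigmoid(w @ x + b)                       # probability before the attack
x_adv = sign_gradient_perturb(x, y, w, b, eps=0.3)
p_adv = sigmoid(w @ x_adv + b)                     # probability after the attack
```

Each feature moves by at most `eps`, yet the predicted probability for the true class drops below 0.5, so the decision flips. In a clinical setting the same mechanism corresponds to small changes in an input that alter the predicted diagnosis.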
6.4 Availability of Quality Data
Another shortcoming of the healthcare ecosystem is the limited availability of diverse, good-quality data. Every day, an extensive amount of heterogeneous patient information is generated across medical institutions, yet only an inadequate amount of useful data is made available to researchers and the scientific community. Producing high-quality practical data requires resources and services with good maintenance and management. An ample supply of quality data would enable professionals to develop systems for illness prediction and treatment. Data collected during practice can have issues such as bias and redundancy, which are reflected as adverse outcomes in the algorithms. Intelligent systems cannot distinguish racial bias from fair judgment, because they learn from the human actions recorded in the data. For example, when people without health coverage are denied medical services, research has shown that an AI system trained on such data could produce racially biased predictions [135]. The training data also contributes to modeling challenges [136,137,138].
6.5 Causality is Challenging
Causality can be challenging from a medical perspective. Understanding the importance of reasoning about "What if?" when making decisions in crucial healthcare problems is essential [139]. Consider a situation in which we need to analyse how the outcome would change if the doctor prescribed treatment 1 rather than treatment 2. Queries of this kind cannot be answered by analysing medical data alone; they require causal reasoning. In healthcare applications, learning and inferring from observational data is the norm, but drawing causal conclusions from such data is challenging and requires building causal models. ML/DL models lack fundamental reasoning under the hood and produce output based on correlations and patterns without considering the causal links in between. In practical applications, this lack of causal analysis may raise concerns about the predictions of AI systems. Acknowledging the causal effect of certain variables on target outcomes is paramount for fair predictive behaviour.
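The gap between correlation and causal effect can be shown with a small simulation. The Python sketch below is built entirely on assumed numbers: disease severity influences both which treatment a patient receives and the outcome, and the true benefit of treatment 1 is fixed at 1.0. The naive comparison of treated and untreated patients is then contrasted with an estimate adjusted by stratifying on severity, which is one simple form of causal adjustment among many.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
TRUE_EFFECT = 1.0  # assumed benefit of treatment 1 over treatment 2

# Confounder: severe cases are more likely to receive treatment 1
# and also have worse outcomes regardless of treatment.
severe = rng.random(n) < 0.5
treated = rng.random(n) < np.where(severe, 0.8, 0.2)
outcome = TRUE_EFFECT * treated - 2.0 * severe + rng.normal(size=n)

# Correlation-based estimate: compare group means directly.
naive = outcome[treated].mean() - outcome[~treated].mean()

def stratum_effect(mask):
    """Treated-minus-untreated mean outcome within one severity stratum."""
    return outcome[mask & treated].mean() - outcome[mask & ~treated].mean()

# Adjusted estimate: average the within-stratum effects by stratum size.
share_severe = severe.mean()
adjusted = (share_severe * stratum_effect(severe)
            + (1 - share_severe) * stratum_effect(~severe))
```

Under these assumptions the naive estimate suggests that treatment 1 is harmful, while the adjusted estimate recovers a value close to the true benefit. A purely pattern-based model trained on such data would inherit the naive conclusion.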
6.6 Updating Hospital Infrastructure is Inflexible
Healthcare organizations favor independent operation and mostly avoid sharing information. Frictionless information exchange requires fixing and updating antiquated software, which can be time-consuming and is often not cost-effective. Finlayson et al. [127] reported that most hospitals were still operating on the ninth version of the International Classification of Diseases (ICD) system well into the 2010s, even though the updated version, ICD-10, had been released in the early 1990s. The difficulties in upgrading hospital infrastructure and internal management systems can raise concerns about the applicability of recent DL/ML practices.
7 Future Research Directions
In this section, various issues that require active research attention related to the security, privacy, and robustness of ML in the Healthcare ecosystem are discussed.
7.1 Machine Learning on the Edge
The use of ML in healthcare applications has grown exponentially in recent years. Research in ML has transformed traditional methods and moved toward smart, energy-efficient use of wearable devices, IoT sensors, etc. With the development of smart cities and transportable medical devices such as portable ventilators, oxygen concentrators, MRI machines, etc., there is a constant demand for refined ML models trained on Edge devices. This brings limitations, including a lack of hardware support and limited computational processing capability. ML on Edge devices is at a nascent stage and requires attention from the research community. Growth in this domain will lead to faster care in critical situations and continuous monitoring of patients' health from remote locations, thereby improving healthcare facilities for a better lifestyle and timely medical assistance.
7.2 Handling Dataset Annotation
The output of AI systems is highly dependent on labeled datasets for training and inference. This requires medical experts and physiologists to annotate medical data (such as images, clinical reports, signals, etc.) manually, spending much of their valuable time on this tedious work. A variety of practical medical data annotated with accurate labels would improve the evaluation of ML/DL models and expose weaknesses that might otherwise go unnoticed. However, manually labeling data into the respective classes is demanding, tedious, and draining. Automatic approaches such as active learning should be adopted and developed to address this impediment.
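Active learning reduces the annotation burden by asking experts to label only the samples the current model is least sure about. The Python sketch below shows entropy-based uncertainty sampling, one common query strategy assumed here for illustration. The function name `select_for_annotation`, the annotation budget, and the probability table are hypothetical.

```python
import numpy as np

def select_for_annotation(probabilities, budget):
    """Indices of the `budget` unlabeled samples with the highest predictive entropy."""
    p = np.clip(probabilities, 1e-12, 1.0)     # avoid log(0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return np.argsort(entropy)[::-1][:budget]

# Hypothetical class probabilities predicted for four unlabeled scans.
probabilities = np.array([
    [0.98, 0.02],   # confident prediction
    [0.55, 0.45],   # uncertain
    [0.80, 0.20],
    [0.50, 0.50],   # maximally uncertain
])

queried = select_for_annotation(probabilities, budget=2)
```

Only the queried samples are sent to the clinical expert; the model is then retrained and the selection is repeated, so expert time is spent where a label is expected to be most informative.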
7.3 Distributed Data Management and ML
In healthcare systems, data generation is distributed, i.e., data is processed by various departments within a hospital and across geographically separate hospitals. This puts pressure on efficient data sharing and management for clinical analysis, particularly when using ML models. ML/DL models are typically developed on the general assumption that all analytical information is easily accessible and centrally available. The shortcomings caused by poorly managed information exchange need the attention of developers and researchers, who could collaboratively tackle the administration of distributed data and ML.
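One way to train without centralizing records is to fit a model at each site and share only the fitted parameters. The Python sketch below illustrates this with a single round of sample-size-weighted parameter averaging, in the spirit of federated learning. The text does not name this technique, so it should be read as one possible direction. The three simulated sites, their sizes, and the linear model are assumptions.

```python
import numpy as np

def local_fit(X, y):
    """Least-squares coefficients fitted on one site's private data."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def weighted_average(site_coefs, site_sizes):
    """Combine per-site coefficients, weighting each site by its sample count."""
    weights = np.asarray(site_sizes, dtype=float) / np.sum(site_sizes)
    return np.average(np.stack(site_coefs), axis=0, weights=weights)

# Simulate three hospitals that observe the same underlying relationship.
rng = np.random.default_rng(0)
true_coef = np.array([1.5, -0.5])
sites = []
for n in (50, 200, 120):
    X = rng.normal(size=(n, 2))
    y = X @ true_coef + rng.normal(scale=0.1, size=n)
    sites.append((X, y))

# Each site shares only its coefficients, never its raw records.
site_coefs = [local_fit(X, y) for X, y in sites]
global_coef = weighted_average(site_coefs, [len(y) for _, y in sites])
```

The averaged coefficients approach those of the underlying relationship even though no site sees another site's data. Real deployments involve repeated rounds, non-identical data distributions, and privacy safeguards that this sketch omits.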
7.4 Fair and Accountable ML
Qayyum et al. [140], in analyzing the robustness and security of ML/DL techniques, reasoned that the results of these models are biased and lack accountability. Ensuring the fairness and precision of predictions is of cardinal importance for life-critical applications in healthcare systems. Compromising the accuracy and accountability of these models could result in harmful outcomes and put patients' health at risk. The fairness of ML/DL predictions is affected by the variety of cases for which little data is available. Tuning models with fair judgment and interpretability in mind will make them more robust and help avoid repeating misjudgements present in past clinical records. Further study is needed in this area to develop dynamic methods that ensure safety and reduce imperfections.
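Fairness can be audited by comparing simple rates across patient groups. The Python sketch below computes, for each group, the share of positive predictions and the true-positive rate; the gaps between groups correspond to the commonly used demographic-parity and equal-opportunity criteria. These metrics are an assumption chosen for illustration and are not taken from [140]; the labels, predictions, and group names are made up.

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """Positive-prediction rate and true-positive rate for each group."""
    rates = {}
    for g in np.unique(group):
        in_group = group == g
        actual_positive = in_group & (y_true == 1)
        rates[str(g)] = {
            "positive_rate": float(y_pred[in_group].mean()),
            "true_positive_rate": float(y_pred[actual_positive].mean()),
        }
    return rates

# Hypothetical ground truth, model predictions, and group membership.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

rates = group_rates(y_true, y_pred, group)
parity_gap = abs(rates["A"]["positive_rate"] - rates["B"]["positive_rate"])
opportunity_gap = abs(rates["A"]["true_positive_rate"] - rates["B"]["true_positive_rate"])
```

In this toy case the model detects every true case in group A but only half of those in group B, a disparity that overall accuracy alone would not reveal.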
7.5 Model-Driven ML
The use of ML and AI for predictive analysis in healthcare applications comes with benefits as well as liabilities. Latif et al. [141] discussed the caveats associated with these tools; failing to recognize their shortcomings might prove critical in clinical terms. The strengths of these models can suggest that data, once available in abundance, can drive hypothesis generation without validation and interpretation by medical experts, which invites unavoidable problems. To avoid these pitfalls, it is important to combine data-driven methods with hypothesis- and model-based approaches to bring controlled precision to these studies. Building robust, secure, and accountable ML deliverables that are technically precise requires further research.
8 Conclusion
ML is driven by statistically informed algorithms, which fall into categories such as regression, classification, clustering, etc. All of these algorithms assist in building intelligent solutions for automating clinical tasks and detecting suspected diseases. The traditional practice of services provided by healthcare systems has changed considerably with the advent of ML- and DL-based approaches. However, to ensure secure, bias-free, and sound use of these models, the associated challenges should be addressed. This paper provides a brief introduction to several ML algorithms, discusses their strengths and limitations, and points to reliable practices for avoiding shortcomings in model building. It also provides a synopsis of the challenges arising in the ML deployment pipeline for healthcare infrastructure by classifying the different sources of risk in it. Finally, this work discusses possible solutions for providing users and clinical experts in a healthcare ecosystem with secure, robust, and privacy-protected ML solutions for privacy-sensitive applications. The paper closes by summarizing the potential applications of ML techniques in the healthcare sector and the privacy considerations linked with them.
References
Kumar A, Krishnamurthi R, Nayyar A, Sharma K, Grover V, Hossain E (2020) A novel smart healthcare design, simulation, and implementation using healthcare 4.0 processes. IEEE Access 8:118433–118471. https://doi.org/10.1109/ACCESS.2020.3004790
Yang G et al (2020) Homecare robotic systems for healthcare 4.0: visions and enabling technologies. IEEE J Biomed Health Inform 24(9):2535–2549. https://doi.org/10.1109/JBHI.2020.2990529
Nasr M, Islam MM, Shehata S, Karray F, Quintana Y (2021) Smart healthcare in the age of AI: recent advances, challenges, and future prospects. IEEE Access 9:145248–145270. https://doi.org/10.1109/ACCESS.2021.3118960
Bharadwaj HK et al (2021) A review on the role of machine learning in enabling IoT based healthcare applications. IEEE Access 9:38859–38890. https://doi.org/10.1109/ACCESS.2021.3059858
Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P-M, Larochelle H (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31
Zarrin PS, Roeckendorf N, Wenger C (2020) In-vitro classification of saliva samples of COPD patients and healthy controls using machine learning tools. IEEE Access 8:168053–168060. https://doi.org/10.1109/ACCESS.2020.3023971
Aslam AR, Altaf MAB (2020) An on-chip processor for chronic neurological disorders assistance using negative affectivity classification. IEEE Trans Biomed Circuits Syst 14(4):838–851. https://doi.org/10.1109/TBCAS.2020.3008766
Meneghetti L, Terzi M, Del Favero S, Susto GA, Cobelli C (2020) Data-driven anomaly recognition for unsupervised model-free fault detection in artificial pancreas. IEEE Trans Control Syst Technol 28(1):33–47. https://doi.org/10.1109/TCST.2018.2885963
Mehta J, Majumdar A (2017) Rodeo: robust de-aliasing autoencoder for real-time medical image reconstruction. Pattern Recogn 63:499–510
Bejnordi BE, Veta M, Van Diest PJ, Van Ginneken B, Karssemeijer N, Litjens G, Van Der Laak JA, Hermsen M, Manson QF, Balkenhol M et al (2017) Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318(22):2199–2210
Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542(7639):115
Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K et al (2017) CheXNet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv:1711.05225
Shishvan OR, Zois D, Soyata T (2018) Machine intelligence in healthcare and medical cyber physical systems: a survey. IEEE Access 6:46419–46494. https://doi.org/10.1109/ACCESS.2018.2866049
Li JP, Haq AU, Din SU, Khan J, Khan A, Saboor A (2020) Heart disease identification method using machine learning classification in e-healthcare. IEEE Access 8:107562–107582. https://doi.org/10.1109/ACCESS.2020.3001149
Kumar V, Recupero DR, Riboni D, Helaoui R (2021) Ensembling classical machine learning and deep learning approaches for morbidity identification from clinical notes. IEEE Access 9:7107–7126. https://doi.org/10.1109/ACCESS.2020.3043221
Paranjape K, Schinkel M, Nanayakkara P (2020) Short keynote paper: mainstreaming personalized healthcare-transforming healthcare through new era of artificial intelligence. IEEE J Biomed Health Inform 24(7):1860–1863. https://doi.org/10.1109/JBHI.2020.2970807
Al-Dhief FT et al (2020) A survey of voice pathology surveillance systems based on internet of things and machine learning algorithms. IEEE Access 8:64514–64533. https://doi.org/10.1109/ACCESS.2020.2984925
Alhussein M, Muhammad G (2018) Voice pathology detection using deep learning on mobile healthcare framework. IEEE Access 6:41034–41041. https://doi.org/10.1109/ACCESS.2018.2856238
Tsang G, Xie X, Zhou S-M (2020) Harnessing the power of machine learning in dementia informatics research: issues, opportunities, and challenges. IEEE Rev Biomed Eng 13:113–129. https://doi.org/10.1109/RBME.2019.2904488
Tong Y, Messinger AI, Luo G (2020) Testing the generalizability of an automated method for explaining machine learning predictions on asthma patients’ asthma hospital visits to an academic healthcare system. IEEE Access 8:195971–195979. https://doi.org/10.1109/ACCESS.2020.3032683
Fiaidhi J (2020) Envisioning insight-driven learning based on thick data analytics with focus on healthcare. IEEE Access 8:114998–115004. https://doi.org/10.1109/ACCESS.2020.2995763
El-Ganainy NO, Balasingham I, Halvorsen PS, Rosseland LA (2020) A new real time clinical decision support system using machine learning for critical care units. IEEE Access 8:185676–185687. https://doi.org/10.1109/ACCESS.2020.3030031
Sierra-Sosa D et al (2019) Scalable healthcare assessment for diabetic patients using deep learning on multiple GPUs. IEEE Trans Industr Inf 15(10):5682–5689. https://doi.org/10.1109/TII.2019.2919168
Kumar R, Dhiman G (2021) A comparative study of fuzzy optimization through fuzzy number. Int J Mod Res 1:1–14
Chatterjee I (2021) Artificial intelligence and patentability: review and discussions. Int J Mod Res 1:15–21
Arachchige PCM, Bertok P, Khalil I, Liu D, Camtepe S, Atiquzzaman M (2020) A trustworthy privacy preserving framework for machine learning in industrial IoT systems. IEEE Trans Ind Inf 16(9):6092–6102. https://doi.org/10.1109/TII.2020.2974555
Vaishnav PK, Sharma S, Sharma P (2021) Analytical review analysis for screening COVID-19. Int J Mod Res 1:22–29
Nair R, Soni M, Bajpai B, Dhiman G, Sagayam KM (2022) Predicting the death rate around the world due to COVID-19 using regression analysis. Int J Swarm Intell Res (IJSIR) 13(2):1–13
Sharma S, Gupta S, Gupta D, Juneja S, Singal G, Dhiman G, Kautish S (2022) Recognition of gurmukhi handwritten city names using deep learning and cloud computing. Sci Program
Zeidabadi FA, Doumari SA, Dehghani M, Montazeri Z, Trojovsky P, Dhiman G (2022) MLA: a new mutated leader algorithm for solving optimization problems. CMC—Comput Mater Continua 70(3):5631–5649
Zeidabadi FA, Doumari SA, Dehghani M, Montazeri Z, Trojovsky P, Dhiman G (2022) AMBO: all members-based optimizer for solving optimization problems. CMC—Comput Mater Continua 70(2):2905–2921
Alharbi Y, Alferaidi A, Yadav K, Dhiman G, Kautish S (2021) Denial-of-service attack detection over IPv6 network based on KNN algorithm. Wirel Commun Mobile Comput
Chinnasamy R, Deepika A, Senthil T (2018) Machine learning algorithms: A background artifact. Int J Eng Technol. 7:143–149
Okay FY, Yıldırım M, Özdemir S (2021) Interpretable machine learning: a case study of healthcare. In: 2021 international symposium on networks, computers and communications (ISNCC), pp 1–6. https://doi.org/10.1109/ISNCC52172.2021.9615727
Ileberi E, Sun Y, Wang Z (2021) Performance evaluation of machine learning methods for credit card fraud detection using SMOTE and AdaBoost. IEEE Access 9:165286–165294. https://doi.org/10.1109/ACCESS.2021.3134330
Ahsan M, Stoyanov S, Bailey C, Albarbar A (2020) Developing computational intelligence for smart qualification testing of electronic products. IEEE Access 8:16922–16933. https://doi.org/10.1109/ACCESS.2020.2967858
Hari N, Ahsan M, Ramasamy S, Sanjeevikumar P, Albarbar A, Blaabjerg F (2020) Gallium nitride power electronic devices modeling using machine learning. IEEE Access 8:119654–119667. https://doi.org/10.1109/ACCESS.2020.3005457
Seng KP, Ang L-M, Schmidtke LM, Rogiers SY (2018) Computer vision and machine learning for viticulture technology. IEEE Access 6:67494–67510. https://doi.org/10.1109/ACCESS.2018.2875862
Rehman ZU, Zia MS, Bojja GR, Yaqub M, Jinchao F, Arshid K (2020) Texture based localization of a brain tumor from MR-images by using a machine learning approach. Med Hypotheses 141:109705. https://doi.org/10.1016/j.mehy.2020.109705
Singh PD, Kaur R, Dhiman G, Bojja GR (2021) BOSS: a new QoS aware blockchain assisted framework for secure and smart healthcare as a service. Expert Syst e12838
Vyas P, Bojja G, Ambati LS, Liu J, Ofori M (2021) Prediction of patient willingness to recommend hospital: a machine learning-based exploratory study.
Xie Y, Li Y, Xia Z, Yan R (2020) An improved forward regression variable selection algorithm for high-dimensional linear regression models. IEEE Access 8:129032–129042. https://doi.org/10.1109/ACCESS.2020.3009377
Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. arXiv:1702.08608
Gambella C, Ghaddar B, Naoum-Sawaya J (2020) Optimization problems for machine learning: a survey. Eur J Oper Res
Bertsimas D, Copenhaver MS (2018) Characterization of the equivalence of robustification and regularization in linear and matrix regression. Eur J Oper Res 270(3):931–942
Bengio Y, Lodi A, Prouvost A (2018) Machine learning for combinatorial optimization: a methodological tour d’Horizon. arXiv:1811.06128
Bertsimas D, Van Parys B et al (2020) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Ann Stat 48(1):300–323
Laskowski M, Ambroziak SJ, Correia LM, Świder K (2020) On the usefulness of the generalised additive model for mean path loss estimation in body area networks. IEEE Access 8:176873–176882. https://doi.org/10.1109/ACCESS.2020.3025118
Yang X et al (2019) Piecewise linear regression based on plane clustering. IEEE Access 7:29845–29855. https://doi.org/10.1109/ACCESS.2019.2902620
D’Ambrosio C, Lodi A, Wiese S, Bragalli C (2015) Mathematical programming techniques in water network optimization. Eur J Oper Res 243(3):774–788
Baumann P, Hochbaum DS, Yang YT (2019) A comparative study of the leading machine learning techniques and two new optimization algorithms. Eur J Oper Res 272(3):1041–1057
Lan L, Wang Z, Zhe S, Cheng W, Wang J, Zhang K (2019) Scaling up kernel SVM on limited resources: a low-rank linearization approach. IEEE Trans Neural Netw Learn Syst 30(2):369–378. https://doi.org/10.1109/TNNLS.2018.2838140
Oktay O et al (2017) Stratified decision forests for accurate anatomical landmark localization in cardiac images. IEEE Trans Med Imaging 36(1):332–342. https://doi.org/10.1109/TMI.2016.2597270
Liang J, Qin Z, Xue L, Lin X, Shen X (2021) Efficient and privacy-preserving decision tree classification for health monitoring systems. IEEE Internet Things J 8(16):12528–12539
Zhu H et al (2020) MR-forest: a deep decision framework for false positive reduction in pulmonary nodule detection. IEEE J Biomed Health Inform 24(6):1652–1663. https://doi.org/10.1109/JBHI.2019.2947506
Ghaddar B, Naoum-Sawaya J (2018) High dimensional data classification and feature selection using support vector machines. Eur J Oper Res 265(3):993–1004
Vapnik VN (2000). The Nature of Statistical Learning Theory. https://doi.org/10.1007/978-1-4757-3264-1
Zhang F, O’Donnell LJ (2020) Support vector regression. Mach Learn. https://doi.org/10.1016/b978-0-12-815739-8.00007-9
Cafieri S, Costa A, Hansen P (2014) Reformulation of a model for hierarchical divisive graph modularity maximization. Ann Oper Res 222:213–226. https://doi.org/10.1007/s10479-012-1286-z
Üstün B, Melssen WJ, Buydens LMC (2007) Visualization and interpretation of support vector regression models. Anal Chim Acta 595(1–2):299–309. https://doi.org/10.1016/j.aca.2007.03.023
Kulis B, Jordan MI (2012) Revisiting k-means: new algorithms via Bayesian nonparametric. In Proceedings of the 29th international conference on machine learning (ICML ’12), pp. 513–520, Edinburgh, UK
Aloise D, Hansen P, Liberti L (2012) An improved column generation algorithm for minimum sum-of-squares clustering. Math Program 131:195–220. https://doi.org/10.1007/s10107-010-0349-7
Cunningham JP, Ghahramani Z (2015) Linear dimensionality reduction: survey, insights, and generalizations. J Mach Learn Res 16:2859–2900
Mandal S, Greenblatt AB, An J (2018) Imaging intelligence: AI is transforming medical imaging across the imaging spectrum. IEEE Pulse 9(5):16–24
Noothout JMH et al (2020) Deep learning-based regression and classification for automatic landmark localization in medical images. IEEE Trans Med Imaging 39(12):4011–4022. https://doi.org/10.1109/TMI.2020.3009002
De Oliveira H, Augusto V, Jouaneton B, Lamarsalle L, Prodel M, Xie X (2020) Automatic and explainable labeling of medical event logs with autoencoding. IEEE J Biomed Health Inform 24(11):3076–3084. https://doi.org/10.1109/JBHI.2020.3021790
Mengoudi K et al (2020) Augmenting dementia cognitive assessment with instruction-less eye-tracking tests. IEEE J Biomed Health Inform 24(11):3066–3075. https://doi.org/10.1109/JBHI.2020.3004686
Stojanovic J, Gligorijevic D, Radosavljevic V, Djuric N, Grbovic M, Obradovic Z (2017) Modeling healthcare quality via compact representations of electronic health records. IEEE/ACM Trans Comput Biol Bioinf 14(3):545–554
Brisimi TS, Xu T, Wang T, Dai W, Adams WG, Paschalidis IC (2018) Predicting chronic disease hospitalizations from electronic health records: an interpretable classification approach. Proc IEEE 106(4):690–707. https://doi.org/10.1109/JPROC.2017.2789319
Shickel B, Tighe PJ, Bihorac A, Rashidi P (2018) Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) Analysis. IEEE J Biomed Health Inform 22(5):1589–1604. https://doi.org/10.1109/JBHI.2017.2767063
de la Fuente C, Urrutia A, Chávez E (2019) Using the random forest algorithm for searching behavior patterns in electronic health records. IEEE Lat Am Trans 17(05):875–881. https://doi.org/10.1109/TLA.2019.8891957
Harerimana G, Kim JW, Yoo H, Jang B (2019) Deep learning for electronic health records analytics. IEEE Access 7:101245–101259. https://doi.org/10.1109/ACCESS.2019.2928363
Bernardini M, Romeo L, Misericordia P, Frontoni E (2020) Discovering the type 2 diabetes in electronic health records using the sparse balanced support vector machine. IEEE J Biomed Health Inform 24(1):235–246. https://doi.org/10.1109/JBHI.2019.2899218
Tsang G, Zhou S-M, Xie X (2021) Modeling large sparse data for feature selection: hospital admission predictions of the dementia patients using primary care electronic health records. IEEE J Transl Eng Health Med 9:1–13
Lee S, Wei S, White V, Bain PA, Baker C, Li J (2021) Classification of opioid usage through semi-supervised learning for total joint replacement patients. IEEE J Biomed Health Inform 25(1):189–200. https://doi.org/10.1109/JBHI.2020.2992973
Zebari DA, Zeebaree DQ, Abdulazeez AM, Haron H, Hamed HNA (2020) Improved threshold based and trainable fully automated segmentation for breast cancer boundary and pectoral muscle in mammogram images. IEEE Access 8:203097–203116. https://doi.org/10.1109/ACCESS.2020.3036072
Zech J, Pain M, Titano J, Badgeley M, Schefflein J, Su A, Costa A, Bederson J, Lehar J, Oermann EK (2018) Natural language–based machine learning models for the annotation of clinical radiology reports. Radiology 287(2):570–580
Jing B, Xie P, Xing E (2018) On the automatic generation of medical imaging reports. In: 56th annual meeting of the association for computational linguistics (ACL)
Li M et al (2021) Research on the auxiliary classification and diagnosis of lung cancer subtypes based on histopathological images. IEEE Access 9:53687–53707. https://doi.org/10.1109/ACCESS.2021.3071057
Umamaheswari D, Geetha S (2018) Segmentation and classification of acute lymphoblastic leukemia cells tooled with digital image processing and ML techniques. Second International Conference on Intelligent Computing and Control Systems (ICICCS) 2018:1336–1341. https://doi.org/10.1109/ICCONS.2018.8662950
Wang Y, Huang F, Zhang Y, Zhang R, Lei B, Wang T (2019) Breast cancer image classification via multi-level dual-network features and sparse multi-relation regularized learning. In: 2019 41st annual international conference of the IEEE engineering in medicine and biology society (EMBC), pp 7023–7026.
Abhinaav R, Brindha D (2019) Abnormality detection and severity classification of cells based on features extracted from papanicolaou smear images using machine learning. Int Conf Comput Commun Inform (ICCCI) 2019:1–5. https://doi.org/10.1109/ICCCI.2019.8822131
Bora AP, Joshi AD, Sawant ST (2020) Digitally reconstructed radiograph generation for enabling AI/ML in medical imaging. In: 2020 11th international conference on computing, communication and networking technologies (ICCCNT), pp 1–6.
Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N (2017) Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS ONE 12(4):e0174944
Fatima M, Pasha M (2017) Survey of machine learning algorithms for disease diagnostic. J Intell Learn Syst Appl 9(01):1
Zhao K, So H-C (2019) Drug repositioning for schizophrenia and depression/anxiety disorders: a machine learning approach leveraging expression data. IEEE J Biomed Health Inform 23(3):1304–1315. https://doi.org/10.1109/JBHI.2018.2856535
Jamshidi M et al (2020) Artificial intelligence and COVID-19: deep learning approaches for diagnosis and treatment. IEEE Access 8:109581–109595. https://doi.org/10.1109/ACCESS.2020.3001973
Li N et al (2019) Machine learning assessment for severity of liver fibrosis for chronic HBV based on physical layer with serum markers. IEEE Access 7:124351–124365. https://doi.org/10.1109/ACCESS.2019.2923688
Noaro G, Cappon G, Vettoretti M, Sparacino G, Favero SD, Facchinetti A (2021) Machine-learning based model to improve insulin bolus calculation in type 1 diabetes therapy. IEEE Trans Biomed Eng 68(1):247–255. https://doi.org/10.1109/TBME.2020.3004031
Yang S, Wei R, Guo J, Xu L (2017) Semantic inference on clinical documents: combining machine learning algorithms with an inference engine for effective clinical diagnosis and treatment. IEEE Access 5:3529–3546. https://doi.org/10.1109/ACCESS.2017.2672975
Chaitra N, Vijaya PA, Deshpande G (2020) Diagnostic prediction of autism spectrum disorder using complex network measures in a machine learning framework. Biomed Signal Process Control 62:102099
Saygılı A (2021) A new approach for computer-aided detection of coronavirus (COVID-19) from CT and X-ray images using machine learning methods. Appl Soft Comput 105:107323
Nagiub EM, Abdelsalam KF, Hussain NM, Omar QT, Ali CA (2018) Leukemia detection using microscopic blood image based machine learning “convolutional neural network”. Clin Lymphoma Myeloma Leuk 18(Suppl 1):S297. https://doi.org/10.1016/j.clml.2018.07.246
He J, Wu X, Jiang Y, Peng Q, Jain R (2018) Hookworm detection in wireless capsule endoscopy images with deep learning. IEEE Trans Image Process 27(5):2379–2392. https://doi.org/10.1109/TIP.2018.2801119
Yu Y, Wang J, Chun HE, Xu Y, Fong ELS, Wee A, Yu A (2021) Implementation of machine learning-aided imaging analytics for histopathological image diagnosis. In: Systems medicine. Academic Press, New York, pp 208–221
Suresh H (2017) Clinical event prediction and understanding with deep neural networks. Ph.D. dissertation, Massachusetts Institute of Technology
Qayyum A, Qadir J, Bilal M, Al-Fuqaha A (2020) Secure and robust machine learning for healthcare: a survey
Kim EY, Lee MY, Kim SH, Ha K, Kim KP, Ahn YM (2017) Diagnosis of major depressive disorder by combining multimodal information from heart rate dynamics and serum proteomics using machine-learning algorithm. Progr Neuro-Psychopharmacol Biol Psychiatry 76:65–71. https://doi.org/10.1016/j.pnpbp.2017.02.014
Pellegrini E, Ballerini L, Hernandez M, Chappell FM, González-Castro V, Anblagan D, Danso S, Muñoz-Maniega S, Job D, Pernet D, Mair G, MacGillivray TJ, Trucco E, Wardlaw JM (2018) Machine learning of neuroimaging for assisted diagnosis of cognitive impairment and dementia: a systematic review. Alzheimer’s Dementia Diagn Assess Dis Monit 10:519–535
Akbulut A, Ertugrul E, Topcu V (2018) Fetal health status prediction based on maternal clinical history using machine learning techniques. Comput Methods Programs Biomed 163:87–100
Karhade AV, Thio Q, Ogink P, Kim J, Lozano-Calderon S, Raskin K, Schwab JH (2018) Development of machine learning algorithms for prediction of 5-year spinal chordoma survival. World Neurosurgery 119:e842–e847
Abdar M, Wojciech Książek U, Acharya R, Tan R-S, Makarenkov V, Pławiak P (2019) A new machine learning technique for an accurate diagnosis of coronary artery disease. Comput Methods Programs Biomed 179:104992
Burdick H, Lam C, Mataraso S, Siefkas A, Braden G, Dellinger RP, McCoy A, Vincent JL, Green-Saxena A, Barnes G, Hoffman J, Calvert J, Pellegrini E, Das R (2020) Prediction of respiratory decompensation in Covid-19 patients using machine learning: the READY trial. Comput Biol Med 124:103949
Hashem S, ElHefnawi M, Habashy S, El-Adawy M, Esmat G, Elakel W, Abdelazziz AO, Nabeel MM, Abdelmaksoud AH, Elbaz TM, Shousha HI (2020) Machine learning prediction models for diagnosing hepatocellular carcinoma with HCV-related chronic liver disease. Comput Methods Programs Biomed 196:105551
Magesh PR, Myloth RD, Tom RJ (2020) An explainable machine learning model for early detection of Parkinson's disease using LIME on DaTSCAN imagery. Comput Biol Med 126:104041
Shen H, Hu Y, Liu X, Jiang Z, Ye H, Takshe A, Dulaimi SHKA (2021) Application of machine learning risk prediction mathematical model in the diagnosis of Escherichia coli infection in patients with septic shock by cardiovascular color Doppler ultrasound. Results Phys 26:104368
Montolío A, Martín-Gallego A, Cegoñino J, Orduna E, Vilades E, Garcia-Martin E, del Palomar AP (2021) Machine learning in diagnosis and disability prediction of multiple sclerosis using optical coherence tomography. Comput Biol Med 133:104416
Lin Y-W, Zhou Y, Faghri F, Shaw M, Campbell R (2019) Analysis and prediction of unplanned intensive care unit readmission using recurrent neural networks with long short-term memory. PLoS ONE 14:e0218942. https://doi.org/10.1371/journal.pone.0218942
Rau C-S, Kuo P-J, Chien P-C, Huang C-Y, Hsieh H-Y, Hsieh C-H (2018) Mortality prediction in patients with isolated moderate and severe traumatic brain injury using machine learning models. PLoS ONE 13(11):e0207192
Xie J, Wang Q (2020) Benchmarking machine learning algorithms on blood glucose prediction for type I diabetes in comparison with classical time-series models. IEEE Trans Biomed Eng 67(11):3101–3124. https://doi.org/10.1109/TBME.2020.2975959
Pezoulas VC, Papaloukas C, Veyssiere M, Goules A, Tzioufas AG, Soumelis V, Fotiadis DI (2021) A computational workflow for the detection of candidate diagnostic biomarkers of Kawasaki disease using time-series gene expression data. Comput Struct Biotechnol J 19:3058–3068
Nancy JY, Khanna NH, Kannan A (2010) A bio-statistical mining approach for classifying multivariate clinical time series data observed at irregular intervals. Expert Syst Appl 78
Froc E, Dubernard G, Bendifallah S, Hermouet E, Rubod-Dit-Guillet C, Canis M, Warembourg S, Golfier F, Fauconnier A, Roman H, Philip C-A (2021) Clinical characteristics of urinary tract endometriosis: a one-year national series of 232 patients from 31 endometriosis expert centers (by the FRIENDS group). Eur J Obst Gynecol Reprod Biol. https://doi.org/10.1016/j.ejogrb.2021.06.018.7
Wallace DS (2018) The role of speech recognition in clinical documentation. Nuance Communications. https://www.hisa.org.au/slides/hic18/wed/SimonWallace.pdf. Accessed 14 Dec 2019
Zamani NSM, Zaki WMDW, Huddin AB, Hussain A, Mutalib HA, Ali A (2020) Automated pterygium detection using deep neural network. IEEE Access 8:191659–191672. https://doi.org/10.1109/ACCESS.2020.3030787
Collins A, Yao Y (2018) Machine learning approaches: data integration for disease prediction and prognosis. In: Applied computational genomics. Springer, New York, pp 137–141.
Ke X, Zou J, Niu Y (2019) End-to-end automatic image annotation based on deep CNN and multi-label data augmentation. IEEE Trans Multimedia 21(8):2093–2106. https://doi.org/10.1109/TMM.2019.2895511
Davi C et al (2019) Severe dengue prognosis using human genome data and machine learning. IEEE Trans Biomed Eng 66(10):2861–2868. https://doi.org/10.1109/TBME.2019.2897285
Liu M, Zhang J, Lian C, Shen D (2020) Weakly supervised deep learning for brain disease prognosis using MRI and incomplete clinical scores. IEEE Trans Cybern 50(7):3381–3392. https://doi.org/10.1109/TCYB.2019.2904186
Fang G, Liu W, Wang L (2020) A machine learning approach to select features important to stroke prognosis. Comput Biol Chem 88:107316
Wang G, Zhang G, Choi K-S, Lam K-M, Lu J (2020) Output based transfer learning with least squares support vector machine and its application in bladder cancer prognosis. Neurocomputing 387:279–292
Cai W, Liu T, Xue X, Luo G, Wang X, Shen Y, Fang Q, Sheng J, Chen F, Liang T (2020) CT quantification and machine-learning models for assessment of disease severity and prognosis of COVID-19 patients. Acad Radiol 27(12):1665–1678
Zack CJ, Senecal C, Kinar Y, Metzger Y, Bar-Sinai Y, Widmer RJ, Lennon R, Singh M, Bell MR, Lerman A, Gulati R (2019) Leveraging machine learning techniques to forecast patient prognosis after percutaneous coronary intervention. JACC Cardiovasc Intervent 12(14):1304–1311
He Z-L, Zhou J-B, Liu Z-H, Dong S-Y, Zhang Y-T, Shen T, Zheng S-S, Xu X (2021) Application of machine learning models for predicting acute kidney injury following donation after cardiac death liver transplantation. Hepatob Pancr Dis Int 20(3):222–231
Ghadirzadeh A, Chen X, Yin W, Yi Z, Björkman M, Kragic D (2021) Human-centered collaborative robots with deep reinforcement learning. IEEE Robot Autom Lett 6(2):566–571. https://doi.org/10.1109/LRA.2020.3047730
Veltri P, Vizza P, Cristofaro M, Kallaverja E (2021) Clinical data annotation for parotid neoplasia management. In: 2021 IEEE 9th international conference on healthcare informatics (ICHI), pp 445–446
Finlayson SG, Bowers JD, Ito J, Zittrain JL, Beam AL, Kohane IS (2019) Adversarial attacks on medical machine learning. Science 363(6433):1287–1289
Alfeld S, Zhu X, Barford P (2016) Data poisoning attacks against autoregressive models. In: Thirtieth AAAI conference on artificial intelligence
Papernot N, McDaniel P, Sinha A, Wellman M (2016) Towards the science of security and privacy in machine learning. arXiv:1611.03814
Begoli E, Bhattacharya T, Kusnezov D (2019) The need for uncertainty quantification in machine-assisted medical decision making. Nat Mach Intell 1(1):20
Pollard TJ, Chen I, Wiens J, Horng S, Wong D, Ghassemi M, Mattie H, Lindmeer E, Panch T (2019) Turning the crank for machine learning: ease, at what expense? Lancet Digit Health 1(5):e198–e199
Sahi MA et al (2018) Privacy preservation in e-healthcare environments: state of the art and future directions. IEEE Access 6:464–478. https://doi.org/10.1109/ACCESS.2017.2767561
Al-Rubaie M, Chang JM (2019) Privacy-preserving machine learning: threats and solutions. IEEE Secur Priv 17(2):49–58
Zhang J, Bareinboim E (2018) Fairness in decision-making: the causal explanation formula. In: Thirty-second AAAI conference on artificial intelligence
Chen I, Johansson FD, Sontag D (2018) Why is my classifier discriminatory? In: Advances in neural information processing systems, pp 3539–3550.
Ghassemi M, Naumann T, Schulam P, Beam AL, Chen IY, Ranganath R (2019) Practical guidance on artificial intelligence for healthcare data. Lancet Digit Health 1(4):e157–e159
Panch T, Mattie H, Celi LA (2019) The inconvenient truth about AI in healthcare. Npj Digit Med 2(1):1–3
Perone CS, Ballester P, Barros RC, Cohen-Adad J (2019) Unsupervised domain adaptation for medical imaging segmentation with self-ensembling. Neuroimage 194:1–11
Schulam P, Saria S (2017) Reliable decision support using counterfactual models. In: Advances in neural information processing systems, pp 1697–1708
Qayyum A, Usama M, Qadir J, Al-Fuqaha A (2019) Securing connected & autonomous vehicles: challenges posed by adversarial machine learning and the way forward. arXiv:1905.12762
Latif S, Qayyum A, Usama M, Qadir J, Zwitter A, Shahzad M (2019) Caveat emptor: the risks of using big data for human development. IEEE Technol Soc Mag 38(3):82–90
Cite this article
Swain, S., Bhushan, B., Dhiman, G. et al. Appositeness of Optimized and Reliable Machine Learning for Healthcare: A Survey. Arch Computat Methods Eng 29, 3981–4003 (2022). https://doi.org/10.1007/s11831-022-09733-8