Introduction

The human circulatory system consists of the heart, which contracts and expands to pump oxygenated blood into the vascular system, receives deoxygenated blood from the various parts of the body and routes it to the lungs [1], where it is oxygenated. During normal operation the sinus node acts as the source of the impulse signal in the heart. Under abnormal conditions the sinus node fails to function and other sources of impulse generation take over, a condition called arrhythmia. Arrhythmia, a change in the rhythm of the heart, is one of the most common cardiac abnormalities. If it is identified early, the disease can be controlled well and better healthcare can be provided to the patient; on the other hand, some arrhythmias are life-threatening and may lead to fatal abnormalities such as ventricular tachycardia followed by fibrillation, which, without timely therapeutic intervention, may take the life of the patient.

Due to its increased incidence, arrhythmia has drawn attention worldwide, including in the developing countries. Detecting arrhythmia at an early stage is therefore necessary to provide quality healthcare and prognostic diagnosis.

The electrocardiogram (ECG) is a popular noninvasive tool for the diagnosis of heart related diseases; it is the electrical activity captured at the body surface, representative of the functional dynamics of the heart [2]. This activity is nearly repetitive, with a normal pattern of amplitudes, intervals and morphology corresponding to the normal functioning of the heart. The ECG of a normal heart is called normal sinus rhythm. When the heart departs from its normal function, the characteristics of the ECG also change. The prognostic information carried by the ECG therefore has to be utilized properly for better diagnostics.

Computational tools and algorithms can also be used for automated diagnosis and early detection of heart related abnormalities. In this direction, an attempt is made here at automated classification of ECG into normal sinus rhythm and arrhythmia classes.

Different techniques have been used to extract the R point of the ECG. Recently the quadratic spline wavelet has been used for ECG delineation [3, 4], where individual waves and amplitudes are extracted; the traditional Pan and Tompkins algorithm [5] is also computationally effective. Other extracted features include time frequency features [6, 7] and higher order cumulants [8]. Among the classification methods used are discriminant analysis [9], error backpropagation neural networks [10], self organizing feature maps [11], probabilistic neural networks [12], support vector machines [13], and independent component analysis with Gaussian mixture models [14].

There are automated methods for classification of arrhythmic beats in the ECG using different techniques [15, 16], but most of them compress time domain features using Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), etc. [14]. In the proposed method, features are compressed by applying PCA to Discrete Wavelet Transform (DWT) sub bands. Since the DWT gives a sparse representation of the data, PCA (or any other feature compression technique) should condense the features better than its time domain counterpart.

In the proposed methodology the DWT sub band principal components are classified using three different algorithms: the Gaussian mixture model (GMM), the error back propagation neural network (EBPNN) and the support vector machine (SVM). The results are presented, compared and discussed.

Materials

In this work we have used the open source MIT BIH arrhythmia database and MIT BIH normal sinus rhythm database from www.physionet.org, described below.

MIT BIH arrhythmia database

It consists of 48 half hour excerpts of two channel ambulatory ECG obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty three recordings were taken at random from a set of 4,000 twenty four hour ambulatory ECG recordings collected from a mixed population of inpatients (approximately 60%) and outpatients (approximately 40%) at Boston’s Beth Israel Hospital. The remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias. The recordings are sampled at 360 Hz per channel with 11 bit resolution over a 10 mV range.

MIT BIH normal sinus rhythm database

It consists of 18 long term ECG recordings of subjects referred to the Arrhythmia Laboratory at Boston’s Beth Israel Hospital. Subjects included in this database were found to have had no significant arrhythmias; they include 5 men, aged 26 to 45, and 13 women, aged 20 to 50. The recordings are digitized at 128 Hz.

Methodology

The proposed methodology consists of four steps: preprocessing, registration, feature extraction and classification of the ECG signal. We have employed three different classifiers: the Gaussian mixture model (GMM), the error back propagation neural network (EBPNN) [17] and the support vector machine (SVM) [18]. Figure 1 shows the block diagram of the proposed system.

Fig. 1 Block diagram of the proposed scheme of classification

In the preprocessing stage re-sampling, noise filtering and baseline drift removal are performed. The preprocessed ECG signal is then subjected to R point detection. Around each detected R point a 200 sample window is segmented, with the R point at its 100th sample. On each 200 sample window the DWT is applied, and four sub bands are subjected to PCA. After PCA, the number of principal components (PCs) retained from each sub band is chosen such that they contain 98% of the data variability. These PCs are statistically validated against the PCs of the 200 sample time domain windowed signal. The significant features are then classified into normal sinus rhythm and arrhythmia using three different classifiers, viz. GMM, EBPNN and SVM, and the results are tabulated and compared.

Preprocessing

The two datasets are sampled at different rates, so a common sampling rate of 250 Hz is chosen and the signals are re-sampled using standard re-sampling techniques [19]. The ECG signal also suffers from baseline drift, which is corrected by standard filtering techniques [20].
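
The sketch below illustrates this stage; it is a minimal example, assuming a raw record x at its native rate, with rational-rate resampling and a median-filter baseline estimator (one common choice, not necessarily the filter of [19, 20]).

```python
# Minimal preprocessing sketch: resample to the common 250 Hz rate and
# remove baseline wander. The median-filter baseline estimator and the
# 0.6 s window are assumptions, not necessarily the filters of [19, 20].
import numpy as np
from fractions import Fraction
from scipy.signal import resample_poly, medfilt

def preprocess(x, fs_in, fs_out=250):
    frac = Fraction(fs_out, fs_in)            # e.g. 250/360 reduces to 25/36
    x = resample_poly(x, frac.numerator, frac.denominator)
    win = int(0.6 * fs_out) | 1               # odd-length ~0.6 s window
    baseline = medfilt(x, kernel_size=win)    # slow-varying drift estimate
    return x - baseline
```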

R point detection and segmentation

The R point is the characteristic point of the ECG signal that can be used for registration before feature extraction and subsequent classification. Many methods are available in the literature for registration of the ECG; the Pan and Tompkins approach [5] is the simplest, whereas the wavelet based technique using the quadratic spline wavelet [3] is computationally expensive. In our analysis the Pan Tompkins algorithm has been used for detecting the R point for ECG registration [14, 25]. It uses derivative, smoothing, summing and threshold operators in its architecture: the derivative gives slope information and the smoothing operation removes high frequency noise. Finally, the group delay incurred by all the filters involved is compensated by advancing in time, and the middle point of each rectangular pulse after the threshold operation is taken as the R point.

If n is the number of points in the ECG data, the first derivative is computed in O(n) operations, and rectification likewise takes O(n) operations. Similarly, smoothing, second derivative computation, rectification, smoothing, summing and thresholding each require O(n) operations. Hence the total complexity of the Pan Tompkins algorithm is O(dkn), where d is the number of blocks, k is the order of the individual filters (here 1) and n is the number of points in the ECG signal. When d and k are very small compared with n, the total complexity is O(n), i.e., linear.
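
A simplified sketch of this pipeline follows; the 150 ms integration window and the fixed threshold at half the maximum are assumptions of the sketch, not the exact parameters of [5].

```python
# Simplified Pan-Tompkins style R point detector following the block
# structure above: derivative, rectification, smoothing (moving-window
# integration) and thresholding; each block is a single O(n) pass.
import numpy as np

def detect_r_points(x, fs=250):
    dx = np.diff(x, prepend=x[0])              # derivative: slope information
    sq = dx ** 2                               # rectification
    win = int(0.15 * fs)                       # ~150 ms smoothing window
    energy = np.convolve(sq, np.ones(win) / win, mode="same")
    mask = (energy > 0.5 * energy.max()).astype(int)
    edges = np.diff(mask)                      # pulse boundaries
    starts = np.where(edges == 1)[0]
    ends = np.where(edges == -1)[0]
    # middle point of each rectangular pulse is taken as the R point
    return [(s + e) // 2 for s, e in zip(starts, ends)]
```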

Based on the detected R point, 100 points to the right of the R point and 99 points before it are taken together with the R point itself, giving a 200 point window used for the subsequent discrete wavelet transform (DWT) [21] computation and pattern classification.
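
In code this segmentation is a simple slice; the sketch below (with hypothetical helper names) keeps only beats whose window lies fully inside the record.

```python
# Segmentation sketch: 99 samples before the R point, the R sample itself
# and 100 samples after, i.e. a 200-sample beat with R at its 100th
# sample (index 99).
def segment_beats(x, r_points, before=99, after=100):
    beats = []
    for r in r_points:
        if r - before >= 0 and r + after + 1 <= len(x):
            beats.append(x[r - before : r + after + 1])   # length 200
    return beats
```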

Discrete Wavelet Transform (DWT) computation

The Fourier transform [20] is a very good tool for analyzing the global frequency content of a signal; it has good frequency resolution but no time resolution. Time resolution may be gained at the expense of reduced frequency resolution, and this intuition is incorporated in the wavelet transform. It decomposes a signal into time-frequency components and provides multi resolution analysis (MRA) [21], with a different resolution at each level of decomposition. It can therefore discriminate two signals having the same frequency components occurring at different times. The wavelet transform consists of translates and dilates of a basis function called the mother wavelet; such a basis function at scale a and translation b is given by the wavelet equation

$$ {\psi_{a,b}}(t) = \frac{1}{{\sqrt {a} }}\psi \left( {\frac{{t - b}}{a}} \right) $$
(1)

When the scale and translation variables are sampled on a dyadic grid, it results in discrete wavelet transform (DWT), whose wavelet equation is given by

$$ {\psi_{m,n}}(t) = {2^{ - m/2}}\psi \left( {{2^{ - m}}t - n} \right) $$
(2)

The wavelet function is associated with the higher frequency components of the signal, called the detail. Using the family of translates and dilates of the mother wavelet, the detail coefficients of the DWT of a signal x(t) can be expressed as

$$ {T_{m,n}} = \int\limits_{ - \infty }^\infty {x(t){\psi_{m,n}}} (t)dt $$
(3)

i.e., the detail coefficients of a signal are given by the inner product between the signal and the wavelet function at different m and n.

In a manner similar to the detail, the low frequency components of a signal are associated with another basis function, orthogonal to the corresponding wavelet function, called the scaling function. Using the scaling function, the approximation coefficients of the signal are given by

$$ {S_{m,n}} = \int\limits_{ - \infty }^{ + \infty } {x(t){\phi_{m,n}}(t)dt} $$
(4)

where

$$ {\phi_{m,n}}(t) = {2^{ - m/2}}\phi \left( {{2^{ - m}}t - n} \right)\quad {\hbox{and}}\quad \int\limits_{ - \infty }^{ + \infty } {{\phi_{0,0}}(t)dt = 1} $$
(5)

There are many wavelet families; among them, Daubechies 4 (db-4) is used in our study. The ECG signal is decomposed in time frequency using the db-4 wavelet to a depth of 4 levels. The frequency content of each sub band is shown in Fig. 2. At the first level of decomposition the approximation signal contains frequencies up to 62.5 Hz, whereas the detail signal contains frequencies from 62.5 Hz to 125 Hz, and so on for the other levels, as shown in Fig. 2.
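
These band edges follow directly from halving the 125 Hz Nyquist band at each level; the short computation below, a sketch assuming the common 250 Hz sampling rate, reproduces them.

```python
# Sub band frequency edges for a 4-level dyadic decomposition at fs = 250 Hz:
# d1: 62.5-125 Hz, d2: 31.25-62.5 Hz, d3: 15.625-31.25 Hz,
# d4: 7.8125-15.625 Hz and a4: 0-7.8125 Hz.
fs = 250.0
for level in range(1, 5):
    hi = fs / 2 ** level
    print(f"d{level}: {hi / 2:g}-{hi:g} Hz")
print(f"a4: 0-{fs / 2 ** 5:g} Hz")
```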

Fig. 2 Wavelet decomposition

The computational complexity of the DWT using Mallat’s algorithm is O(2dnk), where d is the depth of decomposition, n is the number of points in the ECG data and k is the number of coefficients of the approximation or detail filters. In our case n = 200, d = 4 and k = 8.
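
With PyWavelets the decomposition of one segmented beat is a single call, sketched below under the assumption that beat is a 200-sample window from the segmentation step.

```python
# 4-level db4 decomposition of one 200-sample beat using Mallat's
# algorithm; wavedec returns coefficients ordered coarse to fine.
import pywt

coeffs = pywt.wavedec(beat, "db4", level=4)   # [a4, d4, d3, d2, d1]
a4, d4, d3, d2, d1 = coeffs
subbands = [a4, d4, d3, d2]                   # the four bands kept for PCA
```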

To obtain a simple and effective feature vector, the reduced principal components of the wavelet sub bands a4, d4, d3 and d2 are used as features for the statistical test of significance and, subsequently, for classification by the three classifiers independently.

Principal Component Analysis (PCA) on wavelet sub band features

To compress the features of the ECG sub bands, PCA [22] is applied to each sub band of interest. We have considered four sub bands, a4, d4, d3 and d2, as most of the signal frequencies of interest lie in these bands. PCA is an orthogonal transformation that projects the sub band coefficients onto the directions of maximum variance. The procedure consists of computing the data covariance matrix after subtracting the mean of the features,

$$ \Sigma = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {{d_i} - \bar{d}} \right){{\left( {{d_i} - \bar{d}} \right)}^T}} $$
(6)

where \( \bar{d} \) is the feature mean vector and N is the number of samples (observations) considered, including both arrhythmia and normal sinus rhythm signals. In this study we have included 548 observations, 274 from each class.

The covariance matrix is positive definite. The eigenvalues and eigenvectors of Σ are computed, and the eigenvectors are sorted in descending order of their eigenvalues. Finally the data are projected onto the eigenvector directions to obtain the principal components (PCs). The first few PCs contain most of the energy of the signal and are chosen such that 99% of the data energy is contained in them.

The PCA routine uses the singular value decomposition (SVD) to find the eigenvalues and eigenvectors of the covariance matrix. The SVD can be performed in \( O(mn^2) \) floating point operations, where m is the total number of patterns and n is the dimension of each pattern.

We have chosen 13 PCs based on the 98% data variability condition, and these PCs of the ECG sub band coefficients are used for the statistical significance analysis and subsequent classification.
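
A minimal sketch of this step follows, assuming each sub band's coefficients for all 548 beats are stacked row-wise in a matrix; scikit-learn's PCA accepts the variance fraction directly and uses SVD internally, as described above.

```python
# Sub band PCA sketch: keep the leading principal components that
# explain 98% of the variance of one sub band's coefficient matrix X
# of shape (n_beats, n_coeffs).
from sklearn.decomposition import PCA

def subband_pcs(X, variability=0.98):
    pca = PCA(n_components=variability)   # fraction of variance to retain
    pcs = pca.fit_transform(X)            # projected principal components
    return pcs, pca.explained_variance_ratio_
```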

Statistical test

A feature extraction scheme in a compactly supported wavelet basis space combined with principal component analysis is proposed. Theoretically, the wavelet features after PCA should give a better and more compact representation than the time domain PCA features. Both the time domain and wavelet domain features are subjected to the F test [23] and the independent sample t-test [24] for the statistical significance study.

The independent sample t-test is performed to assess the significance of each PC in discriminating the two groups by their means. As another discriminating criterion, the ratio of between group variance to within group variance, leading to the F-test, is used to cross validate the PCs in the time domain and the DWT sub band domain.
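
Both tests are available in SciPy; the sketch below applies them per PC, assuming hypothetical matrices pcs_normal and pcs_arrhythmia holding the PC features of the two groups.

```python
# Per-PC significance sketch: independent-samples t-test on the group
# means and a one-way F-test on the between/within group variance ratio.
from scipy.stats import ttest_ind, f_oneway

for j in range(pcs_normal.shape[1]):
    t, p_t = ttest_ind(pcs_normal[:, j], pcs_arrhythmia[:, j])
    F, p_f = f_oneway(pcs_normal[:, j], pcs_arrhythmia[:, j])
    print(f"PC{j + 1}: t={t:.2f} (p={p_t:.3g}), F={F:.2f} (p={p_f:.3g})")
```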

Classification

Using the PCA features of the ECG sub bands, a two class pattern classification problem is formulated. In this paper we consider three pattern classifiers, viz., GMM, EBPNN and SVM.

GMM

Here we have a binary problem of classifying normal sinus rhythm and arrhythmia signals. The Gaussian mixture model assumes that the features are drawn from normal distributions, with two mixing components corresponding to the normal and arrhythmia classes respectively. We therefore have two class conditional densities \( p\left( {{x_n}|{\omega_k}} \right) \), \( 1 \leqslant k \leqslant 2 \), and corresponding class prior probabilities \( p({\omega_k}) \), \( 1 \leqslant k \leqslant 2 \). Each of the two mixing components has a mean vector and a covariance matrix. Since we have applied an orthogonal transformation in a compactly supported basis, the data are highly uncorrelated and the off diagonal elements of the covariance matrix are approximately zero. The probability density function of such a model is given by

$$ p\left( {{x_n}|{\omega_k}} \right) = \frac{1}{{{{\left( {2\pi } \right)}^{d/2}}{{\left| {{\Sigma_k}} \right|}^{1/2}}}}\exp \left\{ { - \frac{1}{2}{{\left( {{x_n} - {{\bar{x}}_k}} \right)}^T}\Sigma_k^{ - 1}\left( {{x_n} - {{\bar{x}}_k}} \right)} \right\} $$
(7)

where

$$ {\bar{x}_k} = \frac{1}{{\left| {{X_k}} \right|}}\sum\limits_{{x_n} \in {\omega_k}} {{x_n}} $$
(8)

and

$$ {\Sigma_k} = \frac{1}{{\left| {{X_k}} \right|}}\sum\limits_{{x_n} \in {\omega_k}} {\left( {{x_n} - {{\bar{x}}_k}} \right)} {\left( {{x_n} - {{\bar{x}}_k}} \right)^T} = diag\left( {{\sigma_i}^2} \right),1 \leqslant i \leqslant d $$
(9)

The corresponding posterior probabilities are given by Bayes’ rule as follows.

$$ P\left( {{\omega_k}|{x_i}} \right) = \frac{{p\left( {{x_i}|{\omega_k}} \right)p\left( {{\omega_k}} \right)}}{{\sum\limits_{k' = 1}^2 {p\left( {{\omega_{k'}}} \right)p\left( {{x_i}|{\omega_{k'}}} \right)} }} $$
(10)

Since our data contain missing observations and do not represent the whole sample space, the mean vectors and covariance matrices computed directly are not the true ones. The means and variances are therefore re-estimated with the Expectation Maximization (EM) algorithm using maximum likelihood estimation. The re-estimation formulae are as follows:

$$ {\hat{\mu }_j} = \frac{{\sum\limits_{i = 1}^N {{x_i}P\left( {{\omega_j}\left| {{x_i}} \right.} \right)} }}{{\sum\limits_{i = 1}^N {P\left( {{\omega_j}\left| {{x_i}} \right.} \right)} }} $$
(11)
$$ \hat{\sigma }_j^2 = \frac{{\sum\limits_{i = 1}^N {{{\left( {{x_i} - {{\hat{\mu }}_j}} \right)}^2}P\left( {{\omega_j}|{x_i}} \right)} }}{{\sum\limits_{i = 1}^N {P\left( {{\omega_j}|{x_i}} \right)} }} $$
(12)
$$ \hat{p}\left( {{\omega_j}} \right) = \frac{1}{N}\sum\limits_{i = 1}^N {P\left( {{\omega_j}|{x_i}} \right)} $$
(13)

The initial prior probability is taken to be 0.5 for each class, and an initial model is assumed from the data. The EM algorithm has two core steps, the E step and the M step. In the E step the class conditional density is computed according to Eq. 7, and from it the posterior density according to Eq. 10. In the M step the class model is re-estimated according to Eqs. 11 through 13. The process continues until the new estimate changes little from the previous one and the model stabilizes; the EM based GMM is then said to have converged. The logarithm of the class conditional density, called the log-likelihood, is computed at each iteration and stops increasing at convergence.

GMM training is an optimization problem which maximizes the following objective function.

$$ J = \prod\limits_n {\sum\limits_k {p\left( {{\omega_k}} \right)p\left( {{x_n}\left| {{\omega_k}} \right.} \right)} } $$
(14)

The converged centroids are such that the product over all observations of the class conditional densities, weighted by the respective prior probabilities, is maximized. Each new EM estimate moves toward the optimum of this objective function, so the algorithm converges.

GMM is an iterative algorithm that can be performed in O(ndkT) floating point operations, where n is the number of patterns, d is the number of features per pattern, k is the number of classes in the data, and T is the number of iterations required for convergence.
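
A compact sketch of this classifier using scikit-learn's EM implementation is given below; the diagonal covariance and equal initial priors follow the description above, while the majority-vote mapping of mixture components to class labels, and the names X_train, y_train, X_test, are assumptions of the sketch.

```python
# Two-component diagonal-covariance GMM trained by EM; components are
# mapped to class labels by majority vote on the training data.
import numpy as np
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      weights_init=[0.5, 0.5], max_iter=100)
gmm.fit(X_train)                              # E and M steps until converged
comp = gmm.predict(X_train)                   # mixture component per pattern
label_of = {c: np.bincount(y_train[comp == c]).argmax() for c in (0, 1)}
y_pred = np.array([label_of[c] for c in gmm.predict(X_test)])
```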

Error back propagation neural network

We use an error back propagation neural network [17] consisting of an input layer of 13 neurons, one hidden layer of 6 neurons and an output layer of 2 neurons. The network is trained to adapt its weights so as to classify the data into two classes. This is also an optimization problem, in which the following objective function is minimized.

$$ J(w) = \frac{1}{2}\sum\limits_{n = 1}^N {\sum\limits_{k = 1}^K {{{\left\{ {{y_k}\left( {{x_n};w} \right) - t_k^n} \right\}}^2}} } $$
(15)

where \( y_k(x_n; w) \) is the network response of the kth output neuron, \( t_k^n \) is the target for the kth class of the nth observation feature vector, and K is the number of output neurons (here two).

There is a bias term in the input layer. The thirteen PCs of the ECG sub band features, along with the bias term, are fed to the input layer for the training set, and the weights are adapted by gradient descent until the total mean square error falls below a threshold. The testing data are then fed through the network with the learned weights, and the output classifies each pattern into one of the two classes.

With \( n \) inputs at the input layer, \( n_2 \) neurons at the second layer, \( n_3 \) at the third, and so on down to \( k \) at the output layer, a network that takes T iterations to converge has total complexity \( O(n{n_2}{n_3} \cdots kT) \). In our case there are 13 neurons in the input layer, 6 in the hidden layer and 2 in the output layer, and the algorithm converges in 19 epochs; hence \( n = 13 \), \( n_2 = 6 \), \( k = 2 \) and \( T = 19 \).
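
A sketch of the 13-6-2 network with scikit-learn's MLPClassifier follows; sklearn sizes the output layer from the class labels, so only the hidden layer is specified, and the logistic activation, learning rate and plain SGD solver are assumptions approximating the gradient descent described above.

```python
# 13-input, 6-hidden, 2-output backpropagation network sketch; training
# stops once the loss improvement falls below sklearn's tolerance.
from sklearn.neural_network import MLPClassifier

net = MLPClassifier(hidden_layer_sizes=(6,), activation="logistic",
                    solver="sgd", learning_rate_init=0.1, max_iter=500)
net.fit(X_train, y_train)            # backpropagation weight adaptation
accuracy = net.score(X_test, y_test)
```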

Support vector machine classifier

The SVM is a single layer, highly non-linear network which optimizes the class separation boundary (the discriminant hyperplane) so that the distance from the features of each class to the hyperplane is maximized simultaneously for all classes. Being a supervised classifier it has generalization ability, by which it can classify unseen data. Suppose \( \left( {{x_i},{y_i}} \right),i = 1, \ldots, N \) is the data set, where \( x_i \) is the ith feature point and \( y_i \) its class label. For the binary classification problem, let \( c_{+} \) and \( c_{-} \) be the centroids of the two classes; the classifier response is

$$ {y_i} = {\rm sgn} \left( {\left( {x - c} \right) \cdot w} \right) = {\rm sgn} \left( {\left( {x \cdot {c_{+}}} \right) - \left( {x \cdot {c_{-}}} \right) + b} \right) $$
(16)

where

$$ b = \frac{1}{2}\left( {{{\left\| {{c_{-} }} \right\|}^2} - {{\left\| {{c_{+} }} \right\|}^2}} \right) $$
(17)

The optimal hyperplane separating the two classes is obtained by solving

$$ \mathop {\min }\limits_{w,b} \;\frac{1}{2}{\left\| w \right\|^2} $$
(18)

such that

$$ {y_i}\left( {\left( {w \cdot {x_i}} \right) + b} \right) \geqslant 1,\quad i = 1, \ldots, N $$
(19)

The Lagrangian dual of Eq. 18 is a quadratic programming problem whose solution gives the optimal hyperplane separating the two classes. The quadratic programming routine has polynomial complexity.
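
In practice the dual QP is solved by a library routine; the sketch below uses scikit-learn's SVC on the 13 sub band PCs, where the RBF kernel and C = 1 are assumptions, since the kernel is not specified at this point in the text.

```python
# SVM sketch: SVC solves the dual quadratic program internally and
# returns the maximum-margin separating hyperplane.
from sklearn.svm import SVC

svm = SVC(kernel="rbf", C=1.0)
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)
```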

Results and discussion

The proposed methodology has been applied to ECG from the MIT BIH arrhythmia and normal sinus rhythm databases (described in “Materials”) to form a two class problem. The R point of the ECG is detected with the extended Pan Tompkins algorithm, which detects almost all peaks with a good detection rate; the detection for a normal sinus rhythm signal is shown in Fig. 3. This algorithm was chosen for its simplicity and real time implementation ability, even though more accurate delineation algorithms [3, 7] are available in the literature.

Fig. 3 Detection of the R point in a normal sinus rhythm signal; the detected R point is shown in red

After detecting the R point, 99 points to the left of the R point and 100 points to its right are taken along with the R point itself, giving a 200 sample window containing a single QRS complex. Each such window, belonging to either the arrhythmia or the normal sinus rhythm class, constitutes one pattern for the subsequent dimensionality reduction and classification.

The power spectra of a normal sinus rhythm signal and an arrhythmia signal are shown in Fig. 4. It is observed from Figs. 4 and 2 that most of the ECG signal frequency lies in four sub bands: approximation 4, detail 4, detail 3 and detail 2. Hence these four sub bands are used for the subsequent sub band PCA and classification.

Fig. 4 Power spectra of (a) a normal sinus rhythm signal and (b) an arrhythmia signal

The wavelet decomposition at the different levels is depicted in Fig. 5 for a normal sinus rhythm signal and in Fig. 6 for an arrhythmia signal; the four sub band signals are shown in each case.

Fig. 5 Normal sinus rhythm signal and its DWT: (a) ECG signal, (b) 2nd level detail, (c) 3rd level detail, (d) 4th level detail, (e) 4th level approximation

Fig. 6 Arrhythmia signal and its DWT: (a) ECG signal, (b) 2nd level detail, (c) 3rd level detail, (d) 4th level detail, (e) 4th level approximation

If the time domain segmented ECG signal after R point detection is used directly for PCA, the energy profile of the principal components is as shown in Fig. 7: the first 13 principal components contain 99.97% of the signal energy.

Fig. 7 Energy profile of the PCs of the time domain ECG signal

The ECG sub bands are subjected to PCA and the energy profiles of their PCs are depicted in Fig. 8. The number of PCs retained and the energy contained in the PC directions are given in Table 1.

Fig. 8 Energy profiles of the PCs of the ECG sub bands: (a) 2nd level detail, (b) 3rd level detail, (c) 4th level detail, (d) 4th level approximation

Table 1 Sub band PCA

The segmented ECG around the detected R point is decomposed using the DWT, and PCA is applied to the resulting time frequency sub bands. The convention in classification is to apply PCA to the time domain signal in order to compress it; here PCA is instead applied to the wavelet sub bands, which yields better discrimination between the normal sinus rhythm and arrhythmia class features. This is substantiated by the independent t test on these features given in Table 2, which lists the results for both the time domain PC features and the DWT sub band PC features. Table 2 shows that for the DWT based PC features the p-values are vanishingly small (reported as 0.000), whereas for the time domain PC features only the first three are significant, as seen from the corresponding p-values.

Table 2 Statistical significance test for time domain and DWT domain features

The Gaussian mixture model is trained and the log likelihood profile over the iterations is shown in Fig. 9; the GMM converges in 8 iterations.

Fig. 9 Log-likelihood of the GMM classifier increasing and stabilizing at convergence

The mean square error reduction during training of the error back propagation neural network is depicted in Fig. 10; the network converges in 19 epochs. The SVM classification is shown in Fig. 11.

Fig. 10 Training of the neural network: MSE reduction

Fig. 11 SVM classification

The sensitivity, specificity and average accuracy of the three classifiers, GMM, EBPNN and SVM, are shown in Table 3.

Table 3 Classification performance

Conclusion

In this study, a systematic approach has been developed to screen arrhythmia patients from normal subjects based on the ECG signal in the time frequency sub band domain. Fewer principal components are needed because of the sparser representation of the features in the time frequency domain. Because of the non-linearity of the ECG signal, we have employed classifiers with non-linear transformations: in GMM, EBPNN and SVM these are incorporated in the EM algorithm, the hidden layer and the kernel transformation respectively.

From the results it can be observed that the proposed approach provides overall accuracies of 87.36%, 93.41% and 95.60% for the GMM, EBPNN and SVM classifiers respectively. As future work, features extracted with other wavelet families can be used in our methodology for training and testing; the statistical validation is to be verified correspondingly, and the classifiers employed for final screening with higher accuracy.