Abstract
In this paper, we propose an extension of the classical Conditionally Gaussian Observed Markov Switching Model (CGOMSM) that incorporates fuzzy switches. The proposed approach models transient switches and avoids the discontinuity inherent in switching regime models by using fuzzy switches instead of hard jumps. The fuzzy-switch approach is better suited to real-world applications in which regime continuity is an intrinsic property. To define an efficient scheme for exact smoothing in the resulting model (CGOMFSM), we adapt fast smoothing equations to the fuzzy setting. Finally, we illustrate the benefit of the fuzzy switching model through several experiments.
1 Introduction
Let \(\varvec{\mathrm {X}} _1^N=\{X_{1},\dots , X_{N}\}\), \(\varvec{\mathrm {Y}} _1^N=\{Y_{1},\dots , Y_{N}\}\) and \(\varvec{\mathrm {R}} _1^N=\{R_{1},\dots , R_{N}\}\) be three random sequences taking values in \(\mathbb {R}^{m}\), \(\mathbb {R}^{q}\) and \(\varOmega =\{1,\dots ,K\}\) respectively. Let \(\varvec{\mathrm {X}} _1^N\) be a hidden process and \(\varvec{\mathrm {Y}} _1^N\) be an observed process. We consider a switching regime model represented by the sequence of switches \(\varvec{\mathrm {R}} _1^N\). We address the smoothing problem, which consists in a recursive search for the unobserved process \(\varvec{\mathrm {X}} _1^N\) and the sequence of switches \(\varvec{\mathrm {R}} _1^N\), knowing only the observed sequence \(\varvec{\mathrm {Y}} _1^N\). Fast Bayesian processing can be carried out by assuming that the distribution of \((\varvec{\mathrm {X}} _1^N, \varvec{\mathrm {Y}} _1^N)\) follows a hidden Gaussian Markov model. Non-linearity can be modeled by a switching regime system; the idea is then to approximate a non-linear non-Gaussian system by a regime-switching Gaussian system. Switching models that admit fast exact filtering have recently been proposed. These models, called “conditionally Markov switching hidden linear models” (CMSHLM) [16], include conditionally Gaussian observed Markov switching models (CGOMSM), defined as follows:
- \(T_1^N=(X_1^N, R_1^N,Y_1^N)\) is a Markov chain;
- \(p(r_{n+1}|x_n,r_n,y_n)=p(r_{n+1}|r_n)\) and
$$\begin{aligned}&\left[ \begin{array}{c} X_{n+1}\\ Y_{n+1} \end{array} \right] =\left[ \begin{array}{cc} A_{n+1}^{xx}(R_{n}^{n+1})&{}A_{n+1}^{xy}(R_{n}^{n+1})\\ 0&{}A_{n+1}^{yy}(R_{n}^{n+1}) \end{array}\right] \left[ \begin{array}{c} X_{n}\\ Y_{n} \end{array} \right] \\&\quad \quad \quad \quad + \left[ \begin{array}{cc} B_{n+1}^{xx}(R_{n}^{n+1})&{}B_{n+1}^{xy}(R_{n}^{n+1})\\ B_{n+1}^{yx}(R_{n}^{n+1})&{}B_{n+1}^{yy}(R_{n}^{n+1}) \end{array}\right] \left[ \begin{array}{c} U_{n+1}\\ V_{n+1} \end{array} \right] +\left[ \begin{array}{c} N^X(R_{n}^{n+1})\\ N^Y(R_{n}^{n+1}) \end{array} \right] , \end{aligned}$$
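with
$$\begin{aligned} N^X(R_{n}^{n+1})&=M^X(R_{n+1})-A_{n+1}^{xx}(R_{n}^{n+1})M^X(R_{n})-A_{n+1}^{xy}(R_{n}^{n+1})M^Y(R_{n}),\\ N^Y(R_{n}^{n+1})&=M^Y(R_{n+1})-A_{n+1}^{yy}(R_{n}^{n+1})M^Y(R_{n}), \end{aligned}$$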
where \(U_1^N\) and \(V_1^N\) are two Gaussian unit-variance white noise vectors, and \(M^X(R_{n})\) and \(M^Y(R_{n})\) are the respective means of \(\varvec{\mathrm {X}} _1^N\) and \(\varvec{\mathrm {Y}} _1^N\) in each state (independently of n). The CGOMSM is then defined by the matrices \(A(\varvec{\mathrm {R}} _n^{n+1})\) and \(B(\varvec{\mathrm {R}} _n^{n+1})\), the transition matrix t such that \(t(i,j)=p(r_{n+1}=j|r_n=i)\), and the mean vectors \(M(R_{n})=[M^X(R_{n}); M^Y(R_{n})]\). Figure 1 depicts the dependence graph of the CGOMSM.
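As an illustration of the definition above, the following minimal sketch simulates a scalar (\(m=q=1\)) CGOMSM with hard switches. The function name `simulate_cgomsm`, the numerical parameter values, the uniform initial switch and the unit-variance initial state are all assumptions made for the example; the offset is chosen as \(M(R_{n+1})-A(R_n^{n+1})M(R_n)\), which is the choice that keeps the regime means equal to \(M\).

```python
import numpy as np

def simulate_cgomsm(n_steps, trans, A, B, M, seed=0):
    """Simulate (X, R, Y) from a scalar hard-switch CGOMSM.

    trans[i, j] = p(r_{n+1}=j | r_n=i); A[i, j] and B[i, j] are the 2x2
    matrices for the pair (r_n, r_{n+1}) = (i, j); M[i] = [M^X(i), M^Y(i)].
    """
    rng = np.random.default_rng(seed)
    K = trans.shape[0]
    r = np.empty(n_steps, dtype=int)
    z = np.empty((n_steps, 2))                 # z[n] = (x_n, y_n)
    r[0] = rng.integers(K)                     # assumption: uniform initial switch
    z[0] = M[r[0]] + rng.standard_normal(2)    # assumption: unit-variance start
    for n in range(n_steps - 1):
        i = r[n]
        j = rng.choice(K, p=trans[i])          # R is a Markov chain with matrix t
        # Assumed offset so that the regime means are M (see lead-in).
        offset = M[j] - A[i, j] @ M[i]
        z[n + 1] = A[i, j] @ z[n] + B[i, j] @ rng.standard_normal(2) + offset
        r[n + 1] = j
    return z[:, 0], r, z[:, 1]

# Hypothetical two-regime parameters, for illustration only.
trans = np.array([[0.95, 0.05], [0.05, 0.95]])
A = np.tile(np.array([[0.5, 0.2], [0.0, 0.4]]), (2, 2, 1, 1))  # A^{yx} = 0
B = np.tile(0.5 * np.eye(2), (2, 2, 1, 1))
M = np.array([[0.0, 0.0], [2.0, 2.0]])
x, r, y = simulate_cgomsm(500, trans, A, B, M)
```

The zero in the lower-left entry of every `A[i, j]` is what makes \(Y_{n+1}\) independent of \(X_n\) given \((R_n^{n+1}, Y_n)\), the property that fast exact filtering relies on.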
We assume that \(\varvec{\mathrm {R}} _1^N\) takes its values in a discrete finite set of K switches \(\varOmega =\{1,...,K\}\). This hard-jump model has been widely used in contexts involving switching regime Markov systems. Its success comes from its ability to represent non-linear dynamic patterns, which are inherent in several applications (analysis of economic and financial time series [11], sustainable energy [6], robotics [7, 8, 12], etc.).
However, this model does not take into account the intrinsic imprecision of the switches in real-world applications. In fact, hard jumps induce discontinuity in the dynamic behavior of the studied system. This transitory imprecision can be handled with fuzzy modeling, which allows each switch to take its value as a mixture of several components simultaneously. Fuzzy modeling has been widely incorporated in applications dealing with Markov models [13,14,15]. In this paper, we present a new method to approximate non-linear Markov systems with a variant of the CGOMSM based on fuzzy switches (hereafter called CGOMFSM).
The remainder of this paper is organized as follows. Sections 2 and 3 detail the formulation of the fuzzy switching model, first with K hard classes and then with two hard components. Section 4 describes the adaptation of the CGOMSM algorithms for the computation of posterior marginal probabilities and for smoothing to their fuzzy counterpart. Section 5 presents experimental results, and Section 6 draws conclusions and outlines future work.
2 Fuzzy Switching Model with K Hard Classes
2.1 Probability Distribution of Fuzzy Vectors
In the fuzzy switching system, which extends the hard case with K classes \(\varOmega =\{1,...,K\}\), we assume that each jump \(r_n^K\) is a vector in \([0,1]^K\). So \(r_n^K=(\varepsilon _1,...,\varepsilon _K)\), and each component \(\varepsilon _k\) can be seen as the “fuzzy part” of the hard class k, with \(\sum \limits _{k=1}^{K}{\varepsilon _k}=1\). This is an extension of the hard case, as each \(\varepsilon =(\varepsilon _1,...,\varepsilon _K)\) of the form \(\varepsilon =(0,...,0,\varepsilon _k=1,0,...,0)\) is identified with the hard class k. The distribution of \(R_n^K\) is then a probability distribution on \(F^K\subset [0,1]^K\), with \(F^K\) the set of \(\varepsilon =(\varepsilon _1,...,\varepsilon _K)\) verifying \(\sum \limits _{k=1}^{K}{\varepsilon _k}=1\). Such a probability distribution can be defined in different ways; we propose to adopt the following one. Let \(\delta _0\) be the Dirac mass at 0, \(\delta _1\) the Dirac mass at 1, and \(\mu \) the Lebesgue measure on ]0, 1[, and let \(\nu =\delta _0+\delta _1+\mu \). Since \(\varepsilon _K=1-(\varepsilon _1+...+\varepsilon _{K-1})\), it is sufficient to define the distribution of the random vector \(R_n^{K-1}=(R_n^1,...,R_n^{K-1})\) on \(F_{K-1}\subset [0,1]^{K-1}\), whose elements verify \(\varepsilon _1+...+\varepsilon _{K-1}\le 1\). Let us define this distribution by a density f with respect to the measure \(\nu ^{\otimes (K-1)}\). Thus, we have:
The case \(K=2\) is dealt with in the next sections; the example below specifies the case \(K=3\).
Example
Let \(K=3\). Here the distribution on \(F_2\subset [0,1]^2\) (whose elements verify \(\varepsilon _1+\varepsilon _2\le 1\)) is \(P_{E^2}=f\nu ^{\otimes 2}=f(\delta _0+\delta _1+\mu )^{\otimes 2}=f(\delta _{00}+\delta _{01}+\delta _{10}+\delta _{11}+\delta _0\otimes \mu + \mu \otimes \delta _0+\delta _1\otimes \mu + \mu \otimes \delta _1+\mu \otimes \mu )\).
However, we recall that \(f(\varepsilon _1, \varepsilon _2)=0\), for \((\varepsilon _1, \varepsilon _2)\in [0,1]^2\) such that \(\varepsilon _1+\varepsilon _2>1\). The sets \(F^3\subset [0,1]^3\) and \(F_2\subset [0,1]^2\) are illustrated in Fig. 2.
2.2 Joint Densities
In the Markov context considered in this paper, we have to define the joint densities of the distributions of \((R_n, R_{n+1})\), defined on \(F_{K-1}\times F_{K-1}\). We consider f of the form:
We choose \(\phi (r_n,r_{n+1})= \left( 1-\frac{\Vert { r_{n+1} - r_n }\Vert }{\sqrt{2}}\right) ^r,~r\in \mathbb {R}\), where \( \Vert {r_{n+1}-r_n}\Vert \) is the Euclidean distance between two consecutive switches. Then:
Example
When the number of hard switches equals 3, the normalization condition gives:
with \(\phi (u,v)=\left( 1-|u-v|\right) ^r\).
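To make the role of the exponent r concrete, here is a small sketch of the similarity function \(\phi \) in its vector form, evaluated on switches of the \(K=3\) simplex. The function name `phi` and the three test vectors are illustrative; the clipping at zero is an added numerical safeguard, not part of the definition.

```python
import numpy as np

def phi(r_n, r_next, r=2.0):
    """Similarity (1 - ||r_next - r_n|| / sqrt(2))**r between two fuzzy switches."""
    diff = np.asarray(r_next, dtype=float) - np.asarray(r_n, dtype=float)
    # Two distinct hard classes are at distance sqrt(2), the largest possible
    # distance on the simplex, so the base lies in [0, 1]; clip rounding noise.
    base = max(0.0, 1.0 - np.linalg.norm(diff) / np.sqrt(2.0))
    return base ** r

hard_1 = [1.0, 0.0, 0.0]      # hard class 1
hard_2 = [0.0, 1.0, 0.0]      # hard class 2
halfway = [0.5, 0.5, 0.0]     # fuzzy switch midway between classes 1 and 2

same = phi(hard_1, hard_1)          # identical switches: maximal similarity
opposite = phi(hard_1, hard_2)      # two different hard classes
mild = phi(hard_1, halfway, r=2.0)
sharp = phi(hard_1, halfway, r=8.0)
```

Identical switches receive the largest value and two different hard classes the smallest, while raising r shrinks the value assigned to an intermediate move such as `hard_1` to `halfway`, so that the density concentrates on similar consecutive switches.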
2.3 Parameters Interpolation
The model matrices of the fuzzy switching model can be calculated by linear interpolation using the following formula:
where A(i, j) and B(i, j) are the model matrices corresponding to the hard components i and j. M(i, j) is the mean vector for hard switches i and j.
The fuzzy switching model can be implemented by an adequate quantization of the interval [0, 1] into F discrete fuzzy levels. The larger F is, the more accurate the representation of the data, but a large number of fuzzy levels leads to a high computation time. For example, when the number of crisp components equals three, setting \(F=3\) yields 15 switches and setting \(F=4\) gives 21 switches.
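The switch counts quoted above can be reproduced with a short enumeration. The sketch below assumes that every component of a switch is restricted to multiples of \(1/(F+1)\), i.e. the two hard values 0 and 1 plus F intermediate levels; this grid convention is an assumption chosen because it matches the quoted counts, and the name `fuzzy_switches` is hypothetical.

```python
from itertools import product
from math import comb

def fuzzy_switches(K, F):
    """All points of the K-class simplex with components in {0, 1/(F+1), ..., 1}."""
    steps = F + 1
    grid = []
    # Choose the first K-1 components freely; the last one is fixed by the
    # constraint that the components sum to one.
    for head in product(range(steps + 1), repeat=K - 1):
        if sum(head) <= steps:
            last = steps - sum(head)
            grid.append(tuple(k / steps for k in (*head, last)))
    return grid

counts = {F: len(fuzzy_switches(3, F)) for F in (1, 2, 3, 4)}
# Closed form under the same assumption: C(F + K, K - 1) grid points.
closed_form = {F: comb(F + 3, 2) for F in (1, 2, 3, 4)}
```

Under this convention the number of switches grows as \(\binom{F+K}{K-1}\), which makes the trade-off between accuracy and computation time explicit: the transition matrix has that many rows and columns.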
3 Fuzzy Switching Model with Two Hard Components
In the remainder of the paper, we consider the case of two hard switches \(\varOmega =\{0,1\}\). To model fuzzy switches, we consider that each random variable \(R_{n}\) in \(\varvec{\mathrm {R}} _1^N\) takes its values in the continuous interval [0, 1], instead of the set \(\{0,1\}\).
Consider the pair \((\varepsilon _n^0, \varepsilon _n^1)\in [0,1]^2\), in which \(\varepsilon _n^i\) represents the contribution of the hard component i to the switch \(r_n\). Without loss of generality, let \(\varepsilon _n=\varepsilon _n^1 = 1- \varepsilon _n^0\). Then we have \(R_{n}= \varepsilon _n\):
- \(\varepsilon _n = 0 \) if the switch is the hard component 0.
- \(\varepsilon _n \in \, ]0,1[ \) if the switch is fuzzy.
- \(\varepsilon _n = 1 \) if the switch is the hard component 1.
So this model is able to represent signals with both discrete (hard) and continuous (fuzzy) components. Let \(\nu =\delta _0+\delta _1+\mu \).
We assume that \(\varvec{\mathrm {R}} _1^N\) is stationary, so that \(p(r_n^{n+1})\) does not depend on n. Let us now precisely define the joint a priori density \(p(r_1^2)\), where the notation \(r_1^2\) represents the pair \((r_1, r_2)\). The density \(p(r_1^2)\) is defined with respect to the product measure \(\nu \otimes \nu \), under the normalization condition (Fig. 3).
The joint prior densities can be computed analytically by quantizing the interval [0, 1] into F equal-length sub-intervals \(\left[ \frac{i}{F},\frac{i+1}{F} \right] \), as described in Fig. 4. Using this scheme, the normalization condition in Eq. (4) yields:
Each sub-interval can be represented by its midpoint \(\frac{2i+1}{2F}\). So, in this discrete approximate scheme, the joint a priori density can be defined by a \((2+F) \times (2+F)\) matrix.
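The discretization just described can be sketched as follows. The two hard values carry the unit Dirac masses of \(\nu \), and each midpoint stands for a sub-interval of Lebesgue length \(1/F\). The unnormalized density used here keeps only a similarity term \((1-|u-v|)^r\); this is a deliberate simplification, and the function names `switch_grid` and `joint_prior` are hypothetical.

```python
import numpy as np

def switch_grid(F):
    """States {0, midpoints (2i+1)/(2F), 1} and their weights under nu."""
    mids = (2 * np.arange(F) + 1) / (2 * F)
    states = np.concatenate(([0.0], mids, [1.0]))
    # Dirac masses at 0 and 1 weigh 1; each sub-interval has length 1/F.
    weights = np.concatenate(([1.0], np.full(F, 1.0 / F), [1.0]))
    return states, weights

def joint_prior(F, r=2.0):
    """(2+F) x (2+F) matrix of density values p(r_1, r_2) w.r.t. nu x nu."""
    states, weights = switch_grid(F)
    # Assumed unnormalized density: similarity term only (see lead-in).
    g = (1.0 - np.abs(states[:, None] - states[None, :])) ** r
    mass = weights[:, None] * weights[None, :] * g
    return g / mass.sum(), states, weights

p, states, weights = joint_prior(F=3, r=2.0)
# Discrete counterpart of the normalization condition under nu x nu.
total = (weights[:, None] * weights[None, :] * p).sum()
```

With \(F=3\) the grid is \(\{0, 1/6, 1/2, 5/6, 1\}\) and `p` is a \(5\times 5\) matrix whose weighted sum equals one.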
Under the assumption of fuzzy switches, we can define the matrices of the fuzzy model using a bilinear function as follows:
The mean vectors of the fuzzy model are calculated using the following equations:
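A bilinear blend of the hard-switch matrices can be sketched as below. The weights \((1-\varepsilon , \varepsilon )\) on each of the two consecutive switches are the standard bilinear choice and are an assumption here, as are the function name `interpolate` and the placeholder values in `A_hard`.

```python
import numpy as np

def interpolate(hard, eps_n, eps_next):
    """Bilinear blend of hard[i, j] (i, j in {0, 1}) for fuzzy switches in [0, 1]."""
    w_n = np.array([1.0 - eps_n, eps_n])            # weights of hard classes at n
    w_next = np.array([1.0 - eps_next, eps_next])   # weights at n + 1
    weights = np.outer(w_n, w_next)                 # weights[i, j] for pair (i, j)
    # Sum over the two hard-class axes, leaving the matrix axes untouched.
    return np.tensordot(weights, np.asarray(hard, dtype=float), axes=([0, 1], [0, 1]))

# Placeholder hard-switch matrices A(i, j), indexed as A_hard[i, j].
A_hard = np.arange(16, dtype=float).reshape(2, 2, 2, 2) / 16.0

A_corner = interpolate(A_hard, 1.0, 0.0)   # hard pair (1, 0)
A_middle = interpolate(A_hard, 0.5, 0.5)   # fully fuzzy pair
```

At the four corners the blend returns the corresponding hard matrix, and at \((0.5, 0.5)\) it returns the average of the four, so the hard model is recovered as a special case.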
Hence, the CGOMFSM is entirely defined by
- the parameters of the corresponding deterministic hard switching model,
- the number of fuzzy levels F, and
- parameters \(\alpha _{ij}, i,j\in \{0,1\}\), \(\beta \), \(\theta \) and r.
The parameter r controls the homogeneity of the switching model: the larger r is, the higher the probability that two consecutive switches are similar.
Figure 5 shows an example of a simulation of \((\varvec{\mathrm {X}} _1^N,\varvec{\mathrm {Y}} _1^N,\varvec{\mathrm {R}} _1^N)\) using the set of parameters of a fuzzy switching model defined in Table 1. Simulations were performed using the following transition matrix:
In the transition matrix, rows correspond to \(R_n\) and columns correspond to \(R_{n+1}\). The first and last rows and columns represent hard switches, while the other rows and columns correspond to discrete fuzzy switches. This simulation shows the imprecision between hard switches 0 and 1, as illustrated in the trajectory of \(\varvec{\mathrm {Y}} _1^N\). The choice of the transition matrix allows a progressive switch from the parameter set of hard switch 0 to that of hard switch 1.
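The effect of r on the switch trajectory can be explored with a small simulation. The transition matrix below is not the one used for Fig. 5: it is a hypothetical matrix built by row-normalizing a similarity kernel \((1-|u-v|)^r\) weighted by \(\nu \), and the function names are illustrative.

```python
import numpy as np

def transition_matrix(F, r):
    """Hypothetical (2+F) x (2+F) transition matrix favoring similar switches."""
    mids = (2 * np.arange(F) + 1) / (2 * F)
    states = np.concatenate(([0.0], mids, [1.0]))
    weights = np.concatenate(([1.0], np.full(F, 1.0 / F), [1.0]))
    kernel = (1.0 - np.abs(states[:, None] - states[None, :])) ** r
    t = kernel * weights[None, :]          # rows: R_n, columns: R_{n+1}
    return states, t / t.sum(axis=1, keepdims=True)

def simulate_switches(n_steps, F, r, seed=0):
    """Draw a trajectory of discretized fuzzy switches in [0, 1]."""
    rng = np.random.default_rng(seed)
    states, t = transition_matrix(F, r)
    idx = np.empty(n_steps, dtype=int)
    idx[0] = rng.integers(len(states))     # assumption: uniform initial switch
    for n in range(1, n_steps):
        idx[n] = rng.choice(len(states), p=t[idx[n - 1]])
    return states[idx]

def mean_jump(path):
    """Average absolute change between consecutive switches."""
    return np.abs(np.diff(path)).mean()

states, t = transition_matrix(F=3, r=8.0)
jump_low_r = mean_jump(simulate_switches(2000, F=3, r=2.0))
jump_high_r = mean_jump(simulate_switches(2000, F=3, r=20.0))
```

In this construction a direct move from hard switch 0 to hard switch 1 has probability zero, so any regime change has to pass through the fuzzy levels, and the average jump size shrinks as r grows.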
4 Fast Smoothing in the CGOMFSM
Let us denote by \(\varvec{\mathrm {T}} _1^N\) the triplet \((\varvec{\mathrm {X}} _1^N, \varvec{\mathrm {R}} _1^N, \varvec{\mathrm {Y}} _1^N)\). The smoothing problem consists in computing:
from \(p\left( r_{n+1} \left| \varvec{\mathrm {y}}_1^{N} \right. \right) \) and \(\mathbb {E}\left[ \varvec{\mathrm {X}} _{n+1} \left| r_{n+1},\varvec{\mathrm {y}}_1^{n+1} \right. \right] \).
The optimal smoother recursively computes \(p\left( r_{n+1} \left| \varvec{\mathrm {y}}_1^{N} \right. \right) \) and \(\mathbb {E}\left[ \varvec{\mathrm {X}} _{n+1} \left| r_{n+1}, \varvec{\mathrm {y}}_1^{n+1} \right. \right] \) from \(p\left( r_n \left| \varvec{\mathrm {y}}_1^{N} \right. \right) \), \(\mathbb {E}\left[ \varvec{\mathrm {X}} _{n} \left| r_n, \varvec{\mathrm {y}}_1^{n} \right. \right] \) and the model parameters, using the procedure detailed in [9]. The main difference between the CGOMSM and the CGOMFSM is that fuzzy switches involve integration over a continuous domain, which must be discretized according to the number of discrete fuzzy levels F.
Since \((\varvec{\mathrm {R}} _1^N, \varvec{\mathrm {Y}} _1^N)\) is a pairwise Markov chain in the model, we get
and
Since
and from (9), we can derive the following recursive equation:
where \(F_{n+1}(\varvec{\mathrm {r}}_n^{n+1}, \varvec{\mathrm {y}}_n^{n+1})\) and \(H_{n+1}(\varvec{\mathrm {r}}_n^{n+1}, \varvec{\mathrm {y}}_n^{n+1})\) are appropriate matrices. The probabilities \(p(r_n|y_1^n)\) and \(p(r_n, y_1^N)\) are computed recursively in linear time using the forward and backward probabilities of the Markov chain \((Y_1^N,R_1^N)\), defined by \(\alpha _n(r_n)=p(r_n,y_1^n)\) and \(\beta _n(r_n)=p(y_{n+1}^N|r_n,y_n)\).
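For a discretized set of switches, the forward and backward quantities defined above can be computed along the following lines. This is a generic normalized forward-backward pass for a pairwise Markov chain, not a transcription of the paper's equations: the function name and the array layout are assumptions, and the \(\nu \)-weights of the fuzzy levels are assumed to be already folded into `trans`.

```python
import numpy as np

def forward_backward(init, trans):
    """Filtered and smoothed switch probabilities for a pairwise Markov chain.

    init[i]        = p(r_1 = i, y_1)
    trans[n, i, j] = p(r_{n+2} = j, y_{n+2} | r_{n+1} = i, y_{n+1}), n = 0..N-2
    Returns two arrays of shape (N, S): p(r_n | y_1^n) and p(r_n | y_1^N).
    """
    n_steps, n_states = trans.shape[0] + 1, init.shape[0]
    alpha = np.empty((n_steps, n_states))
    beta = np.ones((n_steps, n_states))
    alpha[0] = init / init.sum()
    for n in range(1, n_steps):
        a = alpha[n - 1] @ trans[n - 1]
        alpha[n] = a / a.sum()             # rescaling avoids numerical underflow
    for n in range(n_steps - 2, -1, -1):
        b = trans[n] @ beta[n + 1]
        beta[n] = b / b.sum()
    smoothed = alpha * beta
    smoothed /= smoothed.sum(axis=1, keepdims=True)
    return alpha, smoothed

# Tiny check against brute-force enumeration (arbitrary positive numbers).
rng = np.random.default_rng(1)
init = rng.random(3)
trans = rng.random((2, 3, 3))
filtered, smoothed = forward_backward(init, trans)
joint = np.einsum("i,ij,jk->ijk", init, trans[0], trans[1])
joint /= joint.sum()
```

The per-step rescaling changes \(\alpha _n\) and \(\beta _n\) only by constants that cancel in the final normalization, which is why the normalized \(\alpha _n\) can be read directly as the filtered probability.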
and
Using forward-backward probabilities, we can compute the smoothed and the filtered probabilities as follows:
Posterior marginal probabilities are calculated using the normalized Baum-Welch algorithm. The algorithm computes recursively the forward and backward probabilities. In the case of fuzzy switches, these probabilities are defined as follows:
with \(\varvec{\mathrm {t}}_n(\theta ) = (\varvec{\mathrm {x}}_n, \varvec{\mathrm {y}}_n, r_n=\theta )\).
Then:
5 Experiments
In this section, we present two series of experiments to assess the performance of the exact smoother in the case of scalar data (\(m=q=1\)). In the first series, we evaluate the performance of the fuzzy model on synthetic fuzzy signals; in the second series, we apply our algorithm to smooth simulated Stochastic Volatility (SV) data. In both experiments, parameter estimation is carried out with the EM algorithm on training samples of size T, denoted by \((\varvec{x}_1^T, \varvec{y}_1^T)\). Then we repeatedly generate, according to the considered model, synthetic sequences of size S, denoted by \((\varvec{x}_1^S, \varvec{y}_1^S)\). The smoothing algorithm is then run with the estimated parameters to produce \(\hat{\varvec{x}}_1^S\) from the observed sequence \(\varvec{y}_1^S\). The criterion used to assess the efficiency of the smoothing algorithms is the mean squared error (MSE), defined as follows:
$$\begin{aligned} \mathrm {MSE}=\frac{1}{S}\sum \limits _{s=1}^{S}\left( x_s-\hat{x}_s\right) ^2. \end{aligned}$$
5.1 Smoothing Synthetic Fuzzy Signals
We define the distribution of the random process \((\varvec{\mathrm {X}} _1^N,\varvec{\mathrm {Y}} _1^N,\varvec{\mathrm {R}} _1^N)\) by the Gaussian distributions \(p(z_1,z_2|r_1,r_2)\), where \(Z_n=(X_n^T,Y_n^T)^T\). Let \(\varGamma _{i,j}\) be the covariance matrix of the stacked vector \((Z_1^T,Z_2^T)^T\) given \(r_1=i\) and \(r_2=j\): \(\varGamma _{i,j}=\left[ \begin{array}{cccc} 1 &{} b_i &{} a_{ij}&{}d_{ij}\\ b_i&{}1&{}e_{ij}&{}c_{ij}\\ a_{ij}&{}e_{ij} &{} 1 &{}b_j\\ d_{ij}&{} c_{ij} &{}b_j &{}1 \end{array}\right] \).
To ensure that the model follows the definition of the CGOMSM (\(A^{yx}(R_n^{n+1})=0\)), we take \(d_{ij}=b_{i}c_{ij}\); we also take \(a_{ij}=c_{ij}\) and \(e_{ij}=d_{ij}\). Moreover, we consider that \(a_{ij}=a_{i}\). Hence, the simulation model is defined by the set of parameters \(\varTheta =\{a_0, a_1, b_0, b_1\}\). We consider five cases defined by the parameter sets \(\varTheta \) detailed in Table 2. For each case, we generate a fuzzy signal with 5 discrete fuzzy switches. We then restore the hidden process with the CGOMFSM for values of F ranging from 1 to 5. For each data set, we consider three values \(r\in \{2,8,20\}\). For each case, we perform 10 independent experiments with \(S=1000\). Each experiment consists in generating a training sample \((\varvec{X}_1^T, \varvec{Y}_1^T)\) of size \(T=20000\), used for 100 iterations of the EM algorithm. Figure 6 shows examples of trajectories of the simulated process \(\varvec{\mathrm {R}} _1^N\) for three different values of r. Figure 7 illustrates an example of simulated data using the case 2 parameters, together with the optimal (but approximated) smoothing output. Table 3 reports the MSE results for the 5 different cases of fuzzy models and the different values of r.
The experimental results show that, as the number of fuzzy levels increases, the smoothed signal gets closer to the “ground-truth” hidden signal. Moreover, we observe that the higher the homogeneity parameter r, the lower the estimation error.
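The covariance constraints used in this subsection can be checked numerically. The sketch below reads \(\varGamma _{i,j}\) as the covariance of \((X_1,Y_1,X_2,Y_2)\) and derives the transition matrix and noise covariance by standard Gaussian conditioning; the values \(a=0.5\) and \(b_i=b_j=0.3\) are hypothetical and are not taken from Table 2, and the function names are illustrative.

```python
import numpy as np

def gamma(a, b_i, b_j):
    """4x4 covariance of (X_1, Y_1, X_2, Y_2) under the stated constraints."""
    c = a            # a_ij = c_ij
    d = b_i * c      # d_ij = b_i c_ij, which forces A^{yx} = 0
    e = d            # e_ij = d_ij
    return np.array([[1.0, b_i, a, d],
                     [b_i, 1.0, e, c],
                     [a, e, 1.0, b_j],
                     [d, c, b_j, 1.0]])

def transition_parameters(G):
    """A and Q = B B^T with Z_2 = A Z_1 + B W for a zero-mean Gaussian (Z_1, Z_2)."""
    G11, G12 = G[:2, :2], G[:2, 2:]
    G21, G22 = G[2:, :2], G[2:, 2:]
    A = G21 @ np.linalg.inv(G11)     # regression of Z_2 on Z_1
    Q = G22 - A @ G12                # conditional covariance of Z_2 given Z_1
    return A, Q

G = gamma(a=0.5, b_i=0.3, b_j=0.3)
A, Q = transition_parameters(G)
B = np.linalg.cholesky(Q)            # one possible square root of Q
```

For these values the lower-left entry of `A` vanishes, as required by the CGOMSM definition, and `A` reduces to \(a\) times the identity; any other square root of `Q` would serve equally well as `B`.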
5.2 Experiments on Stochastic Volatility Models
Stochastic volatility (SV) models are widely used to model the time-varying variance of stochastic processes [10]. Several variants of SV models have been studied (Heston, CEV, GARCH, Chen, etc.). In this paper, we consider the standard SV model defined as follows:
where \(U_i\), \(V_i\) are independent standard Gaussian vectors. The SV model is defined by the set of parameters \(\sigma \), \(\mu \), and \(\alpha \). The main conclusion is that, as the number of discrete fuzzy states increases, the results of the model approach those of the optimal (but time-consuming) particle smoother (Table 4).
6 Conclusion
In this paper, we presented a novel approach to approximate non-linear Markov systems using the Conditionally Gaussian Observed Markov Fuzzy Switching Model (CGOMFSM). The chief novelty of this work is the introduction of fuzzy jumps instead of classical crisp states. The model still allows exact (up to the required quantization) and fast smoothing equations. The fuzzy jumps allow transient modification of the parameters, which is more appropriate for real-world applications. Future work includes the evaluation of the model on real data.
References
1. Abbassi, N., Benboudjema, D., Derrode, S., Pieczynski, W.: Optimal filter approximations in conditionally Gaussian pairwise Markov switching models. IEEE Trans. Autom. Control 60(4), 1104–1109 (2015)
2. Salzenstein, F., Collet, C.: Fuzzy Markov random fields versus chains for multispectral image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 28(11), 1753–1767 (2006)
3. Caillol, H., Hillion, A., Pieczynski, W.: Fuzzy random fields and unsupervised image segmentation. IEEE Trans. Geosci. Remote Sensing 34(4), 801–810 (1993)
4. Gorynin, I., Derrode, S., Monfrini, E., Pieczynski, W.: Exact fast smoothing in switching models with application to stochastic volatility. In: EUSIPCO, Nice, France, pp. 924–928, 31 August - 4 September 2015
5. Caillol, H., Pieczynski, W., Hillion, A.: Estimation of fuzzy Gaussian mixture and unsupervised statistical image segmentation. IEEE Trans. Image Process. 6(3), 425–440 (1997)
6. Yang, L., He, M., Zhang, J., Vittal, V.: Support vector machine enhanced Markov model for short term wind power forecast. IEEE Trans. Sustain. Energy 6(3), 791–799 (2015)
7. Artemiadis, P.K., Kyriakopoulos, K.J.: A switching regime model for the EMG-based control of a robot arm. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 41(1), 53–63 (2011)
8. Corff, S.L., Fort, G., Moulines, E.: Online expectation maximization algorithm to solve the SLAM problem. In: 2011 IEEE Statistical Signal Processing Workshop (SSP), pp. 225–228, Nice, June 2011
9. Gorynin, I., Derrode, S., Monfrini, E., Pieczynski, W.: Fast filtering in switching approximations of non-linear Markov switching systems with application to stochastic volatility. IEEE Trans. Autom. Control 62(2), 853–862 (2017)
10. Ghysels, E., Harvey, A., Renault, E.: Stochastic volatility. In: Handbook of Statistics, vol. 14, pp. 119–192 (1995)
11. Koko, M.: Application of Markov-switching model to stock returns analysis. Dyn. Econometric Models 7, 259–268 (2006)
12. Baltzakis, H., Trahanias, P.: A hybrid framework for mobile robot localization: formulation using switching state-space models. Auton. Robots 15(2), 169–191 (2003)
13. Salzenstein, F., Pieczynski, W.: Parameter estimation in hidden fuzzy Markov random fields and image segmentation. Graph. Model Image Process. 59(4), 205–220 (1997)
14. Carincotte, C., Derrode, S., Sicot, G., Boucher, J.M.: Unsupervised image segmentation based on a new fuzzy hidden Markov chain model. In: IEEE International Conference on Acoustic, Speech, Signal Processing, Montreal, Canada, May 2004 (2004)
15. Carincotte, C., Derrode, S., Bourennane, S.: Unsupervised change detection on SAR images using fuzzy hidden Markov chains. IEEE Trans. Geosci. Remote Sensing 44(2), 432–441 (2006)
16. Pieczynski, W.: Exact filtering in conditionally Markov switching hidden linear models. Comptes Rendus Mathématique 349(9–10), 587–590 (2011)
17. Gamal-Eldin, A., Salzenstein, F., Collet, C.: Hidden fuzzy Markov chain model with K discrete classes. In: 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), Kuala Lumpur, pp. 109–112, May 2010
© 2017 Springer International Publishing AG
Bouyahia, Z., Derrode, S., Pieczynski, W. (2017). An Exact Smoother in a Fuzzy Jump Markov Switching Model. In: Ben Amor, B., Chaieb, F., Ghorbel, F. (eds) Representations, Analysis and Recognition of Shape and Motion from Imaging Data. RFMI 2016. Communications in Computer and Information Science, vol 684. Springer, Cham. https://doi.org/10.1007/978-3-319-60654-5_10
DOI: https://doi.org/10.1007/978-3-319-60654-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60653-8
Online ISBN: 978-3-319-60654-5
eBook Packages: Computer Science (R0)