Abstract
Deep learning (DL) has captured the attention of the community with an increasing number of recent papers in regression applications, including surveys and reviews. Despite the efficiency and good accuracy in systems with high-dimensional data, many DL methodologies have complex structures that are not readily transparent to human users. Accessing the interpretability of these models is an essential factor for addressing problems in sensitive areas such as cyber-security systems, medical, financial surveillance, and industrial processes. Fuzzy logic systems (FLS) are inherently interpretable models capable of using nonlinear representations for complex systems through linguistic terms with membership degrees mimicking human thought. This paper aims to investigate the state-of-the-art of existing deep fuzzy systems (DFS) for regression, i.e., methods that combine DL and FLS with the aim of achieving good accuracy and good interpretability. Within the concept of explainable artificial intelligence (XAI), it is essential to contemplate interpretability in the development of intelligent models and not only seek to promote explanations after learning (post hoc methods), which is currently well established in the literature. Therefore, this work presents DFS for regression applications as the leading point of discussion of this topic that is not sufficiently explored in the literature and thus deserves a comprehensive survey.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The goal in regression is to predict one or more variables \(\varvec{\textbf{y}}(k)=(y_{1}(k),\ldots ,y_{m}(k))^T\) from the information provided by measurements \(\varvec{\textbf{x}}(k)=(x_{1}(k),\ldots ,x_{p}(k))^T\) for a given sample k. Customary, \(\varvec{\textbf{y}}(k)\) are refereed as targets, outputs, or dependent variables, while \(\varvec{\textbf{x}}(k)\) are commonly refereed as predictors, inputs, covariates, regressors, or independent variables. Regression models covers several application areas, as economic growth problems [1, 2], air quality prediction [3, 4], medicine [5, 6], chemical industries [7, 8], and industrial processes [9, 10]. Recent studies show that regression models have become predominant in increasingly complex real-world systems due to the large availability of data, inclusion of nonlinear parameters, and other aspects intrinsic to the application area. For complex systems, traditional machine learning techniques (i.e., non-deep/shallow techniques) may become limited due to two main challenges: big-data explosion (high-dimensionality, high number of observations) and increase in complexity, caused by dynamics of nowadays applications. Deep learning (DL) methods have gained prominence in recent years due to their ability to represent systems in complex structures with multiple levels of abstraction and high-level features. However, DL may have limitations such as dependence on a high number of samples, hyperparameter sensitivity, and interpretability issues [11, 12], which limits the application in critical systems or one that requires accountability in its results. In this sense, deep fuzzy systems (DFS) have emerged as a viable method over DL to balance accuracy and interpretability in complex real-world systems.
Regression models can be categorized as [13]: (i) “white-box” when the input–output map** is built upon first-principle equations, (ii) “black-box” when the map** is derived from the data (also referred to as data-driven modeling), or (iii) “grey-box” when the knowledge about the input–output map** is known beforehand and integrated along with data-driven modeling. White-box models are advantageous for promoting interpretations of the internal mechanisms associated with input–output map**. On the other hand, black-box models can address complex systems with predictive analysis without prior knowledge of the system. Grey-box models can combine the interpretability presented by “white-box” and the ability to learn from data given by “black-box” models (e.g., fuzzy systems). Regarding data-driven modeling, the dependency between input and output can be built by linear or nonlinear models. A regression model is linear when the rate of change between input–output is constant due to the linear combination of the inputs. Examples include the multivariate linear regression models. Data-driven nonlinear regression is adopted when the input–output dependence is nonlinear and can not be covered by linear modeling. There is a plethora of methods for nonlinear regression, and its applicability is problem-dependent. Examples include fuzzy systems, support vector regression, artificial neural networks (e.g., non-deep/shallow and deep networks), and rule-based regression (e.g., decision trees and random forest).
Following are brief discussions of recent surveys and reviews that have showcased the applicability of DL to various regression application domains. The work of Han et al. [14] reviews DL models for time-series forecasting, where the DL models are categorized as: (i) discriminative, where the learning stage is based on the conditional probability of the output/target given an observation, (ii) generative, which learn the joint probability of both output and observation, with the generation of random instances), or (iii) hybrid, a combination of generative and discriminative DL methods. The authors demonstrated that DL models are effective at discriminating complex patterns in time series with high-dimensional data by implementing them in benchmark systems and using a real-world use case from the steel industry. Sun et al. [11] discuss the use of DL for soft sensor applications, showing the trends, the scope of applications in industrial processes, and the best practices for model development. The authors established some directions for future research, such as working on DL solutions to overcome the limitation in learning in scenarios with a lack of labeled samples (e.g., semi-supervised methods), hyperparameter optimization, solutions to improve model reliability (e.g., model visualization), and the development of DL methods with distributed and parallel modeling. Torres et al. [15] survey DL architectures for time-series forecasting. Furthermore, the authors discuss the practical aspects that must be considered when using DL methods to solve complex real-world problems in big-data settings, which include the existing libraries/toolbox, the techniques for automatic optimization of model structures, and the hardware infrastructure. Pang et al. [16] and Chalapathy et al. [29] and [30] investigated some recent trends in DFS models and real-world applications (e.g., time series forecasting, natural language processing, traffic control, and automatic control). However, despite the benefit of adopting FLS principles to DL systems, no comprehensive survey or review has been conducted focusing exclusively on deep fuzzy for regression problems.
This paper surveys and discusses the state-of-the-art on deep fuzzy techniques developed to deal with a diverse range of regression applications. Initially, Sect. 2 presents fundamental concepts about FLS and XAI. Then, an overview of DL techniques commonly used for regression will be presented in Sect. 3, namely Convolutional Neural Networks, Deep Belief Networks, Multilayer Autoencoders, and Recurrent Neural Networks. Next, Sect. 4 shows the literature on deep fuzzy systems in two ways: (i) standard deep fuzzy systems, based on fundamental FLS; and (ii) hybrid deep fuzzy systems, with the combination of FLS and the conventional deep models discussed in Sect. 3. Finally, Sect. 5 presents general discussions based on the state-of-the-art surveyed.
2 Background
2.1 Fuzzy Logic Systems
Developed initially by Lotfi A. Zadeh [31], FLS are rule-based systems composed mainly of an antecedent part, characterized by an “IF” statement, and a consequent part, characterized by a “THEN” statement, allowing the transformation of a human knowledge base into mathematical formulations, thus introducing the concept of linguistic terms related to the membership degree.
The basic configuration of a fuzzy logic system, shown in Fig. 1, depends on an interface that transforms the real input variables into fuzzy sets (fuzzifier), which are interpreted by a fuzzy inference model to perform an input–output map** based on fuzzy rules. Thus, the mapped fuzzy outputs go through an interface that transforms them into real output variables (defuzzifier) [32, 33].
Some well-known fuzzy systems are Mamdani fuzzy systems [34], Takagi-Sugeno (T-S) fuzzy systems [35], and Angelov-Yager’s (AnYa) fuzzy rule-based systems using data clouds [36]. Among these, the T-S fuzzy models stand out for their ability to decompose nonlinear systems into a set of linear local models smoothly connected by fuzzy membership functions [37]. T-S fuzzy models are universal approximators capable of approximating any continuous nonlinear system, that can be described by the following fuzzy rules [38]:
where \(R^{i}\) (\(i=1,\ldots ,N\)) represents the i-th fuzzy rule, N is the number of rules, \(x_1(k), \ldots ,x_p(k)\) are the input variables of the T-S fuzzy system, \(F^{i}_{j}\) are the linguistic terms characterized by fuzzy membership functions \(\mu ^{i}_{F^{i}_{j}}\), and \(\varvec{\textbf{f}}(x_{1}(k),\ldots ,x_{p}(k))\) represents the function model of the system of the i-th fuzzy rule [39].
2.2 Explainable Artificial Intelligence
Although the explainable artificial intelligence (XAI) concept is often associated with a homonymous program formulated by a group of researchers from the Defense Advanced Research Projects Agency (DARPA) [40], the principles related to explainability gained strength from the 1970 s onwards. The earliest works presented rule-based structures and decision trees with human-oriented explanations, such as the MYCIN system proposed in [41] developed for infectious disease therapy consultation, the tutoring program GUIDON proposed in [42] based on natural language studies, among numerous other systems [43,44,45,46]. Although many authors commonly use the terms “explainability” and “interpretability” as synonyms, Rudin [47] discusses the problem of using purely explainable methods to only provide explanations from the results obtained in black-box models (post hoc analysis), demystifying the importance of develo** inherently interpretable methodologies with causality relationships that are understandable to human users.
3 Deep Regression Overview
Deep neural networks (DNN) have emerged due to their architecture with multiple levels of representation and their remarkable performance in a variety of tasks [48]. This section presents the DL techniques commonly employed in regression problems to provide a better understanding and context for the literature review of deep fuzzy regression performed in Sect. 4.
3.1 Convolutional Neural Networks
Convolutional neural networks (CNNs or ConvNets) are feedforward neural networks with a grid-like topology that are used for applications such as time-series data processing in 1-D grids and image data processing in 2-D pixel grids [49]. Figure 2 presents a CNN architecture for processing time-series data in 1-D grids.
The “neocognitron” model proposed in [50] is frequently referred to as the inspiration model for what is currently known about CNNs. First proposed in [51], the neocognitron aimed to represent simple and complex cells from the visual cortex of animals which present a mechanism capable of detecting light in receptive fields [52].
CNNs use a math operation on two functions called “convolution”, with the first function referred to as input, the second function as kernel, and the convolution’s output as the feature map [49]. Outputs from the convolution layer go through a pooling layer (downsampling), which performs an overall statistic of the adjacent outputs by reducing the size of the data and associated parameters via weight sharing [11, 53]. After the data is processed along with the layers that alternate between convolution and pooling, the final feature maps go through a fully connected (dense) layer to extract high-level features. For regression problems, the extracted features can be combined in a prediction mechanism with an activation function or a supervised learning model (e.g., support vector regression) to estimate the final output [54, 55].
CNNs in the context of regression have been explored in several domains, for example, traffic flow forecasting of real road networks [56, 57], prediction of natural environmental factors using data from meteorological institutes [58,59,60,61], industrial process optimization [62,63,64], electric load forecasting [65,66,67,68,69], and chemical process analysis [70, \(\varvec{\textbf{W}} = \{\varvec{\textbf{w}}_{1},\varvec{\textbf{w}}_{2},\ldots ,\varvec{\textbf{w}}_{L}\}\) between layers, are fine-tuned by a supervised method (e.g., backpropagation algorithm) [96].
Recent studies using AEs within a deep architecture for regression have considered soft sensing applications for quality variable prediction [97,98,99,100,101]. Other cases of applications in industrial processes include CNC turning machines [102, 103], end-point quality prediction [104, 105], and prognostics [106,107,108,109]. Other works were developed for time-series forecasting applications, such as natural environmental factors [110,179], which performs the construction of fuzzy rules from a small-scale observation set (data pairs) with input–output map**. Finally, a random local loop optimization strategy is performed to remove feature combinations and corresponding subsystems with low correlation to achieve fast convergence [178]. The performance of RLODFS on the prediction of 12 real-world datasets from the UCI repositoryFootnote 1 was compared with other methods such as DBN, LSTM, and generalized regression neural network (GRNN). From the authors’ perspective, RLODFS has good interpretability due to its structure, clear physical meaning of its parameters, and the ease of locating fuzzy rules that may fail for future optimizations. However, there is a lack of transparency in using input sharing strategies, which increases the method complexity. Furthermore, the selection of the number of features per fuzzy subsystem needs to be carefully done, whose manual/arbitrary choice may not reflect the real needs of the case study under analysis.
The structure of RLODFS based on grou** and input sharing, adapted from [178]
The study in [180] proposes a stacked structure composed of double-input rule modules and interval type-2 fuzzy models, abbreviated as IT2DIRM-DFM. The proposed model is illustrated in Fig. 10, where four layers are presented: the input layer, which deals directly with the original input data, grouped two-by-two in each rule module; the stacked layer, where the signals coming from the input layer become the inputs of the first layer, whose output becomes the input of the second layer, and so on; the dimension reduction layer, where the width of the IT2DIRM-DFM is hierarchically reduced until it becomes two; and the output layer, where the latest IT2DIRM-DFM model produces the final forecasting results. Still, the authors discuss the interpretability of the model showing layered learning and fuzzy rules composed of only two variables in the antecedent, partitioned into interval type-2 second-order rule partitions. The resulting model was evaluated in two real-world applications, using subway passenger data from Buenos Aires, Argentina, and traffic flow data from California Highway System. The results allowed to verify the interpretability from consistent readability of which partitions and bounds of the inputs are used in each fired rule from each rule module. However, the proposed method is only interpretable locally in each rule module and not globally, whose structure does not reflect the depth or number of layers needed to address the experiments. Furthermore, it is not clear the motivations for considering a function, denoted as \(\text {f}\), in each layer related to the worst-performing module (this is shown at the bottom of Fig. 10).
The structure of the double-input rule modules stacked deep interval type-2 fuzzy model (IT2DIRM-DFM), adapted from [180]
Another work that explores double-input rule modules within a stacked deep fuzzy model (in a hierarchical way) was proposed in [181] using datasets related to photovoltaic power plants from Belgium and China. The authors investigate the interpretability of the resulting model, called DIRM-DFM, with conclusions similar to [180], mainly regarding the composition of fuzzy rules. However, in DIRM-DFM, they promote more transparency and simplicity. Some other studies deal with interval type-2 fuzzy models for deep learning in regression problems. In [182], a novel dynamic fractional-order deep learned type-2 FLS was proposed and constructed using singular value decomposition and uncertainty bounds type-reduction. The resulting model was implemented with two chaotic benchmark system simulations, a simulation for the prediction of the glucose level of type-1 diabetes patients, and a dataset of a heat transfer system with an experimental setup. In addition to determining the limit values of the input data (upper and lower singular values), the authors used stability criteria of fractional-order systems, allowing to reduce the necessary number of fuzzy rules and reduce the complexity of nonlinear systems. An evolving recurrent interval type-2 intuitionistic fuzzy neural network (FNN) was proposed in [183], and it was evaluated using regression datasets from the KEEL repository,Footnote 2 Mackey–Glass time series, and a simulated second-order time-varying system. Intuitionistic evaluation, fire strength of membership degree and strategies for adding and removing fuzzy rules were considered to improve uncertainty modeling. Both studies in [182] and [183] did not present an analysis of the interpretability of their models.
Methods that use multiple neuro-fuzzy systems in a deep hierarchical structure were proposed in [184] and [185]. The work in [184] proposed a deep model that cascades multiple neuro-fuzzy systems modified as multivariable generalized additive models, with application in real ecological time series, the Darwin sea level pressure. The resulting model manages to locally detail the mechanisms of each neuro-fuzzy system in each layer. However, it presents incomplete discussions and results regarding the increase in the network depth and the influence of the inputs and partial outputs of the layers on the final prediction. In [185], a hybrid cascade neuro-fuzzy network was proposed, which is composed of multiple extended neo-fuzzy neurons with adaptive training designated for online non-stationary data stream handling. Each layer has a generalization node that performs a weighted linear combination to obtain an optimal output signal. The experimental results in electrical loads prediction, using Southern Ukraine’s data from 2012, show the authors’ search for better accuracy, although at the cost of increasing membership functions to cover the input space and increasing adjusted parameters (weights).
The work in [186] proposed a deep learning recurrent type-3 fuzzy system applied for modeling renewable energies (i.e., power generation of a 660kW wind turbine and solar radiation generated by sunlight simulator). The proposed methodology showed good performance compared to other methods, such as multilayer perceptron, type-1 FLS, type-2 FLS, and interval type-3 FLS, despite the lack of a more elaborate discussion of the presented results. Furthermore, transparency is not guaranteed regarding the modeling steps and the influence of various parameters optimized during learning. The authors in [187] proposed a self-organizing FNN with incremental deep pre-training, abbreviated as IDPT-SOFNN, to promote efficient feature extraction and dynamic adaptation in the structure according to error-reduction rate. IDPT-SOFNN was implemented for the prediction of Mackey–Glass time series, total phosphorus concentration in wastewater treatment plant, and air pollutant concentration. In [188], a deep fuzzy cognitive map was proposed for multivariable time-series forecasting applications, such as air quality indexes, traffic speed of six road segments in China, and two benchmark datasets from the UCI repository, the electric power consumption and the temperature of a monitor system. The analysis of the model’s interpretability was based on nonlinear and nonmonotonic influences of unknown exogenous factors. The work in [189] proposed a deep FNN composed of an input layer, four hidden layers (membership functions, T-norm operation, linear regression, and aggregation), and an output layer, designed exclusively for intra- and inter-fractional variational prediction for multiple patients’ breathing motion.
Table 1 summarizes the literature on Standard DFS. They are categorized according to the application domain and regarding its interpretability categories. The survey shows that the application domain is vast, with applications in the domain of industrial systems, power systems, traffic systems, and multivariable benchmark systems.
Table 1 shows that only four out of eleven works discuss interpretability, despite most of their proposed methods being transparent. Also, not all methods provide post hoc explanations, and of these, the local scope is more frequent. The Standard DFS architecture is easy to implement, with flexibility in the construction of fuzzy rules, having a similar structure to feedforward neural networks. However, these methods may suffer from loss of interpretability when using a non-intuitive hierarchical structure, the lack of investigation of interconnections between the variables involved, and the limitation of fuzzy rules that can disturb the estimation by not covering all operating regions (coverage of input data space) [190].
4.1.2 Hybrid Deep Fuzzy Systems
The Hybrid DFSs are discussed in this section. The selection and order of works follow the same principles used for Standard DFS. The common DL architectures used in combination with fuzzy systems are the Deep Belief Networks, Autoencoders, Long Short-Term Memory networks, and Echo State Networks.
In [191], it is proposed a sparse Deep Belief Network (SDBN) with FNN for nonlinear system modeling (benchmark) and total phosphorus prediction in wastewater treatment plant. The SDBN is considered for unsupervised learning and pre-training to perform fast weight-initialization and improve modeling robustness. The FNN is used as supervised learning to reduce layer-by-layer complexity. As shown in Fig. 11, the structure of the DBN resembles the structure shown in Fig. 3, except for considering additional constraints (sparsity) used to penalize fluctuations of values along the hidden neurons. The proposed method performed better compared to other similar methods, such as transfer learning-based growing DBN [192], DBN-based echo-state network [164], and self-organizing cascade neural network [193]. However, the authors noticed various fluctuations in the assignment of hyperparameters, which can compromise the stability of the proposed model, making it necessary to dynamically and robustly improve its structure to these fluctuations. In terms of interpretability, the proposed model guarantees a moderate number of membership functions and rules, allowing good consistency and readability of what happens within the FNN structure. The same cannot be said for the DBN structure, which is not intuitive in sparse representation to decide which features are more valuable than others.
The authors in [194] proposed a novel robust deep neural network (RDNN) for regression problems involving nonlinear systems, with the implementation of three strategies: a fuzzy denoising autoencoder (FDA) as a base-building unit for RDNN, improving the ability to represent uncertainties; a compact parameter strategy (CPS), designed to reconstruct the parameters of the FDA, reducing unnecessary learning parameters; and an adaptive backpropagation (ABP) algorithm to update RDNN parameters with fast convergence. As shown in Fig. 11, FDA has an input layer, a fuzzy layer (built by Gaussian membership functions), and an output layer, which can be divided into an encoder (“input” to “fuzzy”) and a decoder (“fuzzy” to “output”). Furthermore, at each FDA, due to the nature of this autoencoder, the data is partially corrupted by noises (represented with a “tilde” symbol) and is reconstructed using CPS (represented with a “hat” symbol) with associated parameters (e.g., encoder/decoder weights and biases) are adjusted via ABP. The resulting model was evaluated through four prediction examples: air quality from the UCI repository, wind speed from National Renewable Energy Laboratory, housing price from the KEEL repository, and water quality from a wastewater treatment plant in Bei**g, China. Regarding the interpretability of the proposed model, there was no appropriate discussion by the authors, as they focused mainly on model performance/accuracy in the presence of uncertainties. In addition to a fuzzy rule base not being defined, the number of fuzzy neurons in each FDA is manually initialized and adapts through redundancies during the reconstruction of parameters via CPS. Finally, the architecture of the proposed model is not intuitive based on the mechanisms related to the representation ability of each FDA, which would be crucial to determine the influence of input variables and learning parameters on the outputs.
Some hybrid methods using FLS and conventional deep models have been applied in the literature for traffic flow prediction. In [195], the authors developed an algorithm based on Dolphin Echolocation optimization [196], where the input features are fuzzified into membership functions to obtain chronological data, whose integration goes into the weight update process of the proposed algorithm converging to a globally optimal solution with a Deep Belief Network. The method in [195] was evaluated using datasets of traffic-major roads in Great Britain and PEMS-SF (San Francisco bay area freeways). The study in [197] combined fuzzy information granulation and a deep neural network to represent the temporal-spatial correlation of mass traffic data and be able to adapt to noisy data. A Stacked Autoencoder is used to obtain the prediction results based on processed granules that have a good capacity for interpretation, which have not been discussed by the authors. The method in [197] was evaluated using traffic flow data archived for the Portland–Vancouver Metropolitan region.
Other methods were implemented for energy forecasting, with the usual application of LSTM networks. A novel fuzzy seasonal LSTM was proposed in [198], where a fuzzy seasonality index [199] and a decomposition method were employed to solve the seasonal time-series problem in a monthly wind power output dataset from the National Development Council in Taiwan. In [200], the authors used an LSTM network with rough set theory [201] and interval type-2 fuzzy sets for short-term wind speed forecasting (dataset from Bandar-Abbas City, Iran), with the aid of mutual information approach for efficient variable input selection. A novel ultra-short-term photovoltaic power forecasting method was proposed in [202], where a T-S fuzzy model comprises a fuzzy c-means clustering algorithm and DBNs, with evaluation tests using a 433 kW photovoltaic matrix database. The studies in [198, 200] and [202] did not discuss the interpretability of their models, which have a structure with an ensemble characteristic whose fuzzy part has an affinity for good coverage of the input space and good capture of data uncertainty.
The work in [203] proposed a deep type-2 FLS (D2FLS) architecture with greedy layer-wise training for high-dimensional input data. The D2FLS model was applied to two regression datasets (prediction of the performance of British Telecom’s work area and health insurance premium) and two binary classification datasets (Santander Customer Transaction prediction and British Telecom’s customer service). The authors showed how to extract interpretable explanations related to the contribution of fuzzy rules to the final prediction developed for a two-layer D2FLS. However, the authors opted for a large number of fuzzy rules (100, in this case) that impair interpretability by increasing the complexity of the model. The authors in [204] proposed an ensemble model composed of an Echo State Network (ESN), a T-S fuzzy model, and differential evolution for time-series forecasting problems: Mackey-Glass time series, nonlinear auto-regressive moving average (NARMA) time series, and Lorenz attractor. The differential evolution method, used to optimize the weight coefficients of the model, managed to reduce the number of fuzzy rules. The interpretability of the model can be impaired due to the structure with an ensemble characteristic, which does not provide enough transparency for learning. A method based on the T-S fuzzy model and ESN was developed in [205] and tested with three benchmark examples: approximation of a nonlinear function, prediction of Henon chaotic system, and identification of a dynamic system with and without noise signal. The authors chose to balance model complexity and performance based on the parameters involved for learning (e.g., number of fuzzy rules and reservoir size), which resulted in better results in comparison with other methods (e.g., traditional ESN and hybrid fuzzy ESN), despite the lack of discussion on interpretability.
In [206], a framework based on a sparse autoencoder (SpAE) and a high-order fuzzy cognitive map (HFCM) is proposed for time-series forecasting problems (e.g., sunspots, Mackey-Glass, S&P 500 stock index and Dow-Jones industrial index). SpAE is used to extract features from the original data, and these features are via HFCM. Another study that uses SpAE with FLS was proposed in [207] and implemented in Mackey-Glass time series and Iris dataset (classification). The proposed method uses a method to reduce fuzzy rules by reducing the data dimensionality with SpAE. Both studies in [206] and [207] do not consider addressing the interpretability of the proposed models, which present good directions in data partitioning and construction of fuzzy rules but are not intuitive in feature extraction with sparse representation.
Table 2 summarizes the works presented in this section, in addition to evaluating them within the context of explainable artificial intelligence (XAI) systems.
Only two works discuss the interpretability in Hybrid DFS, half of the discussed works are opaque, and only one presents post hoc explanations. This fact occurs since the combination with DNN brings a new layer of black-box to the system. However, the Hybrid DFS methods categorized as transparent show that the fuzzy component can promote good interpretability with efficient input–output map**s and the construction of rules to cover the universe of discourse for a given system. Inherently interpretable methodologies are difficult to achieve only with an ensemble of multiple methods, as seen in recent studies, in addition to the challenges of reducing the time complexity of systems, which is slightly mitigated by the reduction of fuzzy rules [208].
5 Conclusion
This study surveyed the literature on deep fuzzy systems (DFSs) for regression applications with an emphasis on interpretability. For the survey, the DFSs were categorized as (i) Standard DFS and (ii) Hybrid DFS. Regarding the interpretability, each method was categorized as to whether it is transparent or not and whether it has post hoc explanations (under the definition of [22]). Standard DFSs have been shown to promote more interpretability of their models when compared to Hybrid DFSs, according to the survey. Indeed, Standard DFSs are based on fundamental fuzzy logic systems, which are inherently interpretable, whereas Hybrid DFSs include conventional deep learning (DL) methods, which lack flexibility in promoting interpretability. In terms of applications, DFS can be flexible in its implementation either by simulation or in real-time systems, whose most recurrent applications involve time-series forecastings, such as traffic flow and energy modeling (e.g., photovoltaic and wind). Furthermore, the DFS is frequently referred to as interpretable by default, but only 5 of the 23 works surveyed here actually addressed this issue. The remaining works had a common goal: to improve prediction accuracy using their proposed methods. However, this survey presented the potential of using Standard DFS as a base for develo** accurate models while promoting interpretability since hybrid models are not straightforward.
References
Busu, M., Trica, C.L.: Sustainability of circular economy indicators and their impact on economic growth of the European Union. Sustainability (2019). https://doi.org/10.3390/su11195481
Botev, J., Égert, B., Jawadi, F.: The nonlinear relationship between economic growth and financial development: evidence from develo**, emerging and advanced economies. Int. Econ. 160, 3–13 (2019). https://doi.org/10.1016/j.inteco.2019.06.004
Liu, B., Zhao, Q., **, Y., Shen, J., Li, C.: Application of combined model of stepwise regression analysis and artificial neural network in data calibration of miniature air quality detector. Sci. Rep. 11(1), 1–12 (2021). https://doi.org/10.1038/s41598-021-82871-4
Gu, K., Qiao, J., Lin, W.: Recurrent air quality predictor based on meteorology- and pollution-related factors. IEEE Trans. Ind. Informatics 14(9), 3946–3955 (2018). https://doi.org/10.1109/TII.2018.2793950
Zhu, J., et al.: Prevalence and influencing factors of anxiety and depression symptoms in the first-line medical staff fighting against COVID-19 in Gansu. Front. Psychiatr. 11, 386 (2020). https://doi.org/10.3389/fpsyt.2020.00386
Shi, B., et al.: Nonlinear heart rate variability biomarkers for gastric cancer severity: a pilot study. Sci. Rep. 9(1), 1–9 (2019). https://doi.org/10.1038/s41598-019-50358-y
Orlandi, M., Escudero-Casao, M., Licini, G.: Nucleophilicity prediction via multivariate linear regression analysis. J Org. Chem. 86(4), 3555–3564 (2021). https://doi.org/10.1021/acs.joc.0c02952
Yang, Y., et al.: A support vector regression model to predict nitrate-nitrogen isotopic composition using hydro-chemical variables. J. Environ. Manag. 290, 112674 (2021). https://doi.org/10.1016/j.jenvman.2021.112674
Souza, F., Mendes, J., Araújo, R.: A regularized mixture of linear experts for quality prediction in multimode and multiphase industrial processes. Appl. Sci. (2021). https://doi.org/10.3390/app11052040
Liu, H., Yang, C., Carlsson, B., Qin, S.J., Yoo, C.: Dynamic nonlinear partial least squares modeling using gaussian process regression. Ind. Eng. Chem. Res. 58(36), 16676–16686 (2019). https://doi.org/10.1021/acs.iecr.9b00701
Sun, Q., Ge, Z.: A survey on deep learning for data-driven soft sensors. IEEE Trans. Ind. Informatics 17(9), 5853–5866 (2021). https://doi.org/10.1109/TII.2021.3053128
Angelov, P., Soares, E.: Towards explainable deep neural networks (xDNN). Neural Netw. 130, 185–194 (2020). https://doi.org/10.1016/j.neunet.2020.07.010
Sjöberg, J., et al.: Nonlinear black-box modeling in system identification: a unified overview. Automatica 31(12), 1691–1724 (1995). https://doi.org/10.1016/0005-1098(95)00120-8
Han, Z., Zhao, J., Leung, H., Ma, K.F., Wang, W.: A review of deep learning models for time series prediction. IEEE Sens. J. 21(6), 7833–7848 (2021). https://doi.org/10.1109/JSEN.2019.2923982
Torres, J.F., Hadjout, D., Sebaa, A., Martínez-Álvarez, F., Troncoso, A.: Deep learning for time series forecasting: a survey. Big Data 9(1), 3–21 (2021). https://doi.org/10.1089/big.2020.0159
Pang, G., Shen, C., Cao, L., Hengel, A.V.D.: Deep learning for anomaly detection: a review. ACM Comput. Surv. (2021). https://doi.org/10.1145/3439950
Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: A survey. ar**v preprint ar**v:1901.03407 (2019). https://doi.org/10.48550/ar**v.1901.03407
Dong, S., Wang, P., Abbas, K.: A survey on deep learning and its applications. Comput. Sci. Rev. 40, 100379 (2021). https://doi.org/10.1016/j.cosrev.2021.100379
Sengupta, S., et al.: A review of deep learning with special emphasis on architectures, applications and recent trends. Knowl.-Based Syst. 194, 105596 (2020). https://doi.org/10.1016/j.knosys.2020.105596
Pouyanfar, S., et al.: A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. (2018). https://doi.org/10.1145/3234150
Navamani, T.: Chapter 7-Efficient deep learning approaches for health informatics. In: Sangaiah, A.K. (ed.) Deep learning and parallel computing environment for bioengineering systems, pp. 123–137. Academic Press, Cambridge (2019). https://doi.org/10.1016/B978-0-12-816718-2.00014-2
Angelov, P.P., Soares, E.A., Jiang, R., Arnold, N.I., Atkinson, P.M.: Explainable artificial intelligence: an analytical review. WIREs Data Mining Knowl. Discov. (2021). https://doi.org/10.1002/widm.1424
Confalonieri, R., Coba, L., Wagner, B., Besold, T.R.: A historical perspective of explainable artificial intelligence. Wiley Interdiscip. Rev.: Data Mining Knowl. Discov. (2021). https://doi.org/10.1002/widm.1391
Phillips, P. J. et al.: Four Principles of Explainable Artificial Intelligence (National Institute of Standards and Technology, 2021). https://doi.org/10.6028/nist.ir.8312
Chimatapu, R., Hagras, H., Starkey, A., Owusu, G.: Explainable AI and Fuzzy logic systems. In: Fagan, D., Martín-Vide, C., O'Neill, M., Vega-Rodríguez, M.A. (eds) Theory and Practice of Natural Computing, pp. 3–20 (2018). https://doi.org/10.1007/978-3-030-04070-3_1
Łapa, K., Cpałka, K., Rutkowski, L.: New aspects of interpretability of fuzzy systems for nonlinear modeling. In: Gawęda, A., Kacprzyk, J., Rutkowski, L., Yen, G. (eds) Advances in Data Analysis with Computational Intelligence Methods. Studies in Computational Intelligence, vol. 738. Springer, Cham. (2017). https://doi.org/10.1007/978-3-319-67946-4_9
Mendes, J., Maia, R., Araújo, R., Souza, F.A.A.: Self-evolving fuzzy controller composed of univariate fuzzy control rules. Appl. Sci. 10(17), 5836 (2020). https://doi.org/10.3390/app10175836
Moral, J.M.A., Castiello, C., Magdalena, L., Mencar, C.: Explainable fuzzy systems. Springer International Publishing, New York (2021). https://doi.org/10.1007/978-3-030-71098-9
Das, R., Sen, S., Maulik, U.: A survey on fuzzy deep neural networks. ACM Comput. Surv. (2020). https://doi.org/10.1145/3369798
Zheng, Y., Xu, Z., Wang, X.: The fusion of deep learning and fuzzy systems: a state-of-the-art survey. IEEE Trans. Fuzzy Syst. (2021). https://doi.org/10.1109/TFUZZ.2021.3062899
Zadeh, L.A.: Fuzzy sets. Information Control 8(3), 338–353 (1965). https://doi.org/10.1016/s0019-9958(65)90241-x
Wang, L.-X.: A course in fuzzy systems and control. Prentice-Hall Inc., Hoboken (1997)
Mendes, J., Araújo, R., Sousa, P., Apóstolo, F., Alves, L.: An architecture for adaptive fuzzy control in industrial environments. Comput. Ind. 62(3), 364–373 (2011). https://doi.org/10.1016/j.compind.2010.11.001
Mamdani, E., Assilian, S.: An experiment in linguistic synthesis with a fuzzy logic controller. Int. J. Man-Mach. Stud. 7(1), 1–13 (1975). https://doi.org/10.1016/S0020-7373(75)80002-2
Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. SMC–15(1), 116–132 (1985). https://doi.org/10.1109/TSMC.1985.6313399
Angelov, P., Yager, R.: Simplified fuzzy rule-based systems using non-parametric antecedents and relative data density. In: IEEE Workshop on Evolving and Adaptive Intelligent Systems (EAIS), pp. 62–69. Paris, France (2011). https://doi.org/10.1109/EAIS.2011.5945926
Qiu, J., Gao, H., Ding, S.X.: Recent advances on fuzzy-model-based nonlinear networked control systems: a survey. IEEE Trans. Ind. Electron. 63(2), 1207–1217 (2016). https://doi.org/10.1109/TIE.2015.2504351
Ying, H.: General MISO Takagi-Sugeno fuzzy systems with simplified linear rule consequent as universal approximators for control and modeling applications. In: IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, Vol. 2, pp. 1335–1340 (1997). https://doi.org/10.1109/ICSMC.1997.638158
Júnior, J. S. S., Mendes, J., Araújo, R., Paulo, J. R., Premebida, C.: Novelty detection for iterative learning of MIMO fuzzy systems. In: 2021 IEEE 19th International Conference on Industrial Informatics (INDIN), pp. 1–7 (2021). https://doi.org/10.1109/INDIN45523.2021.9557354
Hall, P., Gill, N.: An introduction to machine learning interpretability. O’Reilly Media, Incorporated (2019)
Shortliffe, E.H., Axline, S.G., Buchanan, B.G., Merigan, T.C., Cohen, S.N.: An artificial intelligence program to advise physicians regarding antimicrobial therapy. Comput. Biomed. Res. 6(6), 544–560 (1973). https://doi.org/10.1016/0010-4809(73)90029-3
Clancey, W.J.: Tutoring rules for guiding a case method dialogue. Int. J. Man-Mach. Stud. 11(1), 25–49 (1979). https://doi.org/10.1016/S0020-7373(79)80004-8
Weiss, S.M., Kulikowski, C.A., Amarel, S., Safir, A.: A model-based method for computer-aided medical decision-making. Artif. Intell. 11(1), 145–172 (1978). https://doi.org/10.1016/0004-3702(78)90015-2
Suwa, M., Scott, A.C., Shortliffe, E.H.: An approach to verifying completeness and consistency in a rule-based expert system. Ai Mag. 3(4), 16–16 (1982). https://doi.org/10.1609/aimag.v3i4.377
Swartout, W.R.: XPLAIN: a system for creating and explaining expert consulting programs. Artif. Intell. 21(3), 285–325 (1983). https://doi.org/10.1016/S0004-3702(83)80014-9
Swartout, W. R.: Explaining and justifying expert consulting programs. In: Reggia, J.A., Tuhrim, S. (eds) Computer-assisted medical decision making, pp. 254–271 (1985). https://doi.org/10.1007/978-1-4612-5108-8_15
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019). https://doi.org/10.1038/s42256-019-0048-x
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013). https://doi.org/10.1109/TPAMI.2013.50
Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep learning. MIT press, Cambridge (2016)
Fukushima, K.: Neocognitron: a self organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980). https://doi.org/10.1007/bf00344251
Hubel, D.H., Wiesel, T.N.: Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968). https://doi.org/10.1113/jphysiol.1968.sp008455
Gu, J., et al.: Recent advances in convolutional neural networks. Pattern Recognit. 77, 354–377 (2018). https://doi.org/10.1016/j.patcog.2017.10.013
Li, Z., Liu, F., Yang, W., Peng, S., Zhou, J.: A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. (2021). https://doi.org/10.1109/TNNLS.2021.3084827
Ketkar, N.: Convolutional neural networks. In: Deep Learning with Python: A Hands-on Introduction 63–78 (2017). https://doi.org/10.1007/978-1-4842-2766-4_5
Mallat, S.: Understanding deep convolutional networks. Philos. Trans. Royal Soc.: Mathemat. Phys. Eng. Sci. 374(2065), 20150203 (2016). https://doi.org/10.1098/rsta.2015.0203
Wu, S.: Spatiotemporal dynamic forecasting and analysis of regional traffic flow in urban road networks using deep learning convolutional neural network. IEEE Trans. Intell. Trans. Syst. 23(2), 1607–1615 (2022). https://doi.org/10.1109/TITS.2021.3098461
Zhang, Y., Zhou, Y., Lu, H., Fujita, H.: Traffic network flow prediction using parallel training for deep convolutional neural networks on spark cloud. IEEE Trans. Ind. Informatics 16(12), 7369–7380 (2020). https://doi.org/10.1109/TII.2020.2976053
Mukhtar, M., et al.: Development and comparison of two novel hybrid neural network models for hourly solar radiation prediction. Appl. Sci. (2022). https://doi.org/10.3390/app12031435
Heo, J., Song, K., Han, S., Lee, D.-E.: Multi-channel convolutional neural network for integration of meteorological and geographical features in solar power forecasting. Appl. Energy 295, 117083 (2021). https://doi.org/10.1016/j.apenergy.2021.117083
Liu, T., et al.: Enhancing wind turbine power forecast via convolutional neural network. Electronics (2021). https://doi.org/10.3390/electronics10030261
Wan, R., Mei, S., Wang, J., Liu, M., Yang, F.: Multivariate temporal convolutional network: a deep neural networks approach for multivariate time series forecasting. Electronics (2019). https://doi.org/10.3390/electronics8080876
Gao, Z., et al.: Multitask-based temporal-channelwise CNN for parameter prediction of two-phase flows. IEEE Trans. Ind. Informatics 17(9), 6329–6336 (2021). https://doi.org/10.1109/TII.2020.2978944
Fan, W., Zhang, Z.: A CNN-SVR hybrid prediction model for wastewater index measurement. In: 2020 2nd International Conference on Advances in Computer Technology, Information Science and Communications (CTISC), pp. 90–94 (2020). https://doi.org/10.1109/CTISC49998.2020.00022
Yuan, X., et al.: Soft sensor model for dynamic processes based on multichannel convolutional neural network. Chemometr. Intell. Lab. Syst. 203, 104050 (2020). https://doi.org/10.1016/j.chemolab.2020.104050
Jalali, S.M.J., et al.: A novel evolutionary-based deep convolutional neural network model for intelligent load forecasting. IEEE Trans. Ind. Informatics 17(12), 8243–8253 (2021). https://doi.org/10.1109/TII.2021.3065718
Eskandari, H., Imani, M., Moghaddam, M.P.: Convolutional and recurrent neural network based model for short-term load forecasting. Electr. Power Syst. Res. 195, 107173 (2021). https://doi.org/10.1016/j.epsr.2021.107173
Zahid, M., et al.: Electricity price and load forecasting using enhanced convolutional neural network and enhanced support vector regression in smart grids. Electronics (2019). https://doi.org/10.3390/electronics8020122
Koprinska, I., Wu, D., Wang, Z.: Convolutional neural networks for energy time series forecasting. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2018). https://doi.org/10.1109/IJCNN.2018.8489399
Tian, C., Ma, J., Zhang, C., Zhan, P.: A deep neural network model for short-term load forecast based on long short-term memory network and convolutional neural network. Energies (2018). https://doi.org/10.3390/en11123493
Gao, P., Zhang, J., Sun, Y., Yu, J.: Accurate predictions of aqueous solubility of drug molecules via the multilevel graph convolutional network (MGCN) and SchNet architectures. Phys. Chem. Chem. Phys. 22(41), 23766–23772 (2020). https://doi.org/10.1039/D0CP03596C
Wu, K., Wei, G.-W.: Comparison of multi-task convolutional neural network (MT-CNN) and a few other methods for toxicity prediction. ar**v Preprint (2017). https://doi.org/10.48550/arxiv.1703.10951
Witten, I. H., Frank, E., Hall, M. A., Pal, C. J.: Chapter 10 - Deep learning. In: Witten, I. H., Frank, E., Hall, M. A., Pal, C. J. (eds) Data Mining (Fourth Edition), pp. 417–466. Morgan Kaufmann (2017). https://doi.org/10.1016/B978-0-12-804291-5.00010-6
Hinton, G.E., Osindero, S., Teh, Y.-W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006). https://doi.org/10.1162/neco.2006.18.7.1527
Wang, Y., et al.: An ensemble deep belief network model based on random subspace for NOx concentration prediction. ACS Omega 6(11), 7655–7668 (2021). https://doi.org/10.1021/acsomega.0c06317
Hao, X., et al.: Prediction of nitrogen oxide emission concentration in cement production process: a method of deep belief network with clustering and time series. Environ. Sci. Pollut Res. 28(24), 31689–31703 (2021). https://doi.org/10.1007/s11356-021-12834-9
Yuan, X., Gu, Y., Wang, Y.: Supervised deep belief network for quality prediction in industrial processes. IEEE Trans. Instrum. Meas. 70, 1–11 (2021). https://doi.org/10.1109/TIM.2020.3035464
Yuan, X., et al.: FeO content prediction for an industrial sintering process based on supervised deep belief network. IFAC-PapersOnLine 53(2), 11883–11888 (2020). https://doi.org/10.1016/j.ifacol.2020.12.703
Hao, X., et al.: Energy consumption prediction in cement calcination process: a method of deep belief network with sliding window. Energy 207, 118256 (2020). https://doi.org/10.1016/j.energy.2020.118256
Zhu, S.-B., Li, Z.-L., Zhang, S.-M., Zhang, H.-F.: Deep belief network-based internal valve leakage rate prediction approach. Measurement 133, 182–192 (2019). https://doi.org/10.1016/j.measurement.2018.10.020
Wang, G., Jia, Q.-S., Zhou, M., Bi, J., Qiao, J.: Soft-sensing of Wastewater Treatment Process via Deep Belief Network with Event-triggered Learning. Neurocomputing 436, 103–113 (2021). https://doi.org/10.1016/j.neucom.2020.12.108
Lian, P., Liu, H., Wang, X., Guo, R.: Soft sensor based on DBN-IPSO-SVR approach for rotor thermal deformation prediction of rotary air-preheater. Measurement 165, 108109 (2020). https://doi.org/10.1016/j.measurement.2020.108109
Tian, W., Liu, Z., Li, L., Zhang, S., Li, C.: Identification of abnormal conditions in high-dimensional chemical process based on feature selection and deep learning. Chin. J. Chem. Eng. 28(7), 1875–1883 (2020). https://doi.org/10.1016/j.cjche.2020.05.003
Shao, H., Jiang, H., Li, X., Liang, T.: Rolling bearing fault detection using continuous deep belief network with locally linear embedding. Comput. Ind. 96, 27–39 (2018). https://doi.org/10.1016/j.compind.2018.01.005
Xu, H., Jiang, C.: Deep belief network-based support vector regression method for traffic flow forecasting. Neural Comput. Appl. 32(7), 2027–2036 (2019). https://doi.org/10.1007/s00521-019-04339-x
Jia, Y., Wu, J., Du, Y.: Traffic speed prediction using deep learning method. In: 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), pp. 1217–1222 (2016). https://doi.org/10.1109/ITSC.2016.7795712
Tian, J., Liu, Y., Zheng, W., Yin, L.: Smog prediction based on the deep belief - BP neural network model (DBN-BP). Urban Climate 41, 101078 (2022). https://doi.org/10.1016/j.uclim.2021.101078
**e, T., Zhang, G., Liu, H., Liu, F., Du, P.: A hybrid forecasting method for solar output power based on variational mode decomposition, deep belief networks and auto-regressive moving average. Appl. Sci. (2018). https://doi.org/10.3390/app8101901
Li, X., Yang, L., Xue, F., Zhou, H.: Time series prediction of stock price using deep belief networks with intrinsic plasticity. In: 2017 29th Chinese Control And Decision Conference (CCDC), pp. 1237–1242 (2017). https://doi.org/10.1109/CCDC.2017.7978707
Qiao, J., Wang, G., Li, W., Li, X.: A deep belief network with PLSR for nonlinear system modeling. Neural Netw. 104, 68–79 (2018). https://doi.org/10.1016/j.neunet.2017.10.006
Qiu, X., Zhang, L., Ren, Y., Suganthan, P. N., Amaratunga, G.: Ensemble deep learning for regression and time series forecasting. In: 2014 IEEE Symposium on Computational Intelligence in Ensemble Learning (CIEL), pp. 1–6 (2014). https://doi.org/10.1109/CIEL.2014.7015739
Salakhutdinov, R., Larochelle, H. (ed.): Efficient learning of deep Boltzmann machines. (ed.) In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 693–700 (2010)
Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015). https://doi.org/10.1016/j.neunet.2014.09.003
Ballard, D. H. (ed.): Modular learning in neural networks. (ed.) In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 1, Vol. 647, pp. 279–284 (1987)
Rumelhart, D. E., Hinton, G. E., Williams, R. J.: Learning Internal Representations by Error Propagation. Technical report, California Univ., San Diego, La Jolla. Inst. for Cognitive Science (1985). https://doi.org/10.21236/ada164453
Baldi, P. (ed.): Autoencoders, Unsupervised Learning, and Deep Architectures. (ed.) In: Proceedings of ICML Workshop on Unsupervised and Transfer Learning, Vol. 27, pp. 37–49 (2012)
Liu, G., Bao, H., Han, B.: A stacked autoencoder-based deep neural network for achieving gearbox fault diagnosis. Mathemati. Problems Engi. 2018, 1–10 (2018). https://doi.org/10.1155/2018/5105709
Sun, Q., Ge, Z.: Deep learning for industrial KPI prediction: when ensemble learning meets semi-supervised data. IEEE Trans. Ind. Informatics 17(1), 260–269 (2021). https://doi.org/10.1109/TII.2020.2969709
Sun, Q., Ge, Z.: Gated stacked target-related autoencoder: a novel deep feature extraction and layerwise ensemble method for industrial soft sensor application. IEEE Trans. Cybern. (2020). https://doi.org/10.1109/TCYB.2020.3010331
Liu, C., Wang, Y., Wang, K., Yuan, X.: Deep learning with nonlocal and local structure preserving stacked autoencoder for soft sensor in industrial processes. Eng. Appl. Artif. Intell. 104, 104341 (2021). https://doi.org/10.1016/j.engappai.2021.104341
Yuan, X., Ou, C., Wang, Y., Yang, C., Gui, W.: A novel semi-supervised pre-training strategy for deep networks and its application for quality variable prediction in industrial processes. Chem. Eng. Sci. 217, 115509 (2020). https://doi.org/10.1016/j.ces.2020.115509
Wang, Y., Liu, C., Yuan, X.: Stacked locality preserving autoencoder for feature extraction and its application for industrial process data modeling. Chemometr. Intell. Lab. Sys. 203, 104086 (2020). https://doi.org/10.1016/j.chemolab.2020.104086
Shi, C., et al.: Using multiple-feature-spaces-based deep learning for tool condition monitoring in ultraprecision manufacturing. IEEE Trans. Ind0 Electron. 66(5), 3794–3803 (2019). https://doi.org/10.1109/TIE.2018.2856193
Bose, T., Majumdar, A., Chattopadhyay, T.: Machine load estimation via stacked autoencoder regression. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2126–2130 (2018). https://doi.org/10.1109/ICASSP.2018.8461576
Zhang, X., Zou, Y., Li, S.: Enhancing incremental deep learning for FCCU end-point quality prediction. Information Sci. 530, 95–107 (2020). https://doi.org/10.1016/j.ins.2020.04.013
Liu, C., Tang, L., Liu, J.: A stacked autoencoder with sparse Bayesian regression for end-point prediction problems in steelmaking process. IEEE Trans. Autom. Sci. Eng. 17(2), 550–561 (2020). https://doi.org/10.1109/TASE.2019.2935314
Wang, X., Liu, H.: Soft sensor based on stacked auto-encoder deep neural network for air preheater rotor deformation prediction. Adv. Eng. Informatics 36, 112–119 (2018). https://doi.org/10.1016/j.aei.2018.03.003
Wei, M., Ye, M., Wang, Q., Twajamahoro, J.P.: Remaining useful life prediction of lithium-ion batteries based on stacked autoencoder and gaussian mixture regression. J. Energy Storage 47, 103558 (2022). https://doi.org/10.1016/j.est.2021.103558
Li, Z., Li, J., Wang, Y., Wang, K.: A deep learning approach for anomaly detection based on SAE and LSTM in mechanical equipment. Int. J. Adv. Manuf. Technol. 103(1–4), 499–510 (2019). https://doi.org/10.1007/s00170-019-03557-w
Ren, L., Sun, Y., Cui, J., Zhang, L.: Bearing remaining useful life prediction based on deep autoencoder and deep neural networks. J. Manuf. Syst. 48, 71–77 (2018). https://doi.org/10.1016/j.jmsy.2018.04.008
**, X.-B., Gong, W.-T., Kong, J.-L., Bai, Y.-T., Su, T.-L.: PFVAE: a planar flow-based variational auto-encoder prediction model for time series data. Mathematics (2022). https://doi.org/10.3390/math10040610
**ao, X., et al.: SSAE-MLP: stacked sparse autoencoders-based multi-layer perceptron for main bearing temperature prediction of large-scale wind turbines. Concurr. Comput.: Practice Exp. (2021). https://doi.org/10.1002/cpe.6315
Jiao, R., Huang, X., Ma, X., Han, L., Tian, W.: A model combining stacked auto encoder and back propagation algorithm for short-term wind power forecasting. IEEE Access 6, 17851–17858 (2018). https://doi.org/10.1109/ACCESS.2018.2818108
Li, M., **e, X., Zhang, D.: Improved deep learning model based on self-paced learning for multiscale short-term electricity load forecasting. Sustainability (2022). https://doi.org/10.3390/su14010188
Lv, S.-X., Peng, L., Wang, L.: Stacked autoencoder with echo-state regression for tourism demand forecasting using search query data. Appl. Soft Comput. 73, 119–133 (2018). https://doi.org/10.1016/j.asoc.2018.08.024
Grossberg, S.: Recurrent neural networks. Scholarpedia 8(2), 1888 (2013). https://doi.org/10.4249/scholarpedia.1888
Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sci. 79(8), 2554–2558 (1982). https://doi.org/10.1073/pnas.79.8.2554
Hopfield, J.J.: Hopfield network. Scholarpedia 2(5), 1977 (2007). https://doi.org/10.4249/scholarpedia.1977
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986). https://doi.org/10.1038/323533a0
Zhang, S., Bamakan, S.M.H., Qu, Q., Li, S.: Learning for personalized medicine: a comprehensive review from a deep learning perspective. IEEE Rev. Biomed. Eng. 12, 194–208 (2019). https://doi.org/10.1109/RBME.2018.2864254
Albertini, F., Pra, P. D.: Recurrent neural networks: identification and other system theoretic properties. In: Neural Network Systems Techniques and Applications, Vol. 3 pp. 1–49 (1998). https://doi.org/10.1016/s1874-5946(98)80003-5
Pascanu, R., Gulcehre, C., Cho, K., Bengio, Y.: How to Construct Deep Recurrent Neural Networks. ar**v preprint ar**v:1312.6026 (2013). https://doi.org/10.48550/ARXIV.1312.6026
Theodoridis, S.: Chapter 18 - Neural networks and deep learning. In: Machine learning (Second Edition) pp. 901–1038 (2020). https://doi.org/10.1016/B978-0-12-818803-3.00030-1
Werbos, P.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990). https://doi.org/10.1109/5.58337
Lalapura, V.S., Amudha, J., Satheesh, H.S.: Recurrent neural networks for edge intelligence: a survey. ACM Comput. Surv. 54(4), 1–38 (2021). https://doi.org/10.1145/3448974
Hochreiter, S., Schmidhuber, J.: Long Short-Term Memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735
Lu, S., Zhang, Q., Chen, G., Seng, D.: A combined method for short-term traffic flow prediction based on recurrent neural network. Alexandria Eng. J. 60(1), 87–94 (2021). https://doi.org/10.1016/j.aej.2020.06.008
Yang, B., Sun, S., Li, J., Lin, X., Tian, Y.: Traffic flow prediction using LSTM with feature enhancement. Neurocomputing 332, 320–327 (2019). https://doi.org/10.1016/j.neucom.2018.12.016
Roy, D.K., et al.: Daily prediction and multi-step forward forecasting of reference evapotranspiration using LSTM and Bi-LSTM models. Agronomy (2022). https://doi.org/10.3390/agronomy12030594
Kumari, P., Toshniwal, D.: Long short term memory-convolutional neural network based deep hybrid approach for solar irradiance forecasting. Appl. Energy 295, 117061 (2021). https://doi.org/10.1016/j.apenergy.2021.117061
Altan, A., Karasu, S., Zio, E.: A new hybrid model for wind speed forecasting combining long short-term memory neural network, decomposition methods and grey wolf optimizer. Appl. Soft Comput. 100, 106996 (2021). https://doi.org/10.1016/j.asoc.2020.106996
Ma, J., et al.: Air quality prediction at new stations using spatially transferred bi-directional long short-term memory network. Sci. Total Environ. 705, 135771 (2020). https://doi.org/10.1016/j.scitotenv.2019.135771
Shen, Z., Zhang, Y., Lu, J., Xu, J., **ao, G.: A novel time series forecasting model with deep learning. Neurocomputing 396, 302–313 (2020). https://doi.org/10.1016/j.neucom.2018.12.084
Zhang, Q., Wang, H., Dong, J., Zhong, G., Sun, X.: Prediction of sea surface temperature using long short-term memory. IEEE Geosci. Remote Sens. Lett. 14(10), 1745–1749 (2017). https://doi.org/10.1109/LGRS.2017.2733548
Zhu, X., Hao, K., **e, R., Huang, B.: Soft sensor based on extreme gradient boosting and bidirectional converted gates long short-term memory self-attention network. Neurocomputing 434, 126–136 (2021). https://doi.org/10.1016/j.neucom.2020.12.028
Yuan, X., Li, L., Shardt, Y.A.W., Wang, Y., Yang, C.: Deep learning with spatiotemporal attention-based lstm for industrial soft sensor model development. IEEE Trans. Ind. Electron. 68(5), 4404–4414 (2021). https://doi.org/10.1109/TIE.2020.2984443
Yuan, X., Li, L., Wang, Y.: Nonlinear dynamic soft sensor modeling with supervised long short-term memory network. IEEE Trans. Ind. Informatics 16(5), 3168–3176 (2020). https://doi.org/10.1109/TII.2019.2902129
Salles, R., Mendes, J., Araújo, R., Melo, C., Moura, P.: Prediction of key variables in wastewater treatment plants using machine learning models. In: International Joint Conference on Neural Networks (IJCNN), pp. 1–9. IEEE, Padova, Italy (2022). https://doi.org/10.1109/IJCNN55064.2022.9892661
Cai, W., Zhang, W., Hu, X., Liu, Y.: A hybrid information model based on long short-term memory network for tool condition monitoring. J. Intell. Manuf. 31(6), 1497–1510 (2020). https://doi.org/10.1007/s10845-019-01526-4
Cheng, Y., Zhu, H., Wu, J., Shao, X.: Machine health monitoring using adaptive kernel spectral clustering and deep long short-term memory recurrent neural networks. IEEE Trans. Ind. Informatics 15(2), 987–997 (2019). https://doi.org/10.1109/TII.2018.2866549
Li, S., Fang, H., Shi, B.: Multi-step-ahead prediction with long short term memory networks and support vector regression. In: 2018 37th Chinese Control Conference (CCC), pp. 8104–8109 (2018). https://doi.org/10.23919/ChiCC.2018.8484066
Cheng, Y., Wang, C., Wu, J., Zhu, H., Lee, C.: Multi-dimensional recurrent neural network for remaining useful life prediction under variable operating conditions and multiple fault modes. Appl. Soft Comput. 118, 108507 (2022). https://doi.org/10.1016/j.asoc.2022.108507
Guo, J., Lao, Z., Hou, M., Li, C., Zhang, S.: Mechanical fault time series prediction by using EFMSAE-LSTM neural network. Measurement 173, 108566 (2021). https://doi.org/10.1016/j.measurement.2020.108566
Moradzadeh, A., Zakeri, S., Shoaran, M., Mohammadi-Ivatloo, B., Mohammadi, F.: Short-term load forecasting of microgrid via hybrid support vector regression and long short-term memory algorithms. Sustainability (2020). https://doi.org/10.3390/su12177076
Narayan, A., Hipel, K. W.: Long short term memory networks for short-term electric load forecasting. In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2573–2578 (2017). https://doi.org/10.1109/SMC.2017.8123012
Gers, F.: Long short-term memory in recurrent neural networks. Ph.D. thesis, École Polytechnique Fédérale de Lausanne (2001). https://doi.org/10.5075/epfl-thesis-2366
Jaeger, H.: The “echo state” approach to analysing and training recurrent neural networks-with an erratum note. Tech. Rep. 34, Bonn, Germany: German National Research Center for Information Technology GMD Technical Report (2001)
Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14(11), 2531–2560 (2002). https://doi.org/10.1162/089976602760407955
Schrauwen, B., Verstraeten, D., Campenhout, J. V. (ed.): An overview of reservoir computing: theory, applications and implementations. (ed.) In: Proceedings of the 15th European Symposium on Artificial Neural Networks, pp. 471–482 (2007)
Lukoševičius, M., Jaeger, H.: Reservoir computing approaches to recurrent neural network training. Comput. Sci. Rev. 3(3), 127–149 (2009). https://doi.org/10.1016/j.cosrev.2009.03.005
Jaeger, H.: Echo state network. Scholarpedia 2(9), 2330 (2007). https://doi.org/10.4249/scholarpedia.2330
Tang, Y., Deng, J., Zang, C., Wu, Q.: Chaotic modeling of stream nitrate concentration and transportation via IFPA-ESN and turning point analyses. Front. Environ. Sci. (2022). https://doi.org/10.3389/fenvs.2022.855694
Na, X., Ren, W., Xu, X.: Hierarchical delay-memory echo state network: a model designed for multi-step chaotic time series prediction. Eng. Appl. Artif. Intell. 102, 104229 (2021). https://doi.org/10.1016/j.engappai.2021.104229
Yao, X., Wang, Z., Zhang, H.: Prediction and identification of discrete-time dynamic nonlinear systems based on adaptive echo state network. Neural Netw. 113, 11–19 (2019). https://doi.org/10.1016/j.neunet.2019.01.003
Xu, M., Han, M., Qiu, T., Lin, H.: Hybrid regularized echo state network for multivariate chaotic time series prediction. IEEE Trans. Cybern. 49(6), 2305–2315 (2019). https://doi.org/10.1109/TCYB.2018.2825253
McDermott, P.L., Wikle, C.K.: Deep echo state networks with uncertainty quantification for spatio-temporal forecasting. Environmetrics (2018). https://doi.org/10.1002/env.2553
Lian, L.: Wind speed prediction based on CEEMD-SE and multiple echo state network with Gauss-Markov fusion. Rev. Sci. Instrum. 93(1), 015105 (2022). https://doi.org/10.1063/5.0081086
Bai, Y., Liu, M.-D., Ding, L., Ma, Y.-J.: Double-layer staged training echo-state networks for wind speed prediction using variational mode decomposition. Appl. Energy 301, 117461 (2021). https://doi.org/10.1016/j.apenergy.2021.117461
Hu, H., Wang, L., Tao, R.: Wind speed forecasting based on variational mode decomposition and improved echo state network. Renew. Energy 164, 729–751 (2021). https://doi.org/10.1016/j.renene.2020.09.109
Tian, Z.: Approach for short-term wind power prediction via kernel principal component analysis and echo state network optimized by improved particle swarm optimization algorithm. Trans. Institute Measurement Control 43(16), 3647–3662 (2021). https://doi.org/10.1177/01423312211046421
López, E., Valle, C., Allende, H., Gil, E., Madsen, H.: Wind power forecasting based on echo state networks and long short-term memory. Energies (2018). https://doi.org/10.3390/en11030526
Zhang, H., et al.: Self-organizing deep belief modular echo state network for time series prediction. Knowl.-Based Syst. 222, 107007 (2021). https://doi.org/10.1016/j.knosys.2021.107007
Li, Q., Wu, Z., Zhang, H.: Spatio-temporal modeling with enhanced flexibility and robustness of solar irradiance prediction: a chain-structure echo state network approach. J. Clean. Prod. 261, 121151 (2020). https://doi.org/10.1016/j.jclepro.2020.121151
Xu, M., Han, M., Lin, H.: Wavelet-denoising multiple echo state networks for multivariate time series prediction. Information Sci. 465, 439–458 (2018). https://doi.org/10.1016/j.ins.2018.07.015
Sun, X., Li, T., Li, Q., Huang, Y., Li, Y.: Deep belief echo-state network and its application to time series prediction. Knowl.-Based Syst. 130, 17–29 (2017). https://doi.org/10.1016/j.knosys.2017.05.022
Li, Z., Tanaka, G.: Multi-reservoir echo state networks with sequence resampling for nonlinear time-series prediction. Neurocomputing 467, 115–129 (2022). https://doi.org/10.1016/j.neucom.2021.08.122
Wang, L., Su, Z., Qiao, J., Deng, F.: A pseudo-inverse decomposition-based self-organizing modular echo state network for time series prediction. Appl. Soft Comput. 116, 108317 (2022). https://doi.org/10.1016/j.asoc.2021.108317
Gao, R., Du, L., Duru, O., Yuen, K.F.: Time series forecasting based on echo state network and empirical wavelet transformation. Appl. Soft Comput. 102, 107111 (2021). https://doi.org/10.1016/j.asoc.2021.107111
Schwedersky, B.B., Flesch, R.C.C., Dangui, H.A.S.: Nonlinear MIMO system identification with echo-state networks. J. Control Autom. Electr. Syst. (2022). https://doi.org/10.1007/s40313-021-00874-y
He, Y.-L., Tian, Y., Xu, Y., Zhu, Q.-X.: Novel soft sensor development using echo state network integrated with singular value decomposition: application to complex chemical processes. Chemom. Intell. Lab. Sys. 200, 103981 (2020). https://doi.org/10.1016/j.chemolab.2020.103981
Huang, R., Li, Z., Cao, B.: A soft sensor approach based on an echo state network optimized by improved genetic algorithm. Sensors (2020). https://doi.org/10.3390/s20175000
Mansoor, M., Grimaccia, F., Leva, S., Mussetta, M.: Comparison of echo state network and feed-forward neural networks in electrical load forecasting for demand response programs. Math. Comput. Simul. 184, 282–293 (2021). https://doi.org/10.1016/j.matcom.2020.07.011
Yao, X., Wang, Z., Zhang, H.: A novel photovoltaic power forecasting model based on echo state network. Neurocomputing 325, 182–189 (2019). https://doi.org/10.1016/j.neucom.2018.10.022
Wen, S., et al.: Memristor-Based Echo State Network With Online Least Mean Square. IEEE Trans. Syst. Man Cybern.: Syst. 49(9), 1787–1796 (2019). https://doi.org/10.1109/TSMC.2018.2825021
Bala, A., Ismail, I., Ibrahim, R., Sait, S.M., Oliva, D.: An improved grasshopper optimization algorithm based echo state network for predicting faults in airplane engines. IEEE Access 8, 159773–159789 (2020). https://doi.org/10.1109/ACCESS.2020.3020356
Zhong, S., **e, X., Lin, L., Wang, F.: Genetic algorithm optimized double-reservoir echo state network for multi-regime time series prediction. Neurocomputing 238, 191–204 (2017). https://doi.org/10.1016/j.neucom.2017.01.053
Racca, A., Magri, L.: Robust optimization and validation of echo state networks for learning chaotic dynamics. Neural Netw. 142, 252–268 (2021). https://doi.org/10.1016/j.neunet.2021.05.004
Gunning, D., et al.: XAI–explainable artificial intelligence. Sci. Robotics (2019). https://doi.org/10.1126/scirobotics.aay7120
Huang, Y., Chen, D., Zhao, W., Mo, H.: Deep fuzzy system algorithms based on deep learning and input sharing for regression application. Intern. J. Fuzzy Syst. 23(3), 727–742 (2021). https://doi.org/10.1007/s40815-020-00998-4
Wang, L.-X., Mendel, J.: Generating fuzzy rules by learning from examples. IEEE Trans. Syst. Man Cybern. 22(6), 1414–1427 (1992). https://doi.org/10.1109/21.199466
Peng, W., Zhou, C., Li, C., Deng, X., Zhang, G.: Double-input rule modules stacked deep interval type-2 fuzzy model with application to time series forecasting. Intern. J. Fuzzy Syst. 23(5), 1326–1346 (2021). https://doi.org/10.1007/s40815-021-01087-w
Li, C., Zhou, C., Peng, W., Lv, Y., Luo, X.: Accurate prediction of short-term photovoltaic power generation via a novel double-input-rule-modules stacked deep fuzzy method. Energy 212, 118700 (2020). https://doi.org/10.1016/j.energy.2020.118700
Qasem, S.N., Mohammadzadeh, A.: A deep learned type-2 fuzzy neural network: Singular value decomposition approach. Appl. Soft Comput. 105, 107244 (2021). https://doi.org/10.1016/j.asoc.2021.107244
Luo, C., Tan, C., Wang, X., Zheng, Y.: An evolving recurrent interval type-2 intuitionistic fuzzy neural network for online learning and time series prediction. Appl. Soft Comput. 78, 150–163 (2019). https://doi.org/10.1016/j.asoc.2019.02.032
Bodyanskiy, Y., Vynokurova, O., Pliss, I., Peleshko, D., Rashkevych, Y.: Deep stacking convex neuro-fuzzy system and its on-line learning. DepCoS-RELCOMEX 2017: Advances in Dependability Engineering of Complex Systems, pp. 49—59 (2018). https://doi.org/10.1007/978-3-319-59415-6_5
Bodyanskiy, Y.V., Tyshchenko, O.K.: A hybrid cascade neuro-fuzzy network with pools of extended neo-fuzzy neurons and its deep learning. Intern. J. Appl. Math. Comput. Sci. 29(3), 477–488 (2019). https://doi.org/10.2478/amcs-2019-0035
Cao, Y., et al.: Deep learned recurrent type-3 fuzzy system: application for renewable energy modeling/prediction. Energy Rep. 7, 8115–8127 (2021). https://doi.org/10.1016/j.egyr.2021.07.004
Wang, G., Qiao, J.: An efficient self-organizing deep fuzzy neural network for nonlinear system modeling. IEEE Trans. Fuzzy Syst. (2021). https://doi.org/10.1109/TFUZZ.2021.3077396
Wang, J., Peng, Z., Wang, X., Li, C., Wu, J.: Deep fuzzy cognitive maps for interpretable multivariate time series prediction. IEEE Trans. Fuzzy Syst. 29(9), 2647–2660 (2021). https://doi.org/10.1109/TFUZZ.2020.3005293
Park, S., Lee, S.J., Weiss, E., Motai, Y.: Intra- and inter-fractional variation prediction of lung tumors using fuzzy deep learning. IEEE J. Trans. Eng. Health Med. 4, 1–12 (2016). https://doi.org/10.1109/JTEHM.2016.2516005
Magdalena, L.: Semantic interpretability in hierarchical fuzzy systems: creating semantically decouplable hierarchies. Information Sci. 496, 109–123 (2019). https://doi.org/10.1016/j.ins.2019.05.016
Wang, G., Jia, Q.-S., Qiao, J., Bi, J., Liu, C.: A sparse deep belief network with efficient fuzzy learning framework. Neural Netw. 121, 430–440 (2020). https://doi.org/10.1016/j.neunet.2019.09.035
Wang, G., Qiao, J., Bi, J., Li, W., Zhou, M.: TL-GDBN: growing deep belief network with transfer learning. IEEE Trans. Autom. Sci. Eng. 16(2), 874–885 (2019). https://doi.org/10.1109/TASE.2018.2865663
Li, F., Qiao, J., Han, H., Yang, C.: A self-organizing cascade neural network with random weights for nonlinear system modeling. Appl. Soft Comput. 42, 184–193 (2016). https://doi.org/10.1016/j.asoc.2016.01.028
Han, H.-G., Zhang, H.-J., Qiao, J.-F.: Robust deep neural network using fuzzy denoising autoencoder. Intern. J. Fuzzy Syst. 22(4), 1356–1375 (2020). https://doi.org/10.1007/s40815-020-00845-6
George, S., Santra, A.K.: Fuzzy inspired deep belief network for the traffic flow prediction in intelligent transportation system using flow strength indicators. Big Data 8(4), 291–307 (2020). https://doi.org/10.1089/big.2019.0007
Borkar, G.M., Mahajan, A.R.: A secure and trust based on-demand multipath routing scheme for self-organized mobile ad-hoc networks. Wireless Netw. 23(8), 2455–2472 (2016). https://doi.org/10.1007/s11276-016-1287-y
Chen, J., Yuan, W., Cao, J., Lv, H.: Traffic-flow prediction via granular computing and stacked autoencoder. Granul. Comput. 5(4), 449–459 (2019). https://doi.org/10.1007/s41066-019-00167-5
Liao, C.-W., Wang, I.-C., Lin, K.-P., Lin, Y.-J.: A fuzzy seasonal long short-term memory network for wind power forecasting. Mathematics (2021). https://doi.org/10.3390/math9111178
Chang, P.-T.: Fuzzy seasonality forecasting. Fuzzy Sets Syst. 90(1), 1–10 (1997). https://doi.org/10.1016/S0165-0114(96)00138-8
Imani, M., et al.: application of rough and fuzzy set theory for prediction of stochastic wind speed data using long short-term memory. Atmosphere (2021). https://doi.org/10.3390/atmos12070924
Pawlak, Z.: Rough sets. Intern. J. Comput. Information Sci. 11(5), 341–356 (1982). https://doi.org/10.1007/bf01001956
Liu, L., Liu, F., Zheng, Y.: A novel ultra-short-term PV power forecasting method based on DBN-based Takagi-Sugeno fuzzy model. Energies (2021). https://doi.org/10.3390/en14206447
Chimatapu, R., Hagras, H., Kern, M., Owusu, G.: Hybrid deep learning Type-2 fuzzy logic systems for explainable AI. In: 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–6 (2020). https://doi.org/10.1109/FUZZ48607.2020.9177817
S.N., D., S., G., T.S., A.: Fuzzy echo state neural network with differential evolution framework for time series forecasting. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 1322–1327 (2018). https://doi.org/10.1109/ICMLA.2018.00214
Mahmoud, T.A., Elshenawy, L.M.: TSK fuzzy echo state neural network: a hybrid structure for black-box nonlinear systems identification. Neural Comput. Appl. (2022). https://doi.org/10.1007/s00521-021-06838-2
Wu, K., Liu, J., Liu, P., Yang, S.: Time series prediction using sparse autoencoder and high-order fuzzy cognitive maps. IEEE Trans. Fuzzy Syst. 28(12), 3110–3121 (2020). https://doi.org/10.1109/TFUZZ.2019.2956904
Sevakula, R. K., Verma, N. K.: Fuzzy rule reduction using sparse auto-encoders. In: 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–7 (2015). https://doi.org/10.1109/FUZZ-IEEE.2015.7338118
Ojha, V., Abraham, A., Snášel, V.: Heuristic design of fuzzy inference systems: a review of three decades of research. Eng. Appl. Artif. Intell. 85, 845–864 (2019). https://doi.org/10.1016/j.engappai.2019.08.010
Acknowledgements
This work has been supported by the ERDF and national funds through the project iProMo (CENTRO-01-0247-FEDER-069730). Jorge S. S. Júnior is supported by Fundação para a Ciência e a Tecnologia (FCT) under the grant ref. 2021.04917.BD.
Funding
Open access funding provided by FCT|FCCN (b-on).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Competing Interest
The authors declare no competing Interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
S. S. Júnior, J., Mendes, J., Souza, F. et al. Survey on Deep Fuzzy Systems in Regression Applications: A View on Interpretability. Int. J. Fuzzy Syst. 25, 2568–2589 (2023). https://doi.org/10.1007/s40815-023-01544-8
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s40815-023-01544-8