Nonmasking-based reservoir computing with a single dynamic memristor for image recognition

Wu, **aona; Lin, Zhuosheng; Deng, **gliang; Li, Jia; Feng, Yue

doi:10.1007/s11071-024-09338-9


Original Paper
Published: 06 March 2024

Volume 112, pages 6663–6678, (2024)

Nonlinear Dynamics


**aona Wu¹,
Zhuosheng Lin ORCID: orcid.org/0000-0001-5963-8525^1,2,
**gliang Deng¹,
Jia Li¹ &
…
Yue Feng¹


Abstract

Reservoir computing has been widely used in temporal information processing, and the introduction of time-delayed reservoir computing systems has effectively reduced the difficulty of physically implementing reservoir computing. However, challenges of complex structure and difficult multi-parameter optimization persist. To address these issues, this study simplifies the structure of the time-delayed reservoir computing system by removing the masking procedure and the feedback loop, and proposes a nonmasking-based reservoir computing system that uses only a single dynamic memristor for image recognition tasks. The histogram of oriented gradient (HOG) feature of the input image is linearly mapped into a voltage sequence and directly injected into a dynamic memristor. The nonlinear mapping of the input signal is performed by the physical computing resources of the dynamic memristor, yielding a reservoir computing system without a masking procedure or feedback loop. The proposed system achieves effective image recognition solely by properly adjusting the range of the mapping voltage. The recognition accuracies on the MNIST and Fashion-MNIST datasets are $98.44\%$ and $90.19\%$, respectively, surpassing a parallel reservoir computing system based on the same type of dynamic memristor and a laser-based reservoir computing system. Moreover, the recognition accuracy on the MNIST dataset is only $0.14\%$ lower than that of the classical reservoir computing system with 1200 physical nodes. Compared with the proposed reservoir computing system with a masking procedure, the training time of the nonmasking-based system on the MNIST and Fashion-MNIST datasets is reduced by about $46.1\%$ and $45.33\%$, respectively, while the corresponding recognition accuracy decreases only slightly, by $0.5\%$ and $0.35\%$, respectively. In addition, the experimental results on the CIFAR-10 and Cropped SVHN datasets further verify the feasibility of the proposed reservoir computing system in complex image recognition tasks.


1 Introduction

Neuromorphic computing is an effective solution for breaking the bottleneck of von Neumann computing at the architectural level. As a novel paradigm of neuromorphic computing, reservoir computing (RC) [16, 22, 36] has a fundamental structure comprising an input layer, a reservoir layer, and an output layer, as illustrated in Fig. 1a. The input layer and reservoir layer adopt random, fixed connection weights, and only the weights of the output layer need to be trained. This effectively addresses issues such as vanishing or exploding gradients, slow convergence, and high computational cost encountered when training classical RNNs. The reservoir layer nonlinearly maps low-dimensional input signals to a high-dimensional space, which provides a lightweight network scheme for efficient processing of sequential signals.

According to the topology of the reservoir layer, RC can be divided into two categories [32]: classical RC and time-delayed RC. As shown in Fig. 1a, the reservoir layer of classical RC consists of numerous randomly connected nodes. Realizing random connections among many physical nodes makes the hardware implementation of classical RC difficult and its structure complex. In 2011, L. Appeltant et al. [4] proposed time-delayed RC, which greatly simplifies hardware implementation. Specifically, a single Mackey–Glass oscillator is used as the nonlinear node (see NL in Fig. 1b). After the input signal is multiplied by the mask matrix, the delay feedback loop of length $\tau $ is divided, according to the principle of time-division multiplexing, into N virtual nodes that replace the randomly connected nodes of classical RC. However, the time-delayed RC system still suffers from multi-parameter optimization problems such as mask matrix construction [3] (including mask length and mask signal type [18]) and delayed feedback control strength [9, 20, $98.44\%$ on MNIST dataset, outperforming the parallel architecture RC system [41] based on the same dynamic memristor by $0.45\%$ and demonstrating only a slight decrease of $0.14\%$ compared to the classical RC system [27] with 1200 physical nodes. This result reflects a balance between the recognition accuracy and the resource consumption of the RC system. Meanwhile, the recognition accuracy for the Fashion-MNIST dataset is $90.19\%$, surpassing the laser-based RC system [14] by $5.53\%$. In addition, the impact of the mapping voltage range, a crucial parameter of the proposed system, on the operational state of the dynamic memristor and on image recognition performance is extensively investigated. Besides, the feature extraction performance of the proposed RC system is validated by reducing the scale of the training set, while its noise robustness is examined by recognizing noisy images. Finally, to further verify the feasibility of the proposed RC system on complex image recognition tasks, two real-world color image datasets, CIFAR-10 and Cropped SVHN (the cropped version of Street View House Numbers), are used to evaluate the proposed RC system.

Table 1 lists all abbreviations used in this paper.

Table 1 Abbreviations used in this paper


The remainder of this paper is organized as follows. Section 2 summarizes the related work on time-delayed RC systems. Section 3 introduces the theoretical model of the dynamic memristor and proposes a nonmasking-based RC system with a single dynamic memristor, along with a demonstration of the specific image recognition process. In Sect. 4, the proposed RC system is applied to image recognition tasks, and the experimental results are discussed. The final section concludes this study.

2 Related work

In view of the issues mentioned above, this section gives a brief overview of previous research on the hardware implementation of RC systems, focusing on data input and the structural design of time-delayed RC systems. Table 2 summarizes the related work in these two aspects.

Table 2 The research status of hardware-implemented RC


2.1 Data processing in input layer

At present, RC systems process input images for image recognition tasks mainly in three ways: 1) Employing the pixel values of the original image directly as input signals [30, 40]. 2) Applying geometric transformations. In the classical RC system [27], the boundary of each original image is removed, and the image is then rotated by different angles and the rotated copies are spliced into the final input image to enhance the feature information. The dynamic memristor-based parallel reservoir computing system rotates the original image by 0$^{\circ }$, 30$^{\circ }$, and 90$^{\circ }$, respectively, combines the three rotated images, and then cuts them vertically [41]. The pixel values of each cut image are used as the input signal. 3) Extracting the HOG feature from the original image. For instance, the photonic RC system in [20] introduced the HOG technique at the input layer to extract feature descriptors from the original image, which then serve as input. Huang et al. [14] similarly applied the HOG technique to process input images. Yue et al. [39] compared four distinct methods for preprocessing input images within an optoelectronic RC system. Their findings indicated that using the HOG technique to extract image features as input leads to comparatively higher recognition accuracy.

2.2 Structural design of time-delayed RC systems

The structural design of time-delayed RC systems mainly involves three aspects: the construction of the mask matrix, the design of the delay feedback loop, and the architecture of the nonlinear node.

The construction of the mask matrix encompasses matrix design algorithms, the selection of mask signal types, and the determination of the mask length. Building on the concept of maximum length sequences, Appeltant et al. [3] outlined a procedure to construct an optimal mask pattern. This procedure yields the shortest possible mask that leads to maximum variability in the reservoir states. Compared with a random mask, the optimal mask pattern gives the RC system more stable performance. Kuriki et al. [18] utilized four distinct types of mask signals, namely a binary mask, a six-level mask, a random-level mask, and a chaos mask, in a photonic RC system, aiming to investigate how to improve the performance of time-delayed RC systems from the perspective of mask signal design. Zhong et al. [41] extensively examined the reservoir state and signal separation capability across varying mask lengths, contributing a theoretical foundation for determining the optimal mask length.

The design of the delay feedback loop mainly includes feedback intensity adjustment and loop structure design. Chen et al. [9] introduced a new hidden layer on the basis of time-delayed RC to construct an RC system with double feedback loops. This enhancement led to a significant improvement in performance on various tasks, including time series prediction, speech recognition, and nonlinear channel equalization. You et al. [38] proposed an RC system with a multilayer time-delay structure and double feedback loops. The double feedback loop structure re-injected earlier-generated responses into the reservoir, enhancing the storage capacity of the RC system. The serial multilayer structure significantly improved the utilization of virtual nodes. Li et al. [20] proposed an optical reservoir computing method based on a single physical node, which used an optically injected semiconductor laser with self-delay feedback as the reservoir. By setting multiple delay times, the number of virtual nodes was increased to enhance system performance.

The architecture of nonlinear nodes mainly includes single, parallel, and serial multilayer architectures. In classical RC [27], the nonlinear nodes are connected randomly within the reservoir layer. In time-delayed RC systems, by contrast, the responses of a single nonlinear node to the input signal serve as the reservoir states. Time-delayed RC systems [3, 9, 18, 20] with a single nonlinear node improved performance through the strategies mentioned above, such as mask matrix construction or loop structure design. Most studies arranged multiple reservoir layers in a parallel or serial multilayer architecture to enhance the performance of time-delayed RC systems [10, 14, 21, 23, 38, 39, 41]. For instance, Du et al. [10] utilized 88 dynamic memristors to implement a time-delayed RC system with parallel architecture. The system was applied to the classification of handwritten digit images and reached an accuracy of $88.1\%$. Moon et al. [23] used a $32\times 32$ array chip of $WO_x$ memristors to implement a time-delayed RC system with parallel architecture. In the task of spoken-digit recognition, the system achieved a classification accuracy of up to $99.2\%$, even with partial input. Additionally, good long-term prediction of chaotic sequences was achieved without retraining. Zhong et al. [41] proposed a time-delayed RC system with parallel architecture based on dynamic memristors, which outperforms most existing hardware-based RC systems in tasks such as image recognition, waveform classification, spoken-digit recognition, and chaotic sequence prediction. Liu et al. [21] constructed a serial multilayer RC system based on ferroelectric $\alpha $-$In_{2}Se_{3}$ devices with voltage input and output. The deep reservoir architecture was validated for its high memory capacity and powerful computational capabilities in tasks such as time series prediction and waveform classification. The time-delayed RC system with a multilayer architecture in [38] enhanced the utilization of the reservoir's virtual nodes. This multilayer architecture also reduced prediction errors and improved resistance to interference in diverse time series prediction tasks.

3 Theoretical model and process of image recognition

This section first introduces the theoretical model of the dynamic memristor. Subsequently, a nonmasking-based RC system that uses a dynamic memristor as the nonlinear node in the reservoir layer is constructed. Finally, the specific process of image recognition is illustrated.

3.1 Theoretical model of dynamic memristor

The dynamic memristor introduced in [41] has a structure of $Ti/TiO_x/TaO_y/Pt$, exhibiting I-V nonlinearity and short-term memory characteristics. On this basis, the constructed parallel RC system showed exceptional performance in tasks such as waveform classification, spoken-digit recognition, and Hénon map prediction. These results demonstrated the potential of the dynamic memristor to act as a nonlinear node in a time-delayed RC. According to [41], the theoretical model of the dynamic memristor is defined as

$$\begin{aligned} \left\{ \begin{array}{l} I = KG{V^3}\\ G = (1 - r){G_0} + \left( r - \frac{{\alpha |V|}}{{\alpha |V| + 1}}\right) G' + \frac{{\alpha |V|}}{{\alpha |V| + 1}}{G_{th}} \end{array}\right. , \end{aligned}$$

(1)

where

$$\begin{aligned} \left\{ \begin{array}{l} K = {K_p},{G_{th}} = 1,\quad \mathrm{{if }}\quad V \ge 0,\\ K = {K_n},{G_{th}} = 0,\quad \mathrm{{else}}, \end{array} \right. \end{aligned}$$

(2)

V and I represent the input voltage and output current of the dynamic memristor, respectively. G and $G'$ denote the conductance values of the dynamic memristor at the current time step and the previous time step, respectively. The parameters K and $G_{th}$ are determined by the sign of V as shown in Eq. (2). The other parameters are set as listed in Table 3. According to Eqs. (1) and (2), the simulated current–voltage (I-V) curve of the dynamic memristor can be obtained, as shown by the blue solid line in Fig. 3. The black dashed line in Fig. 3 represents the measured I-V curve of the device, obtained from the experimental data provided in [41]. The blue solid line approximately overlaps the black dashed line and captures its general trend. Therefore, the theoretical model described by Eqs. (1) and (2) matches the physical characteristics of the device well and can be used for the subsequent simulation experiments.
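As an illustration, one update step of Eqs. (1) and (2) might be sketched in Python as follows. The numeric defaults in `MemristorParams` are placeholders chosen only so that the example runs; they are not the values of Table 3. The names `MemristorParams`, `memristor_step`, and `drive`, as well as the choice of $G_0$ as the initial conductance, are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class MemristorParams:
    # Placeholder values for illustration only (NOT the Table 3 values).
    k_p: float = 1.0    # K_p, used as K when V >= 0
    k_n: float = 1.0    # K_n, used as K when V < 0
    r: float = 0.99     # parameter r in Eq. (1)
    alpha: float = 0.2  # parameter alpha in Eq. (1)
    g_0: float = 0.5    # G_0 in Eq. (1)


def memristor_step(v, g_prev, p):
    """Apply Eqs. (1)-(2) once; return (current I, updated conductance G)."""
    # Eq. (2): K and G_th depend on the sign of the input voltage.
    k, g_th = (p.k_p, 1.0) if v >= 0 else (p.k_n, 0.0)
    w = p.alpha * abs(v) / (p.alpha * abs(v) + 1.0)
    # Eq. (1): conductance update, then the cubic I-V relation.
    g = (1.0 - p.r) * p.g_0 + (p.r - w) * g_prev + w * g_th
    return k * g * v ** 3, g


def drive(voltages, p, g_init=None):
    """Inject a voltage sequence and collect the current response."""
    # Assumption: the device starts at G_0 unless told otherwise.
    g = p.g_0 if g_init is None else g_init
    currents = []
    for v in voltages:
        i, g = memristor_step(v, g, p)
        currents.append(i)
    return currents
```

Because each new conductance depends on the previous one, the current at a given step carries a fading trace of earlier inputs, which is the short-term memory that the reservoir relies on.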

Table 3 Parameters of the dynamic memristor model


3.2 Nonmasking-based RC system with a single dynamic memristor

To simplify the structure of the RC system and minimize the number of system parameters that need to be optimized, a nonmasking-based RC system with a single dynamic memristor is proposed in this study. As depicted in Fig. 4, the proposed RC system consists of three parts: an input layer, a reservoir layer in which a single dynamic memristor serves as the nonlinear node, and an output layer. In the input layer, the feature information of the input signals is extracted and linearly mapped into an appropriate voltage sequence. In the reservoir layer, the voltage sequence is directly injected into the dynamic memristor for iteration, allowing the feature information to be integrated into the iterative states of the dynamic memristor. Each iterative state of the dynamic memristor is regarded as a virtual node in the reservoir. Additionally, the feedback loop present in the conventional time-delayed RC system [4] is omitted. The linear mapping process sets the input weights in a way similar to the traditional masking procedure. The combination of feature extraction and iterative state updates of the dynamic memristor is equivalent to expanding the input signal in the time domain, which allows the masking procedure of the conventional time-delayed RC system to be removed. In the output layer, the current responses $X_i= [x_\mathrm{{1}}, x_\mathrm{{2}}, \cdots , x_\mathrm{{N}}]^\mathrm{{T}}$ to the input voltage sequences $V_i= [v_\mathrm{{1}}, v_\mathrm{{2}}, \cdots , v_\mathrm{{N}}]^\mathrm{{T}}$ are directly collected as the reservoir states, so that additional operations for reading the conductance of the memristor can be skipped. When all the images in the training set have been input, the overall reservoir states collected during the training phase are denoted as $X=[ X_1, X_2, \cdots , X_n]$, where n is the size of the training set. By using Tikhonov regularization, we minimize the mean square error between the system output and the desired output. Therefore, the output weight matrix ${W^{out}}$ can be obtained as

$$\begin{aligned} {W^{out}} = {Y_d}{X^T}{(X{X^T} + \beta I)^{ - 1}}, \end{aligned}$$

(3)

where ${Y_d} = [{y_1},{y_2}, \cdots , {y_k}, \cdots ,{y_n}]$ represents the desired output ($y_k$ as shown in Fig. 5), I is an identity matrix, and $\beta $ is the ridge parameter, which is set to avoid overfitting to the training data.
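For readers who want the intermediate step, Eq. (3) follows from the standard Tikhonov (ridge) objective. The objective below is written in its usual textbook form with the Frobenius norm $\Vert \cdot \Vert _F$, which is an assumption here since the exact objective is not written out above:

$$\begin{aligned} J(W) &= \Vert W X - Y_d \Vert _F^2 + \beta \Vert W \Vert _F^2, \\ \frac{\partial J}{\partial W} &= 2\,(W X - Y_d)\,X^T + 2\beta W = 0 \\ &\Rightarrow \; W\,(X X^T + \beta I) = Y_d X^T \\ &\Rightarrow \; W^{out} = Y_d X^T (X X^T + \beta I)^{-1}. \end{aligned}$$

For $\beta > 0$, the matrix $X X^T + \beta I$ is symmetric positive definite, so the inverse in the last line always exists.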

In the test phase, all the reservoir states $X_{test}$ of the signal to be identified are collected. Combined with the output weight matrix $W^{out}$ calculated by Eq. (3), the predicted output $Y_p$ can be obtained as:

$$\begin{aligned} {Y_p} = {W^{out}}{X_{test}}. \end{aligned}$$

(4)

Therefore, the proposed RC system avoids parameters such as the mask matrix, the number of virtual nodes, and the feedback strength, which need to be optimized in the conventional time-delayed RC system. Instead, the only parameter that needs to be optimized to improve recognition accuracy is the voltage range into which the feature information is mapped.
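The training and prediction steps of Eqs. (3) and (4) can be sketched with NumPy as below. This is a minimal illustration and not the MATLAB implementation used in the experiments. The function names are hypothetical, and the code assumes that `X` stores one reservoir state per column and `Y_d` one label vector per column.

```python
import numpy as np


def train_readout(X, Y_d, beta):
    """Eq. (3): W_out = Y_d X^T (X X^T + beta I)^(-1)."""
    A = X @ X.T + beta * np.eye(X.shape[0])
    # A is symmetric, so W_out^T solves A W_out^T = X Y_d^T.
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(A, X @ Y_d.T).T


def predict(W_out, X_test):
    """Eq. (4): Y_p = W_out X_test."""
    return W_out @ X_test
```

Solving the linear system instead of inverting the matrix gives the same result as Eq. (3) and is usually the numerically safer choice.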

3.3 The process of nonmasking-based RC system with a single dynamic memristor for image recognition

In this study, the image recognition performance of the proposed nonmasking-based RC system with a single dynamic memristor is first evaluated using the MNIST dataset and the Fashion-MNIST dataset. Furthermore, to verify the feasibility of the proposed RC system on more complex image recognition tasks, experiments are carried out on more challenging color image datasets: CIFAR-10 and Cropped SVHN. The MNIST dataset comprises 70,000 images of handwritten digits from 0 to 9, written by 250 individuals. The Fashion-MNIST dataset contains 70,000 grayscale images of fashion products, categorized into 10 classes. Both datasets are originally split into training and testing sets with 60,000 and 10,000 images, respectively. Each image in these two datasets has $28 \times 28$ grayscale pixels. The CIFAR-10 dataset, which contains 10 categories of real-world object images, is divided into 50,000 training images and 10,000 testing images. There are 6000 color images of size $32 \times 32$ for each category. The Cropped SVHN dataset is obtained by cropping Google Street View images. It contains 10 categories of $32 \times 32$ color images, of which 73,257 images are used for training and 26,032 images are used for testing. Examples of sample images and corresponding labels for each category are illustrated in Fig. 5, where $y_k$ represents the label vector of the corresponding category, and its number of elements equals the number of categories in the dataset (i.e., 10). Moreover, $y_k$ is a column vector of the label matrix $Y_d = [ y_1, y_2,..., y_k,..., y_n]$. Suppose that the label of a sample image is $i \in \{ 0,1,...,9\} $; then the $(i+1)$-th element of $y_k$ is set to 1 and all other elements are set to 0. For example, the digit “2” in the MNIST dataset and the “Pullover” product item in the Fashion-MNIST dataset both have the label “2”. Therefore, the corresponding label vector $y_k$ is $(0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0)^\mathrm{{T}}$.
Figure 6 schematically illustrates the image recognition process of the proposed RC system. Firstly, the HOG features are extracted from each input image, resulting in a $1 \times N$ feature descriptor. Subsequently, the feature descriptor is linearly mapped into the input voltage sequence for the dynamic memristor, and the corresponding output currents are regarded as virtual nodes in the reservoir layer. The virtual nodes in the reservoir layer are coupled with each other under the excitation of the input voltage sequence, achieving a nonlinear mapping of image feature information (voltage) to reservoir states (current).
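The linear mapping step could look like the sketch below. The exact mapping formula is not given in the text, so per-descriptor min-max scaling is an assumption, as are the function name `map_to_voltage` and the handling of a constant descriptor.

```python
def map_to_voltage(features, v_min, v_max):
    """Linearly rescale a feature descriptor into the range [v_min, v_max]."""
    lo, hi = min(features), max(features)
    if hi == lo:
        # Assumption: a constant descriptor is mapped entirely to v_min.
        return [v_min for _ in features]
    scale = (v_max - v_min) / (hi - lo)
    return [v_min + (f - lo) * scale for f in features]


# Example with the MNIST voltage range reported in Sect. 4.1.
voltages = map_to_voltage([0.0, 0.5, 1.0], -0.9, -0.1)
```

The resulting list is what would be injected into the dynamic memristor one value at a time, with each output current read as one virtual node.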

The procedure is then divided into a training phase and a testing phase. In the training phase, each sample image from the training set is input in turn. The label vector $y_k$ serves as a column of the label matrix $Y_d$, and the corresponding reservoir state is collected as a column of the state matrix X. After all sample images from the training set have been input, the label matrix $Y_d$ and state matrix X are obtained, and the output weight matrix $W^{out}$ is calculated according to Eq. (3). In the testing phase, all reservoir states $X_{test}$ of the sample images in the testing set are first collected. Then, the predicted output $Y_p$ is calculated according to Eq. (4). The number of rows of the matrix $Y_p=[y^1_p,y^2_p,...,y^k_p,...,y^m_p]$ equals the number of categories in the dataset, where m is the size of the testing set and $y^k_p$ represents the output for one testing image. Finally, the winner-takes-all strategy is applied to $y^k_p$ to recognize the category of the test image. As shown in Fig. 6, taking “2”, “Pullover”, “bird” or “2” as an example, their labels are all 2. If the third element $p_2$ of the output column vector $y^k_p =(p_0,p_1,p_2,...,p_9)^\mathrm{{T}}$ is the maximum, the recognized result is “2”, “Pullover”, “bird” or “2”, respectively, and the recognition of the test image is correct; otherwise, it is incorrect.
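The label encoding and the winner-takes-all decision described above can be sketched as follows. The function names are illustrative, and the `accuracy` helper anticipates the usual definition of recognition accuracy as the percentage of correctly recognized test images.

```python
def one_hot(label, num_classes=10):
    """Label vector y_k: element (label + 1) is 1, all others are 0."""
    vec = [0] * num_classes
    vec[label] = 1
    return vec


def winner_takes_all(output):
    """Return the index of the largest element of an output column y_p^k."""
    return max(range(len(output)), key=output.__getitem__)


def accuracy(predicted, expected):
    """Percentage of test images whose predicted label is correct."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


# The label "2" (digit "2" or "Pullover") gives (0 0 1 0 0 0 0 0 0 0)^T.
y_k = one_hot(2)
```

With 10,000 test images, for example, an accuracy of $98.44\%$ corresponds to 9,844 correctly recognized images.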

In this study, the performance of the RC system for image recognition is evaluated using the recognition accuracy (ACC). Assuming that the total number of samples in the testing set is t and the number of correctly recognized image samples is c, one has:

$$\begin{aligned} ACC = \frac{c}{t} \times 100\%. \end{aligned}$$

(5)

4 Experimental results and analysis

4.1 Experiment results and comparative analysis

The proposed RC system is implemented in MATLAB (version R2021a, 64-bit) on an Intel(R) Core(TM) i7-10700K CPU @ 3.80 GHz with 64 GB RAM. Following the original partition of the training/testing sets in the MNIST and Fashion-MNIST datasets, the sample images in the training/testing sets are input in random order. The optimized values of the mapping voltage range $[V_{min},V_{max}]$ in the recognition tasks on the MNIST and Fashion-MNIST datasets are [$-$0.9, $-$0.1] and [$-$0.9, $-$0.4], respectively, and the corresponding recognition accuracies are $98.44\%$ and $90.19\%$, respectively. The confusion matrices of the recognition results are depicted in Fig. 7. Figure 7a shows the experimental result of the proposed RC system for the MNIST dataset. The recognition accuracies for digits “0” and “1” both exceed $99\%$, and the recognition accuracies for digits “2”, “3”, “5”, “6”, “7”, and “8” are over $98\%$. Although the recognition accuracy for the digit “9” is relatively low, it still reaches $97\%$. In addition, Fig. 7b shows the experimental result of the proposed RC system for the Fashion-MNIST dataset. The recognition accuracy for “Bag” is the highest, reaching $97.6\%$, owing to its significant differences in shape from the other items. The recognition accuracies for “Trouser”, “Dress”, “Sandal”, “Sneaker”, and “Ankle boot” are over $90\%$. However, the recognition accuracy for “Shirt” is only $68.2\%$, with $12.1\%$, $8.4\%$, and $5.9\%$ of the samples wrongly recognized as “T-shirt/top”, “Coat”, and “Pullover”, respectively. The main reason for this misclassification is that these items are quite similar to each other, so the differences between the corresponding HOG feature descriptors are small, leading to confusion during the recognition process.

The performance comparison of several RC systems on recognition tasks is shown in Table 4. For the recognition task on the MNIST dataset, the proposed RC system is compared with the Parallel RC system [41], which uses the same type of dynamic memristor, and the Classical RC system [27]. In terms of the number of nonlinear nodes, the proposed RC system utilizes only one nonlinear physical node, while the Parallel RC system and the Classical RC system require 300 and 1200 nonlinear physical nodes, respectively. Regarding preprocessing, both the Parallel RC system and the Classical RC system mainly employ methods such as removing redundant boundaries and concatenating images with different rotation angles to enhance feature information. The pixel values of the preprocessed images are then used as input signals. In contrast, the proposed RC system borrows a preprocessing procedure from machine learning and extracts HOG features from the input sample images. The size of the HOG feature descriptor for each sample image is $1\times 1980$, which effectively filters out a large amount of redundant data before injection into the reservoir layer. This approach avoids the extra computational overhead associated with the masking procedure. The performance comparison results in Table 4 demonstrate that the recognition accuracy of the proposed RC system is higher than that of the Parallel RC system by $0.44\%$ and slightly lower than that of the Classical RC system by $0.14\%$. Furthermore, for the recognition task on the Fashion-MNIST dataset, the proposed RC system achieves a recognition accuracy that is $5.53\%$ higher than that of the Laser-based RC system [14] with two parallel nonlinear physical nodes and a masking procedure. The results confirm the effectiveness of the proposed RC system.

Table 4 Performance of several RC systems on recognition tasks


To further investigate the feasibility of the proposed nonmasking-based RC system, Table 5 presents a performance comparison between the proposed RC system with and without the masking procedure. In Table 5, “With masking” indicates the proposed RC system with the traditional masking procedure, while “Nonmasking” means the proposed nonmasking-based RC system. Specifically, “With masking” multiplies the feature descriptor of each sample image by a mask matrix following a Gaussian distribution (mean 0, standard deviation 0.5) and then injects the result into the reservoir layer. The number of virtual nodes is set to 1980 (the same as the size of the HOG feature descriptor), and the voltage range is consistent with the values mentioned above. The test results in Table 5 indicate that, compared to “With masking”, the proposed nonmasking-based RC system reduces the training time for the recognition tasks on the MNIST and Fashion-MNIST datasets by approximately $46.1\%$ and $45.33\%$, respectively. The corresponding recognition accuracies decrease only slightly, by $0.5\%$ and $0.35\%$, respectively. This provides a simple and efficient approach to balancing image recognition accuracy and practical resource efficiency. Therefore, the proposed nonmasking-based RC system not only reduces the reservoir layer to a single nonlinear physical node, but also further reduces the structural and computational complexity and improves system efficiency.

Table 5 Performance comparison of the proposed RC system with/without masking procedure


4.2 Impact of mapping voltage range on recognition performance of the proposed RC system

The mapping voltage range is a critical factor affecting the operating state of the reservoir layer in the proposed nonmasking-based RC system. Adjusting the upper and lower limits of the mapping voltage is important for making full use of the inherent dynamic characteristics of the dynamic memristor and for finding a favorable operating range. During the simulation, the image recognition accuracy of the proposed RC system under different mapping voltage ranges is calculated using the grid traversal method. Figure 8a and b shows the recognition accuracy distribution of the proposed RC system on the MNIST and Fashion-MNIST datasets, respectively, under different mapping voltage ranges. In Fig. 8a and b, the horizontal axis represents the lower limit of the mapping voltage $V_{min}$, while the vertical axis represents the upper limit of the mapping voltage $V_{max}$. The color bar indicates the magnitude of the recognition accuracy, with darker colors indicating higher accuracy. As shown in Fig. 8a and b, the recognition accuracy of the proposed RC system changes with the mapping voltage range, exhibiting similar trends across the two datasets. The region with higher recognition accuracy covers a large range, mainly distributed in the acute angle regions of the right triangle in Fig. 8a and b. Additionally, compared with the acute angle region labeled “B”, the dark area within the acute angle region labeled “A” is larger, indicating that the recognition accuracy is higher when both $V_{min}$ and $V_{max}$ are set to negative values. The simulation results indicate that the optimal mapping voltage ranges for the MNIST and Fashion-MNIST datasets are [$-$0.9, $-$0.1] and [$-$0.9, $-$0.4], respectively, and the corresponding overall recognition accuracies are $98.44\%$ and $90.19\%$, respectively. Furthermore, it can also be observed from Fig. 8a and b that regions closer to the main diagonal correspond to higher recognition accuracy. That is, a smaller difference between the upper and lower mapping voltage limits ($V_{max}-V_{min}$) results in higher recognition accuracy. To further analyze and validate the impact of this difference on the recognition accuracy, $V_{max}$ is increased from $-2.9$ V to 3 V in increments of 0.1 V, while $V_{min}$ is fixed at $-3$ V. The corresponding recognition accuracy curves of the proposed RC system for the MNIST and Fashion-MNIST datasets are shown in Fig. 8c and demonstrate a similar trend. While $V_{max}$ is negative, the recognition accuracy remains at a relatively high, nearly constant value. Once $V_{max}$ exceeds 0 V, the accuracy decreases rapidly and monotonically as $V_{max}$ increases. This indicates that a larger difference between the upper and lower mapping voltage limits ($V_{max}-V_{min}$) leads to lower recognition accuracy.
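The grid traversal over mapping voltage ranges can be sketched as below. The grid bounds of $-3$ V to 3 V and the 0.1 V step are taken from the sweep described above and are assumed to apply to the full grid as well. The `evaluate` callback, which would train and test the RC system for one voltage range and return its accuracy, is a hypothetical placeholder.

```python
def voltage_grid(v_low=-3.0, v_high=3.0, step=0.1):
    """Yield every (v_min, v_max) pair on a regular grid with v_min < v_max."""
    n = round((v_high - v_low) / step)
    # Rounding suppresses floating-point drift in the grid levels.
    levels = [round(v_low + k * step, 10) for k in range(n + 1)]
    for i, v_min in enumerate(levels):
        for v_max in levels[i + 1:]:
            yield v_min, v_max


def grid_search(evaluate, **grid_kwargs):
    """Return (best accuracy, v_min, v_max) over the whole grid."""
    best = None
    for v_min, v_max in voltage_grid(**grid_kwargs):
        acc = evaluate(v_min, v_max)
        if best is None or acc > best[0]:
            best = (acc, v_min, v_max)
    return best
```

Only pairs with $V_{min} < V_{max}$ are visited, which is why the accuracy maps occupy a triangular region of the ($V_{min}$, $V_{max}$) plane.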

Furthermore, the impact of the internal conductance fluctuation of the dynamic memristor during the recognition process is investigated. Specifically, one image from each category of sample images is randomly selected, and the feature descriptor of the selected image is mapped into a voltage sequence according to different mapping voltage ranges. Then, the voltage sequence is injected into the dynamic memristor. The corresponding conductance response curves for the MNIST and Fashion-MNIST datasets are depicted in Fig. 9a-c and d-f, respectively, and demonstrate a similar trend. When the conductance fluctuation is highly intense, as shown in Fig. 9a and d, the recognition accuracy is as low as $87.03\%$ and $77.55\%$, respectively. As the conductance fluctuation becomes smoother, as depicted in Fig. 9b and e, the recognition accuracy increases to $96.7\%$ and $85.29\%$, respectively. When the conductance fluctuation is very gentle, as shown in Fig. 9c and f, the recognition accuracy reaches its maximum values of $98.44\%$ and $90.19\%$, respectively. This is because drastic changes in the conductance of the dynamic memristor easily drive the reservoir states to their upper or lower limit [41], decreasing the richness of the reservoir states. In other words, the signal separation ability of the proposed RC system is reduced, leading to a larger recognition error. Therefore, to achieve high recognition accuracy, the dynamic memristor must exhibit gentle conductance fluctuation under the excitation of the voltage sequence. That is, the mapping voltage limits $V_{min}$ and $V_{max}$ are required to be negative, and the difference between $V_{max}$ and $V_{min}$ is required to be relatively small.

4.3 Impact of the training set scale on recognition performance of the proposed RC system

In principle, a larger training set covers the features of the samples more comprehensively and thus yields higher accuracy. However, it also significantly increases the training time and computational complexity. The impact of the training set scale on the recognition performance of the proposed nonmasking-based RC system is therefore examined on the Fashion-MNIST dataset. In the simulation experiment, the scale of the testing set remains unchanged at 10,000 images, while the scale of the training set is gradually reduced. The corresponding recognition accuracy of the proposed RC system is shown in Table 6 and decreases monotonically as the training set shrinks. When the scale of the training set is reduced from 60,000 to 20,000, the training time decreases by $68.67\%$, while the recognition accuracy decreases by only $0.7\%$. Even in the extreme case in which the training set contains only one sample image per category, the recognition accuracy still exceeds $50\%$. The proposed RC system can therefore achieve good recognition accuracy from a small training set, providing an effective approach for small-sample image recognition.
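One way to build such reduced training sets is to keep a fixed number of images per category. The sketch below is an assumption about the procedure, since the sampling scheme is not specified here; the helper `subsample_per_class` and the toy label list are hypothetical.

```python
import random
from collections import defaultdict

def subsample_per_class(labels, per_class, seed=0):
    """Return sorted indices holding at most `per_class` samples of each label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    chosen = []
    for label in sorted(by_label):
        pool = by_label[label]
        chosen.extend(rng.sample(pool, min(per_class, len(pool))))
    return sorted(chosen)

# Toy label list standing in for the Fashion-MNIST training labels (10 categories).
toy_labels = [i % 10 for i in range(200)]
one_shot = subsample_per_class(toy_labels, per_class=1)
print(len(one_shot))
```

Fixing the seed keeps the reduced set reproducible, so accuracy and training time measured at different scales refer to the same selection.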

Table 6 Experimental results of dataset scale on recognition accuracy and training time


Table 7 Recognition accuracy under different noise parameters


4.4 Noise robustness verification

To verify the robustness of the proposed nonmasking-based RC system, noise interference experiments are performed on the image recognition task using the Fashion-MNIST dataset. The noise parameter of Gaussian noise, Salt & Pepper noise, and Speckle noise is denoted as $\sigma$. Specifically, Gaussian noise following the distribution $N(0,\sigma^2)$, Salt & Pepper noise with noise density $\sigma$, and Speckle noise with standard deviation $\sigma$ are each added to the 10,000 testing images. The experimental results are shown in Table 7. When the noise parameter $\sigma$ is 0.1, the proposed RC system exhibits the strongest resistance to Speckle noise and the weakest resistance to Salt & Pepper noise. When $\sigma$ is 0.01, the recognition accuracy under Gaussian noise, Salt & Pepper noise, and Speckle noise decreases by only $0.9\%$, $1.9\%$, and $0.6\%$, respectively. As $\sigma$ decreases further, the recognition accuracy is almost unaffected by the noise. The experimental results demonstrate that the proposed RC system retains a high recognition accuracy even when the test images are disturbed by noise, verifying its robustness.
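The three noise types can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the experiments: pixel values are assumed to be scaled to $[0, 1]$, results are clipped back to that range, and Speckle noise is modeled as multiplicative Gaussian noise. The function names and the random stand-in image are hypothetical.

```python
import numpy as np

def add_gaussian(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_salt_pepper(img, density, rng):
    """Force roughly a fraction `density` of pixels to 0 (pepper) or 1 (salt)."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < density / 2] = 0.0
    out[(u >= density / 2) & (u < density)] = 1.0
    return out

def add_speckle(img, sigma, rng):
    """Multiplicative noise: img + img * n with n drawn from N(0, sigma^2)."""
    return np.clip(img + img * rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((28, 28))  # stand-in for one 28x28 grayscale test image
sigma = 0.1
noisy = {
    "gaussian": add_gaussian(image, sigma, rng),
    "salt_pepper": add_salt_pepper(image, sigma, rng),
    "speckle": add_speckle(image, sigma, rng),
}
for name, img in noisy.items():
    # Mean absolute pixel change gives a rough sense of each perturbation's strength.
    print(name, float(np.abs(img - image).mean()))
```

Under this model Speckle noise scales with pixel intensity and leaves dark pixels nearly untouched, whereas Salt & Pepper noise replaces pixels outright regardless of their value.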

Table 8 Comparison results with Fit-DNN


4.5 Feasibility verification and comparative analysis for more complex image recognition tasks

Real-world images exhibit variable perspectives, different scales, background interference, intra-class differences, and similar variations, which makes their recognition a challenging problem. In this section, the proposed RC system is applied to real-world color image recognition tasks, and its feasibility is verified on the CIFAR-10 and Cropped SVHN datasets.

Following the original training/testing split of the CIFAR-10 and Cropped SVHN datasets, and with the mapping voltage range $[V_{\min}, V_{\max}]$ set to $[-1.6, -0.8]$ and $[-1.3, -1.2]$, the recognition accuracy of the proposed RC system is $57.81\%$ and $80.84\%$, respectively. In contrast, the cross-point-array-based quasi-static memdiode model achieves recognition accuracies lower than $30\%$ on these two datasets [1]. The recognition accuracy of hierarchical memcapacitive RC is below $17.5\%$ on the CIFAR-10 dataset [33], while the recognition accuracy of memristive deep delayed feedback based RC is $78.5\%$ on the Cropped SVHN dataset [6]. Table 8 compares the proposed RC with Fit-DNN [30], which is also based on a single neuron node. The recognition accuracy of the proposed RC on the CIFAR-10 dataset is $5.13\%$ higher than that of Fit-DNN, while it is only $0.66\%$ lower on Cropped SVHN. In terms of input data, the proposed method first extracts a $1\times 936$ HOG feature descriptor from each input image, so the input layer has only 936 nodes, whereas Fit-DNN directly takes the 3072 original image pixel values as input. In terms of hidden layer structure, the proposed method uses a single hidden layer with 936 nodes, whereas Fit-DNN uses three hidden layers with 400 nodes each. In terms of feedback, the proposed RC removes the feedback loop, whereas Fit-DNN is configured with 100 feedback loops. Therefore, the experimental results of the proposed RC are competitive on the CIFAR-10 and Cropped SVHN datasets.

In addition, the stability of the proposed RC system for real-world image recognition is further verified by changing the number of HOG feature descriptors. The experimental results in Table 9 show that when the number of HOG feature descriptors is 3276, 1560, and 936, the recognition accuracies of the proposed RC system on the CIFAR-10 and Cropped SVHN datasets remain above $57\%$ and $80\%$, respectively. The proposed RC system thus tolerates a wide range of parameter settings and shows stable recognition performance.

Table 9 Recognition accuracy corresponding to different numbers of HOG feature descriptors


The experimental results in this subsection show that the proposed RC system can still achieve good recognition accuracies and stable recognition performance on more complex real-world image recognition tasks, which further verifies its application value in real image recognition.

5 Conclusions

On the basis of the time-delayed RC system, this work proposes a nonmasking-based RC system with a single dynamic memristor by fully utilizing the intrinsic nonlinear characteristics of the memristor. The proposed RC system uses only one dynamic memristor and removes the masking procedure and feedback loop from the conventional time-delayed RC system, thereby simplifying its structure. Specifically, the HOG features extracted from input images are linearly mapped into voltage sequences and directly injected into the dynamic memristor. By appropriately adjusting the mapping voltage range, the proposed RC system demonstrates excellent performance in image recognition tasks for both the MNIST and Fashion-MNIST datasets. High recognition accuracies are also achieved on the more complex CIFAR-10 and Cropped SVHN datasets. This provides a simple and efficient approach to balancing image recognition accuracy against hardware implementation complexity. Furthermore, the proposed nonmasking-based RC system exhibits shorter training time and stronger noise robustness in computationally intensive image recognition tasks, which makes it an effective candidate for resource-limited computing platforms such as IoT devices.

Image recognition based on deep neural networks has far surpassed traditional machine learning algorithms in terms of accuracy and efficiency. To facilitate the hardware implementation of deep neural networks, in future research we will further exploit the nonlinear characteristics of novel materials or devices to simplify the structure of representative deep neural networks. Doing so may pave the way for in-memory computing.

Data Availability

The datasets used are openly available from http://yann.lecun.com/exdb/mnist/, https://github.com/zalandoresearch/fashion-mnist, https://www.cs.toronto.edu/~kriz/cifar.html, and http://ufldl.stanford.edu/housenumbers/. The data used to support the findings of this study are available from the corresponding author on reasonable request.

References

Aguirre, F.L., Pazos, S.M., Palumbo, F., Suñé, J., Miranda, E.: Application of the quasi-static memdiode model in cross-point arrays for large dataset pattern recognition. IEEE Access 8, 202174–202193 (2020)
Antonik, P.: Application of FPGA to Real-Time Machine Learning: Hardware Reservoir Computers and Software Image Processing. Springer International Publishing, Cham (2018)
Appeltant, L., Van der Sande, G., Danckaert, J., Fischer, I.: Constructing optimized binary masks for reservoir computing with delay systems. Sci. Rep. 4(1), 3629 (2014)
Appeltant, L., Soriano, M.C., Van der Sande, G., Danckaert, J., Massar, S., Dambre, J., Schrauwen, B., Mirasso, C.R., Fischer, I.: Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011)
Bai, K., An, Q., Liu, L., Yi, Y.: A training-efficient hybrid-structured deep neural network with reconfigurable memristive synapses. IEEE Trans. Very Large Scale Integr. Syst. 28, 62–75 (2020)
Bai, K., An, Q., Yi, Y.: Deep-DFR: A memristive deep delayed feedback reservoir computing system with hybrid neural network topology. In: Proceedings of the 56th Annual Design Automation Conference 2019, pp. 1–6. Las Vegas, NV, USA (2019)
Bao, Y., Song, K., Liu, J., Wang, Y., Yan, Y., Yu, H., Li, X.: Triplet-graph reasoning network for few-shot metal generic surface defect segmentation. IEEE Trans. Instrum. Meas. 70, 1–11 (2021)
Cao, J., Zhang, X., Cheng, H., Qiu, J., Liu, X., Wang, M., Liu, Q.: Emerging dynamic memristors for neuromorphic reservoir computing. Nanoscale 14, 289–298 (2022)
Chen, Y., Yi, L., Ke, J., Yang, Z., Yang, Y., Huang, L., Zhuge, Q., Hu, W.: Reservoir computing system with double optoelectronic feedback loops. Opt. Express 27, 27431–27440 (2019)
Du, C., Cai, F., Zidan, M.A., Ma, W., Lee, S.H., Lu, W.D.: Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8, 2204 (2017)
Díaz Ledezma, F., Haddadin, S.: Machine learning-driven self-discovery of the robot body morphology. Sci. Robot. 8(85), eadh0972 (2023)
Gonzalez-Zapata, A.M., de la Fraga, L.G., Ovilla-Martinez, B., Tlelo-Cuautle, E., Cruz-Vega, I.: Enhanced FPGA implementation of Echo state networks for chaotic time series prediction. Integration 92, 48–57 (2023)
Grollier, J., Querlioz, D., Camsari, K., Everschor-Sitte, K., Fukami, S., Stiles, M.D.: Neuromorphic spintronics. Nat. Electron. 3(7), 360–370 (2020)
Huang, Y., Zhou, P., Yang, Y., Chen, T., Li, N.: Time-delayed reservoir computing based on a two-element phased laser array for image identification. IEEE Photonics J. 13, 1–9 (2021)
Humayun, A.I., Balestriero, R., Balakrishnan, G., Baraniuk, R.G.: Splinecam: exact visualization and characterization of deep network geometry and decision boundaries. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3789–3798. Vancouver, BC, Canada (2023)
Jaeger, H.: The Echo State Approach to Analysing and Training Recurrent Neural Networks-with an Erratum Note, vol. 148, p. 13. German National Research Center for Information Technology, Bonn, Germany (2001)
Jiang, H., Gao, M., Li, H., **, R., Miao, H., Liu, J.: Multi-learner based deep meta-learning for few-shot medical image classification. IEEE J. Biomed. Health Inform. 27(1), 17–28 (2022)
Kuriki, Y., Nakayama, J., Takano, K., Uchida, A.: Impact of input mask signals on delay-based photonic reservoir computing with semiconductor lasers. Opt. Express 26(5), 5777–5788 (2018)
Larger, L., Soriano, M.C., Brunner, D., Appeltant, L., Gutierrez, J.M., Pesquera, L., Mirasso, C.R., Fischer, I.: Photonic information processing beyond turing: an optoelectronic implementation of reservoir computing. Opt. Express 20, 3241–3249 (2012)
Li, J., Cai, Q., Li, P., Yang, Y., Alan Shore, K., Wang, Y.: Image recognition based on optical reservoir computing. Chaos Interdiscip. J. Nonlinear Sci. 32, 123106 (2022)
Liu, K., Dang, B., Zhang, T., Yang, Z., Bao, L., Xu, L., Cheng, C., Huang, R., Yang, Y.: Multilayer reservoir computing based on ferroelectric $\alpha $-In2Se3 for hierarchical information processing. Adv. Mater. 34(48), 2108826 (2022)
Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14, 2531–2560 (2002)
Moon, J., Ma, W., Shin, J.H., Cai, F., Du, C., Lee, S.H., Lu, W.D.: Temporal data classification and forecasting using a memristor-based reservoir computing system. Nat. Electron. 2(10), 480–487 (2019)
Nakajima, K., Fischer, I. (eds.): Reservoir Computing: Theory, Physical Implementations, and Applications. Natural Computing Series. Springer, Singapore (2021)
Nakajima, K., Hauser, H., Kang, R., Guglielmino, E., Caldwell, D., Pfeifer, R.: A soft body as a reservoir: case studies in a dynamic model of octopus-inspired soft robotic arm. Front. Comput. Neurosci. 7, 91 (2013)
Nishioka, D., Tsuchiya, T., Namiki, W., Takayanagi, M., Imura, M., Koide, Y., Higuchi, T., Terabe, K.: Edge-of-chaos learning achieved by ion-electron-coupled dynamics in an ion-gating reservoir. Sci. Adv. 8, eade1156 (2022)
Schaetti, N., Salomon, M., Couturier, R.: Echo state networks-based reservoir computing for MNIST handwritten digits recognition. In: 2016 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC) and 15th International Symposium on Distributed Computing and Applications for Business Engineering (DCABES), pp. 484–491. IEEE, Paris, France (2016)
Schmarje, L., Santarossa, M., Schröder, S.M., Koch, R.: A survey on semi-, self-and unsupervised learning for image classification. IEEE Access 9, 82146–82168 (2021)
Shehab, M., Al-Ayyoub, M., Jararweh, Y., Jarrah, M.: Accelerating compute-intensive image segmentation algorithms using GPUs. J. Supercomput. 73, 1929–1951 (2017)
Stelzer, F., Röhm, A., Vicente, R., Fischer, I., Yanchuk, S.: Deep neural networks using a single neuron: folded-in-time architecture using feedback-modulated delay loops. Nat. Commun. 12, 5164 (2021)
Tanaka, G., Nakane, R.: Simulation platform for pattern recognition based on reservoir computing with memristor networks. Sci. Rep. 12(1), 9868 (2022)
Tanaka, G., Yamane, T., Héroux, J.B., Nakane, R., Kanazawa, N., Takeda, S., Numata, H., Nakano, D., Hirose, A.: Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123 (2019)
Tran, S.D., Teuscher, C.: Hierarchical memcapacitive reservoir computing architecture. In: 2019 IEEE International Conference on Rebooting Computing (ICRC), pp. 1–6. IEEE, San Mateo, CA, USA (2019)
Usami, Y., van de Ven, B., Mathew, D.G., Chen, T., Kotooka, T., Kawashima, Y., Tanaka, Y., Otsuka, Y., Ohoyama, H., Tamukoh, H., Tanaka, H., van der Wiel, W.G., Matsumoto, T.: In-materio reservoir computing in a sulfonated polyaniline network. Adv. Mater. 33, 2102688 (2021)
Vandoorne, K., Mechet, P., Van Vaerenbergh, T., Fiers, M., Morthier, G., Verstraeten, D., Schrauwen, B., Dambre, J., Bienstman, P.: Experimental demonstration of reservoir computing on a silicon photonics chip. Nat. Commun. 5, 3541 (2014)
Verstraeten, D., Schrauwen, B., D’Haene, M., Stroobandt, D.: An experimental unification of reservoir computing methods. Neural Netw. 20, 391–403 (2007)
Wang, S., Chen, H., Zhang, W., Li, Y., Wang, D., Shi, S., Zhao, Y., Loong, K.C., Chen, X., Dong, Y., Zhang, Y., Jiang, Y., Furqan, C., Chen, J., Wang, Q., Xu, X., Wang, G., Yu, H., Shang, D., Wang, Z.: Convolutional Echo-state network with random memristors for spatiotemporal signal classification. Adv. Intell. Syst. 4, 2200027 (2022)
You, M., Li, F., **, J., Wang, G., Du, B.: Multilayer time delay reservoir with double feedback loops for time series forecasting task. Appl. Soft Comput. 138, 110179 (2023)
Yue, D., Hou, Y., Hu, C., Zang, C., Kou, Y.: Handwritten digits recognition based on a parallel optoelectronic time-delay reservoir computing system. Photonics 10, 236 (2023)
Zhang, G., Qin, J., Zhang, Y., Gong, G., **ong, Z.Y., Ma, X., Lv, Z., Zhou, Y., Han, S.T.: Functional materials for memristor-based reservoir computing: dynamics and applications. Adv. Funct. Mater. 33, 2302929 (2023)
Zhong, Y., Tang, J., Li, X., Gao, B., Qian, H., Wu, H.: Dynamic memristor-based reservoir computing for high-efficiency temporal signal processing. Nat. Commun. 12(1), 408 (2021)


Funding

This work was supported by the National Natural Science Foundation of China under Grant No. 61901304.

Author information

Authors and Affiliations

Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, 529020, China
**aona Wu, Zhuosheng Lin, **gliang Deng, Jia Li & Yue Feng
Institute of Carbon Peaking and Carbon Neutralization, Wuyi University, Jiangmen, 529020, China
Zhuosheng Lin


Corresponding author

Correspondence to Zhuosheng Lin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Wu, X., Lin, Z., Deng, J. et al. Nonmasking-based reservoir computing with a single dynamic memristor for image recognition. Nonlinear Dyn 112, 6663–6678 (2024). https://doi.org/10.1007/s11071-024-09338-9


Received: 17 September 2023
Accepted: 17 January 2024
Published: 06 March 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11071-024-09338-9



1 Introduction