
1 Introduction

The fireworks algorithm (FWA), proposed by Tan [1, 2], is inspired by the phenomenon of fireworks explosions. In FWA, fireworks are initialized randomly in the solution space, and sparks are generated by the explosion process of the fireworks. All fireworks and sparks are regarded as candidate solutions, and the explosion process is considered a stochastic search around the fireworks. The original FWA works as follows: N fireworks are initialized randomly in the search space, and their quality (represented by the fitness value in the original FWA) is evaluated to determine the number of sparks and the explosion amplitude of each firework. Afterwards, the fireworks explode and generate sparks within their local spaces. Finally, N candidates are selected from all the fireworks and sparks as the fireworks of the next generation. This workflow continues until the termination criterion is reached.

Since FWA was introduced in [1], it has attracted considerable interest from researchers. FWA has been applied to many real-world optimization problems, including optimizing anti-spam models [3], network reconfiguration [4], the path problem of vehicle congestion [5], swarm robotics [6, 7], modern web information retrieval [8], the single-row facility layout problem [9], etc.

At the same time, many studies have attempted to improve the performance of FWA. Zheng proposes the Enhanced Fireworks Algorithm (EFWA) [10], in which five modifications to the conventional FWA eliminate some disadvantages of the original algorithm. Li proposes GFWA [17], which puts forward a simple and efficient mutation operator called the guiding vector. The Adaptive Fireworks Algorithm (AFWA) [11] replaces the amplitude operator of EFWA with a new adaptive amplitude calculated from the fitness values. Based on EFWA, Zheng proposes the Dynamic Search in Fireworks Algorithm (dynFWA) [12], in which the firework with the smallest fitness value uses a dynamic explosion amplitude strategy. The variants mentioned above improve performance by adjusting the explosion amplitude adaptively. [13] proposes a fireworks algorithm based on a loser-out tournament, which also uses an independent selection operator to select the fireworks of the next generation.

Learning automata (LA) [14] are a kind of machine learning algorithm that can be used as a general-purpose stochastic optimization tool. A learning automaton maintains a state probability vector in which each component represents the probability of selecting the corresponding action. The vector is updated through interactions with a stochastic unknown environment. The automaton tries to find the optimal action among a finite number of actions by constantly applying actions to the environment, which returns a reinforcement signal indicating the relative quality of the selected action. The automaton receives these signals and updates the vector according to its own strategy. When the termination criterion is satisfied, the optimal action has been identified. So far, several PSO algorithms combined with LA have been proposed. Hashemi [15] proposes a PSO variant that uses LA to adaptively select the parameters of PSO. A PSO variant that integrates LA in a noisy environment is proposed by Zhang [16]; it uses the unique selection mechanism of LA to allocate re-evaluations adaptively and reduce the consumption of computing resources.

Since the state probability vector of LA is updated constantly, it accumulates historical information and evaluates the quality of each action. It is therefore more reasonable to apply learning automata to determine the number of sparks of each firework than to use the current fitness value alone. On the other hand, the probability vector converges gradually as the search proceeds, which leads to a strong local search ability in the late search stage.

In this paper, a Learning Automata-based Fireworks Algorithm (LAFWA) is proposed. By applying LA to FWA, each firework obtains a reasonable number of sparks; since sparks are assigned only to promising fireworks, the algorithm gains a strong local search ability and a competitive performance.

The rest of this paper is organized as follows. Section 2 reviews related work on FWA and learning automata. Section 3 proposes the LAFWA. Experimental results on the CEC 2013 benchmark suite are given in Sect. 4 and compared with those of peer algorithms. Conclusions are drawn in Sect. 5.

2 Related Work

2.1 Fireworks Algorithm

This paper is based on GFWA [17], which is introduced in this section. Without loss of generality, a minimization problem is considered as the optimization problem in this paper:

$$\begin{aligned} min \quad {f(x)} \end{aligned}$$
(1)

where x is a vector in the solution space.

Explosion Strategy. GFWA follows the explosion strategy of dynFWA. In GFWA, the number of explosion sparks of each firework is calculated as follows:

$$\begin{aligned} \lambda _{i} = \hat{\lambda }\cdot \frac{\mathop {max}\limits _{j}(f(X_j))-f(X_i)}{\sum \limits _{j}(\mathop {max}\limits _{k}(f(X_k))-f(X_j))}, \end{aligned}$$
(2)

where \(\hat{\lambda }\) is a parameter that controls the total number of explosion sparks. According to this formula, a firework with a smaller fitness value generates more sparks. Secondly, GFWA adopts the dynamic explosion amplitude update strategy of dynFWA for each firework. The explosion amplitude of each firework is calculated as follows:

$$\begin{aligned} A_{i}(t)=\left\{ \begin{aligned}&A_{i}(t-1)\cdot \rho ^+&\mathrm{if}~f(X_i(t))-f(X_i(t-1))<0\\&A_{i}(t-1)\cdot \rho ^-&\mathrm{otherwise}~ \\ \end{aligned} \right. \end{aligned}$$
(3)

where \(A_{i}(t)\) and \(X_i(t)\) represent the explosion amplitude and the position of the i-th firework at generation t, \(\rho ^-\in (0,1)\) is the reduction coefficient, and \(\rho ^+\in (1,+\infty )\) is the amplification coefficient. Sparks are generated uniformly within a hypercube whose radius is the explosion amplitude and whose center is the firework. Algorithm 1 shows how a firework generates sparks, where D is the dimension and \(B_U\) and \(B_L\) are the upper and lower bounds of the search space, respectively.

Algorithm 1. Generating explosion sparks for a firework.
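For concreteness, the following is a minimal Python sketch of the baseline explosion machinery: the spark allocation of Eq. (2) and the spark generation of Algorithm 1. All names are illustrative, and the uniform re-sampling of out-of-bound coordinates is an assumption based on the description above.

```python
import numpy as np

def spark_numbers(fitness, lam_hat):
    """Eq. (2): split the spark budget lam_hat in proportion to how much
    better each firework is than the worst one (minimization). Assumes the
    fitness values are not all equal; fractional results are rounded in practice."""
    gap = np.max(fitness) - np.asarray(fitness, dtype=float)
    return lam_hat * gap / np.sum(gap)

def explode(x, amplitude, n_sparks, b_lo, b_hi, rng=np.random.default_rng()):
    """Algorithm 1: sample sparks uniformly inside the hypercube of radius
    `amplitude` centered at firework `x`; coordinates that escape the search
    space are re-sampled uniformly in [B_L, B_U]."""
    sparks = x + amplitude * rng.uniform(-1.0, 1.0, size=(n_sparks, len(x)))
    out = (sparks < b_lo) | (sparks > b_hi)
    sparks[out] = rng.uniform(b_lo, b_hi, size=np.count_nonzero(out))
    return sparks
```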

Guiding Vector. A mechanism called the guiding vector (GV) is proposed in GFWA. A group of sparks with good quality and another group with bad quality are utilized to build a guiding vector, which guides the firework to move farther. Note that each firework generates only one guiding vector. The GV of the i-th firework, denoted \(\varDelta _{i}\), is calculated from its explosion sparks \(s_{i,j}(1\le j \le \lambda _{i})\) as follows:

$$\begin{aligned} \varDelta _{i} = \frac{1}{\sigma \lambda _{i}}\sum _{j=1}^{\sigma \lambda _{i}}(s_{i,j}-s_{i,\lambda _{i}-j+1}) \end{aligned}$$
(4)

where \(\sigma \) is a parameter that controls the proportion of explosion sparks adopted and \(s_{i,j}\) is the spark of the i-th firework with the j-th smallest fitness value. A guiding spark (\(GS_{i}\)) is generated by adding the GV to the i-th firework, as shown in (5).

$$\begin{aligned} GS_{i} = X_{i} + \varDelta _{i} \end{aligned}$$
(5)
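A minimal sketch of Eqs. (4) and (5), assuming numpy and that the explosion sparks are supplied together with their fitness values (names are illustrative):

```python
import numpy as np

def guiding_spark(x, sparks, fitness, sigma):
    """Eqs. (4)-(5): the guiding vector is the mean of the sigma*lambda best
    sparks minus the mean of the sigma*lambda worst ones (by fitness), and
    the guiding spark is the firework shifted by that vector."""
    s = sparks[np.argsort(fitness)]           # best sparks first (minimization)
    k = max(1, int(sigma * len(sparks)))      # size of each group
    gv = s[:k].mean(axis=0) - s[-k:].mean(axis=0)
    return x + gv
```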

The main process of GFWA is described in Algorithm 2.

Algorithm 2. The main process of GFWA.

2.2 Learning Automata

An LA with a variable structure can be represented as a quadruple \(\{\alpha , \beta , P, T\}\), where \(\alpha =\{\alpha _1,\alpha _2,\dots ,\alpha _r\}\) is a set of actions; \(\beta =\{\beta _1,\beta _2,\dots ,\beta _s\}\) is a set of inputs; \({P}=\{p_1, p_2, \dots ,p_r\}\) is the state probability vector of the actions; and T is a pursuit scheme that updates the state probability vector, \({P}(t+1)=T(\alpha (t), \beta (t) ,{P}(t))\). The most popular pursuit scheme, DP\(_{RI}\), is proposed in [18, 19]; it increases the state probability of the estimated optimal action and decreases the others. The pursuit scheme can be described as follows:

$$\begin{aligned} p_{w}(t+1) = max(0, p_{w}(t)-\varDelta ) \qquad w\ne i \end{aligned}$$
(6)
$$\begin{aligned} p_{i}(t+1)=\sum _{w\ne i}{(p_{w}(t)-p_{w}(t+1))}+p_{i}(t) \end{aligned}$$
(7)

where the optimal action is the i-th action. Another well-known pursuit scheme, DGPA, is proposed in [20]; it increases the state probabilities of the actions with higher reward estimates than the currently chosen action and decreases the others. It can be described as follows:

$$\begin{aligned} p_{w}(t+1) = max(0, p_{w}(t)-\varDelta ) \qquad p_{w}< p_{i} \end{aligned}$$
(8)
$$\begin{aligned} p_{w}(t+1)=\frac{\sum _{j:\,p_{j}<p_{i}}{(p_{j}(t)-p_{j}(t+1))}}{k}+p_{w}(t) \qquad p_{w}\ge p_{i} \end{aligned}$$
(9)

where i denotes the action selected this time and k is the number of actions whose probability is not less than that of the i-th action. Zhang [21] proposes another pursuit scheme, Last-position Elimination-based Learning Automata (LELA), inspired by a reverse philosophy. Let Z(t) be the set of actions whose probability is not zero at time t. LELA decreases the state probability of the estimated worst action in Z(t) and increases those of the others in Z(t). It can be described as follows:

$$\begin{aligned} p_{w}(t+1) = max(0, p_{w}(t)-\varDelta ) \end{aligned}$$
(10)
$$\begin{aligned} p_{i}(t+1)=\frac{{p_{w}(t)-p_{w}(t+1)}}{||Z(t)||}+p_{i}(t) \qquad \forall i\ne w,\, i\in Z(t) \end{aligned}$$
(11)
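For concreteness, here is a minimal Python sketch of two of these pursuit updates, DP\(_{RI}\) (Eqs. (6)-(7)) and LELA (Eqs. (10)-(11)), assuming numpy and 0-indexed actions; all names are illustrative, and Eq. (11) is implemented literally, with the removed mass divided by ||Z(t)||.

```python
import numpy as np

def dp_ri(p, best, delta):
    """DP_RI, Eqs. (6)-(7): punish every action except the estimated best one
    and give the removed probability mass to the best action."""
    p = p.copy()
    others = np.arange(len(p)) != best
    punished = np.maximum(0.0, p[others] - delta)   # Eq. (6)
    p[best] += np.sum(p[others] - punished)         # Eq. (7): conserve total mass
    p[others] = punished
    return p

def lela(p, worst, delta):
    """LELA, Eqs. (10)-(11): punish the estimated worst action in Z(t) and
    share the removed mass among the other members of Z(t)."""
    p = p.copy()
    alive = p > 0.0                                 # Z(t): nonzero-probability actions
    removed = p[worst] - max(0.0, p[worst] - delta) # Eq. (10)
    p[worst] -= removed
    receivers = alive.copy()
    receivers[worst] = False                        # every i != w in Z(t)
    p[receivers] += removed / np.count_nonzero(alive)  # Eq. (11)
    return p
```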

3 Learning Automata-Based Fireworks Algorithm

3.1 m-DP\(_{RI}\)

In this paper, we modify DP\(_{RI}\) to make it more suitable for our algorithm. The classic DP\(_{RI}\) rewards the estimated optimal action and punishes all the others. However, this pursuit scheme leads to a fast convergence of the state probability vector, which is harmful to the global search ability in the early search stage. In m-DP\(_{RI}\), the m best actions are rewarded instead of only the single best one, and m decreases linearly as the search progresses to gradually enhance the local search ability. The update strategy can be expressed as follows:

$$\begin{aligned} p_{w}(t+1) = max(0, p_{w}(t)-\varDelta ) \qquad w \in [m+1,n] \end{aligned}$$
(12)
$$\begin{aligned} p_{w}(t+1)=\frac{\sum _{i=m+1}^{n}{(p_{i}(t)-p_{i}(t+1))}}{m}+p_{w}(t) \qquad w \in [1,m] \end{aligned}$$
(13)
$$\begin{aligned} m=m-1 \qquad \mathrm{if} \quad g \bmod \frac{MG}{M}=0 \end{aligned}$$
(14)

where \(\varDelta \) is the step size, g is the current generation number, MG is the maximum number of generations allowed, and M is the initial value of m. After each update, the state probability vector is sorted to determine the m actions to be rewarded next time.
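A minimal sketch of the m-DP\(_{RI}\) update, assuming numpy; instead of sorting the probability vector, the sketch receives the indices of the m best-ranked actions directly, which is an equivalent reading of the sort step (names are illustrative). Per Eq. (14), the caller decrements m every MG/M generations.

```python
import numpy as np

def m_dp_ri(p, best, delta):
    """Eqs. (12)-(13): the m best-ranked actions (indices in `best`) share
    the probability mass removed from all remaining actions."""
    p = p.copy()
    rewarded = np.zeros(len(p), dtype=bool)
    rewarded[best] = True
    punished = np.maximum(0.0, p[~rewarded] - delta)   # Eq. (12)
    freed = np.sum(p[~rewarded] - punished)
    p[~rewarded] = punished
    p[rewarded] += freed / len(best)                   # Eq. (13)
    return p
```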

3.2 Assigning Sparks

In this paper, LA is applied to assign sparks to fireworks according to the state probability vector: a firework with a larger probability generates more sparks. As the probability vector converges during the search, the promising fireworks generate most of the sparks in the late search stage, so the algorithm has a strong local search ability. Based on the state probability vector p, n probability intervals P are calculated by (15) to assign the sparks. Algorithm 3 shows how LAFWA assigns sparks using the probability intervals P.

$$\begin{aligned} P_{i}=\left[ \sum _{j=1}^{i-1}p_{j},\; \sum _{j=1}^{i-1}p_{j} +p_{i}\right] \end{aligned}$$
(15)
Algorithm 3. Assigning sparks to fireworks by the probability intervals P.
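A minimal sketch of Algorithm 3, assuming numpy: each of the \(\lambda \) sparks draws a uniform number and is credited to the firework whose interval \(P_i\) from Eq. (15) contains it, i.e. a roulette-wheel selection (names are illustrative).

```python
import numpy as np

def assign_sparks(p, lam, rng=np.random.default_rng()):
    """Count how many of the lam sparks fall into each firework's
    probability interval P_i from Eq. (15)."""
    cum = np.cumsum(p)                          # right endpoints of the intervals
    u = rng.uniform(0.0, cum[-1], size=lam)     # one uniform draw per spark
    idx = np.searchsorted(cum, u, side='right') # interval containing each draw
    return np.bincount(idx, minlength=len(p))   # sparks assigned per firework
```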

3.3 Learning Automata-Based Fireworks Algorithm

The procedure of LAFWA is given as the pseudo code shown in Algorithm 4 and explained as follows (a condensed code sketch follows Algorithm 4):

  • Step 1 Initialization: Generate the positions of the n fireworks randomly. Initialize the state probability vector p of assigning the sparks evenly, as well as the step size \(\varDelta \), where \(p_i\) represents the probability that a spark is assigned to the i-th firework and \(\varDelta \) is the amount by which a component of p decreases or increases.

  • Step 2 Assign Sparks: Each of the \(\lambda \) sparks is assigned to one of the n fireworks according to Algorithm 3; a firework with a greater probability generates more sparks.

  • Step 3 Perform Explosion: For each firework, the explosion amplitude is calculated by (3). Sparks are generated uniformly within a hypercube whose radius is the explosion amplitude and whose center is the firework, following Algorithm 1.

  • Step 4 Generate Guiding Sparks: Generate the guiding spark by (4) and (5).

  • Step 5 Select Fireworks: Evaluate the fitness values of the sparks and guiding sparks. For each firework, select the best individual among its sparks, its guiding spark, and the firework itself as the new firework.

  • Step 6 Update Probability: Update p according to (12) and (13) and sort p.

  • Step 7 Decrease Linearly: Decrement m linearly by performing (14).

  • Step 8 Terminal Condition Check: If any of the pre-defined termination criteria is satisfied, the algorithm terminates. Otherwise, repeat from Step 2.

Algorithm 4. The main process of LAFWA.
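Putting the pieces together, the following is a condensed sketch of Algorithm 4 (Steps 1-8), reusing the helper sketches above (assign_sparks, explode, guiding_spark, m_dp_ri). The spark budget \(\lambda \), \(\sigma \), the generation limit, the initial amplitude, and the mapping of the sorted probability vector to the m currently best fireworks are our illustrative assumptions, not values fixed by the paper.

```python
import numpy as np

def lafwa(f, dim, n=10, lam=100, m0=4, delta=0.01, sigma=0.2,
          rho_minus=0.9, rho_plus=1.2, bounds=(-100.0, 100.0),
          max_gen=1000, rng=np.random.default_rng()):
    b_lo, b_hi = bounds
    X = rng.uniform(b_lo, b_hi, size=(n, dim))        # Step 1: random fireworks
    fit = np.array([f(x) for x in X])
    p = np.full(n, 1.0 / n)                           # uniform probability vector
    A = np.full(n, b_hi - b_lo)                       # assumed initial amplitudes
    m = m0
    for g in range(1, max_gen + 1):
        counts = assign_sparks(p, lam, rng)           # Step 2
        for i in range(n):
            if counts[i] == 0:
                continue
            sparks = explode(X[i], A[i], counts[i], b_lo, b_hi, rng)  # Step 3
            sf = np.array([f(s) for s in sparks])
            gs = guiding_spark(X[i], sparks, sf, sigma)               # Step 4
            cand = np.vstack([sparks, [gs], [X[i]]])                  # Step 5
            cf = np.concatenate([sf, [f(gs)], [fit[i]]])
            j = np.argmin(cf)
            A[i] *= rho_plus if cf[j] < fit[i] else rho_minus         # Eq. (3)
            X[i], fit[i] = cand[j], cf[j]
        p = m_dp_ri(p, np.argsort(fit)[:m], delta)    # Step 6: reward the m best
        if m > 1 and g % (max_gen // m0) == 0:        # Step 7: Eq. (14)
            m -= 1
    return X[np.argmin(fit)], fit.min()               # Step 8: budget exhausted
```

Under these assumptions, `lafwa(lambda x: float(np.sum(x**2)), dim=30)` would minimize the sphere function with the paper's settings n = 10, \(\varDelta \) = 0.01, and M = 4.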
Table 1. 28 test functions

4 Experimental Results and Comparisons

In this section, experiments are carried out to illustrate the advantages of LAFWA in comparison with four pioneering FWA variants.

4.1 Benchmark and Experimental Settings

The parameter settings of LAFWA are given in the following. The main parameters include:

  • n: The number of fireworks.

  • \(\lambda \): The total number of sparks.

  • \(\rho ^-\) and \(\rho ^+\): The reduction and amplification factors.

  • \(\varDelta \): The step size of LA.

  • M: The initial value of m.

For a fixed total number of sparks, a larger n can explore more but generates fewer sparks per firework. In the proposed LAFWA, we set n = 10 to obtain a good global search ability in the early stage. The reduction and amplification factors \(\rho ^-\) and \(\rho ^+\) are two important parameters for the dynamic search; we set them to 0.9 and 1.2, respectively, according to [12]. \(\varDelta \) and M are set to 0.01 and 4 according to our experiments.

The experimental results are evaluated on the CEC 2013 single-objective optimization benchmark suite [22], which includes 5 unimodal functions and 23 multimodal functions (shown in Table 1). Standard settings, which have been widely used for testing algorithms, are adopted for parameters such as the dimensionality and the maximum number of function evaluations. The search ranges of all 28 test functions are set to [−100, 100]\(^D \) and D is set to 30. Following the suggestions of the benchmark suite, every algorithm is run 51 times on each function, and the maximal number of function evaluations in each run is 10000*D. All experiments are carried out using MATLAB R2016a on a PC with an Intel(R) Core(TM) i5-8400 running at 2.80 GHz with 8 GB RAM.

4.2 Experimental Results and Comparison

To validate the effectiveness of LAFWA, we compare it with four pioneering FWA variants: AFWA, dynFWA, COFFWA [24], and GFWA. The parameters of these four algorithms are set to the values suggested in their published papers. The solution accuracies are listed in Table 2, where boldface indicates the best result among all listed algorithms, "Mean" is the mean result of 51 independent runs, and "AR" is the average ranking of an algorithm, calculated as the sum of its rankings on the 28 functions divided by the number of functions; the smaller the AR, the better the performance. LAFWA shows outstanding convergence accuracy among all the listed FWA variants. On the unimodal part of Table 2, LAFWA performs well, ranking first on 3 of the 5 functions. On the more challenging multimodal and composition functions, where the global optimum is harder to locate, LAFWA shows its superiority, ranking first on 15 and second on 6 of the 23 functions. Overall, the AR of LAFWA is 1.5, which ranks first among all competitors.

Table 2. LAFWA accuracy compared with other FWAs.

5 Conclusion

In this work, the Learning Automata-based Fireworks Algorithm (LAFWA) is proposed to assign sparks to the fireworks more reasonably by using LA. The state probability vector is initialized uniformly, so the global search ability is strong in the early search stage; as the search proceeds, the probability vector converges, which leads to a strong local search ability in the late search stage. Experimental results on the CEC 2013 benchmark functions show that LAFWA outperforms several pioneering FWA variants. Future work will focus on improving the update strategy of the state probability vector of LA.