
1 Introduction

With the number of Web services growing dramatically, different service providers offer many Web services with the same or similar functions. Recommending suitable services to users from a large number of candidates has become an urgent task [1]. A service has two types of properties: functional properties and non-functional properties; the latter are also known as quality-of-service (QoS). How to improve the prediction accuracy of QoS values, and thereby provide high-quality service recommendation, has become a research focus [2].

In recent years, many researchers have applied collaborative filtering (CF) to QoS prediction and obtained good prediction accuracy [2]. A crucial task of CF-based methods is to identify neighbors, since prediction accuracy largely relies on the quality of the identified neighbors. To improve this quality, some existing works focus on improving similarity computation. However, under high data sparsity, the number of neighbors of a user is quite limited. Moreover, some neighbors did not invoke the target service or are not observed by the target user (we call these invalid neighbors); such neighbors cannot be used to predict missing QoS values and therefore need to be filtered out. This filtering is likely to lead to low prediction accuracy due to the lack of available neighbors.

To address these problems, we propose a two-phase approach for QoS prediction based on the restricted Boltzmann machine (RBM). The first phase predicts missing values for invalid neighbors and further identifies similar neighbors. The second phase computes the final prediction results with a user-based CF method. The experimental results show that our approach makes better use of the identified neighbors and alleviates the data sparsity problem.

In summary, the main contributions of this paper are as follows.

  1.

    It proposes a novel neighbor selection method, which employs an RBM model to predict missing values for invalid neighbors and further filters neighbors using network location and geographical information. The proposed method can select high-quality neighbors.

  2.

    It conducts extensive experiments on a real-world dataset and compares the proposed method to many existing methods. The experimental results demonstrate the effectiveness of our method and its robustness to parameter values.

The rest of this paper is organized as follows. Section 2 discusses the related work. Sections 3 and 4 elaborate the proposed approach, and the experimental results are presented in Sect. 5. Section 6 concludes the whole paper.

2 Related Work

The traditional CF methods can be classified into two categories: neighbor-based methods and model-based methods [2]. Many neighbor-based CF methods have achieved success in recommendation systems [3]. Wu et al. [4] proposed a ratio-based method to compute similarity, which improved the prediction accuracy and could be computed faster than other compared methods.

One difficulty in neighbor-based CF methods is the data sparsity problem [5]. Owing to the high sparsity of data, neighbor-based CF methods cannot accurately identify similar neighbors. In recent years, some studies have tried to solve this problem. Wu et al. [5] proposed a time-aware neighbor-based CF approach with better accuracy at high sparsity. These earlier works focus on similarity computation, whereas this paper also addresses the selection of neighbors.

Moreover, it is difficult for neighbor-based methods to handle large amounts of data. Therefore, researchers have turned to model-based CF methods. Representative model-based approaches include Matrix Factorization (MF) [6] and the restricted Boltzmann machine (RBM) [7]. In recent years, some researchers have exploited the potential of RBMs for extracting features and alleviating the data sparsity problem [8].

To take full advantage of neighbor-based and model-based methods, some researchers have tried to combine the two types of approaches. Inspired by this idea, we propose a novel model that leverages both an RBM-based model and a neighbor-based model to predict missing QoS values.

3 The Whole Framework

We present the proposed whole framework in Fig. 1, which consists of two phases.

Fig. 1. The whole framework

In the first phase, we use the Euclidean distance to compute the similarity among users. The similarity computation result is used to build the initial set of similar neighbors. Next, we employ an RBM model to predict all missing QoS values for invalid neighbors. We then identify fine-grained neighbors from the initial neighbor set based on network location and geographical information. In the second phase, we use a user-based CF model to predict the final results.

4 The Proposed Prediction Approach

4.1 The First-Phase Prediction

Similarity Computation.

In this section, we adopt a mean-centered Euclidean distance for similarity computation.

$$ S_{u,v} = \frac{1}{{1 + \sqrt {\frac{{\sum\limits_{i \in M} {\left( {\left( {q_{u,i} - \bar{q}_{u} } \right) - \left( {q_{v,i} - \bar{q}_{v} } \right)} \right)^{2} } }}{{\left| M \right|}}} }} $$
(1)

where \( S_{u,v} \) is the similarity of user u and user v, \( M = M_{u} { \cap }M_{v} \) is the set of services invoked by both user u and user v, \( q_{u,i} \) is the QoS value of target service i invoked by target user u, and \( q_{v,i} \) is the QoS value of target service i invoked by user v. We add one to the denominator to prevent it from being zero. \( \bar{q}_{u} \) and \( \bar{q}_{v} \) are the average QoS values of user u and user v, respectively.
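Equation (1) can be sketched in a few lines of numpy; the helper name `similarity` and the convention of marking non-invoked services with NaN are illustrative, not part of the paper:

```python
import numpy as np

def similarity(q_u, q_v):
    """Mean-centered Euclidean similarity between two users (Eq. 1).

    q_u, q_v: 1-D arrays of QoS values over the full service set,
    with np.nan marking services a user has not invoked.
    """
    mask = ~np.isnan(q_u) & ~np.isnan(q_v)   # co-invoked services M
    if not mask.any():
        return 0.0                            # no overlap: no evidence
    du = q_u[mask] - np.nanmean(q_u)          # center by each user's mean
    dv = q_v[mask] - np.nanmean(q_v)
    rmse = np.sqrt(np.mean((du - dv) ** 2))
    return 1.0 / (1.0 + rmse)                 # "+1" keeps the value in (0, 1]
```

Two users with identical mean-centered deviations obtain the maximum similarity of 1, and the value decays smoothly as their QoS profiles diverge.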

Neighbors Selection.

After computing the similarities between users, we could directly choose the top K most similar neighbors. However, some neighbors in this set are not applicable, because they may not have invoked the target service, which lowers the number of available neighbors and damages prediction accuracy. To fix this issue, we use the RBM model to predict missing QoS values for those neighbors that have not invoked the target service, so that all neighbors can be used for reliable prediction. Meanwhile, we further filter some neighbors by utilizing network location and geographical information, which improves the quality of the similar neighbors. The final similar-neighbor set N(u) of user u is composed of two subsets: N1(u), the set of neighbors with predicted values, and N2(u), the set of valid neighbors.
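The selection and N1/N2 split described above can be sketched as follows; the function name, the dict-based inputs, and the omission of the location/geography filter are all simplifying assumptions of this sketch:

```python
def split_neighbors(sim_u, invoked, i, top_k):
    """Top-K neighbor selection and N1/N2 split for target service i.

    sim_u:   dict mapping user id -> similarity to the target user;
    invoked: dict mapping user id -> set of services that user invoked.
    """
    # keep the top_k most similar users
    ranked = sorted(sim_u, key=sim_u.get, reverse=True)[:top_k]
    # N1: neighbors missing an observation for i (need an RBM prediction)
    n1 = [v for v in ranked if i not in invoked[v]]
    # N2: valid neighbors with an observed QoS value for i
    n2 = [v for v in ranked if i in invoked[v]]
    return n1, n2
```

Without the RBM completion step, everyone in `n1` would simply be discarded, which is exactly the loss of neighbors the paper aims to avoid.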

Prediction Based on RBM Model.

The next task is to predict the missing QoS values for the users in set N1(u). We use an RBM model for this task. Suppose we have M services, N users, QoS values rounded to integers from 1 to K, and a user who has invoked m services. Each user is treated as a single training case for an RBM. All RBMs share the same number of hidden units H, which represent features, but each RBM has visible softmax units only for the services invoked by its user; given the high sparsity of real service invocations, an RBM may have only a few connections. Let U be a \( K \times m \) observed binary indicator matrix with \( u_{i}^{k} = 1 \) if user u has invoked service i with QoS value k, and 0 otherwise. The energy function of the user-oriented RBM is defined as

$$ \begin{aligned} E(u,h;W,b) & = - \sum\limits_{i = 1}^{M} {\sum\limits_{j = 1}^{H} {\sum\limits_{k = 1}^{K} {W_{i,j}^{k} h_{j} u_{i}^{k} } } } + \sum\limits_{i = 1}^{M} {\log Z_{i} } \\ & \quad - \sum\limits_{i = 1}^{M} {\sum\limits_{k = 1}^{K} {u_{i}^{k} b_{i}^{k} } } - \sum\limits_{j = 1}^{H} {h_{j} b_{j} } \\ \end{aligned} $$
(2)

where \( W_{i,j}^{k} \) is a symmetric interaction parameter between QoS value k of the i-th service and the j-th feature, and \( b_{i}^{k} \) and \( b_{j} \) are the biases of visible unit i (at value k) and hidden unit j, respectively. \( Z_{i} = \sum\limits_{k = 1}^{K} {\exp (b_{i}^{k} + \sum\limits_{j} {h_{j} W_{i,j}^{k} } )} \) is the normalization term that ensures \( \sum\limits_{k = 1}^{K} {P(u_{i}^{k} = 1|h)} = 1 \). According to the conditional multinomial distribution and the conditional Bernoulli distribution, the distributions of services and features are

$$ P(u_{i}^{k} = 1|h) = \frac{{\exp (b_{i}^{k} + \sum\limits_{j = 1}^{H} {h_{j} W_{i,j}^{k} } )}}{{\sum\limits_{l = 1}^{K} {\exp (b_{i}^{l} + \sum\limits_{j = 1}^{H} {h_{j} W_{i,j}^{l} } )} }} $$
(3)
$$ P(h_{j} = 1|u) = \sigma (b_{j} + \sum\limits_{i = 1}^{m} {\sum\limits_{k = 1}^{K} {u_{i}^{k} W_{i,j}^{k} } } ) $$
(4)

where \( \sigma (x) = 1/(1 + \exp ( - x)) \) is the sigmoid activation function. With the conditional distributions in Eqs. (3) and (4), we can train the model directly with the contrastive divergence algorithm [9], in which the update for each parameter is as follows.

$$ \begin{aligned} \Delta W_{i,j}^{k} & = \varepsilon (\left\langle {u_{i}^{k} h_{j} } \right\rangle_{data} - \left\langle {u_{i}^{k} h_{j} } \right\rangle_{rec} ) \\ \Delta b_{i}^{k} & = \varepsilon (\left\langle {u_{i}^{k} } \right\rangle_{data} - \left\langle {u_{i}^{k} } \right\rangle_{rec} ) \\ \Delta b_{j} & = \varepsilon (\left\langle {h_{j} } \right\rangle_{data} - \left\langle {h_{j} } \right\rangle_{rec} ) \\ \end{aligned} $$
(5)

where \( \left\langle \cdot \right\rangle_{data} \) represents an expectation under the data distribution, with the hidden units driven by the observed visible units, and \( \left\langle \cdot \right\rangle_{rec} \) represents the corresponding expectation under the model after reconstruction by the contrastive divergence algorithm; \( \varepsilon \) is the learning rate. The updates for \( W_{i,j}^{k} \) and \( b_{i}^{k} \) are applied only to services that the user has actually invoked.

After the model is trained, the probability that user v invokes service i with QoS value k can be obtained directly from the observed QoS values V of that user. The RBM prediction of the missing QoS value \( \hat{q}_{v,i} \) is:

$$ \hat{q}_{v,i} = \mathop{\arg \max }\limits_{k \in \{ 1, \ldots ,K\} } P(v_{i}^{k} = 1|V) $$
(6)

that is, the predicted QoS value is the level k with the highest conditional probability that user v would receive value k when invoking service i, given the observed QoS values V of that user.
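Equations (2)-(6) can be condensed into a minimal CD-1 training sketch. This is an illustrative numpy implementation under several assumptions (class and method names are invented; hidden probabilities rather than binary samples are used in both phases, which makes the updates deterministic; the per-user connection sharing is modeled with a simple observation mask), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class UserRBM:
    """Sketch of the user-oriented RBM: one softmax visible unit per
    service over K discrete QoS levels, H binary hidden feature units."""

    def __init__(self, n_services, K, H, lr=0.05):
        self.W = 0.01 * rng.standard_normal((n_services, K, H))
        self.b_v = np.zeros((n_services, K))   # visible biases b_i^k
        self.b_h = np.zeros(H)                 # hidden biases b_j
        self.lr = lr

    def hidden_probs(self, U, mask):
        # Eq. (4): only observed services feed the hidden units
        act = np.einsum('ik,ikj->j', U * mask[:, None], self.W) + self.b_h
        return sigmoid(act)

    def visible_probs(self, h):
        # Eq. (3): softmax over the K QoS levels of each service
        act = self.b_v + np.einsum('ikj,j->ik', self.W, h)
        act -= act.max(axis=1, keepdims=True)  # numerical stability
        e = np.exp(act)
        return e / e.sum(axis=1, keepdims=True)

    def cd1_step(self, U, mask):
        h0 = self.hidden_probs(U, mask)        # positive phase
        v1 = self.visible_probs(h0)            # reconstruction
        h1 = self.hidden_probs(v1, mask)
        m = mask[:, None]
        # Eq. (5): contrast data statistics with reconstruction statistics
        self.W += self.lr * (np.einsum('ik,j->ikj', U * m, h0)
                             - np.einsum('ik,j->ikj', v1 * m, h1))
        self.b_v += self.lr * m * (U - v1)
        self.b_h += self.lr * (h0 - h1)

    def predict(self, U, mask):
        # Eq. (6): pick the QoS level with the highest probability
        p = self.visible_probs(self.hidden_probs(U, mask))
        return p.argmax(axis=1) + 1            # levels are 1..K
```

Each user is one training case `U` (a one-hot matrix of rounded QoS levels) with a mask marking the services that user actually invoked; after a few hundred CD-1 steps the softmax reconstructions concentrate on the observed levels, and `predict` fills in the missing services.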

4.2 The Second-Phase Prediction

To improve the final prediction accuracy, we use a user-based CF method in the second prediction phase.

The neighbor set N(u) for target user u, generated in the first phase, has two subsets: the predicted-neighbor set N1(u) and the valid-neighbor set N2(u). The final prediction result is computed as follows.

$$ \hat{q}_{u,i} = \frac{{\sum\limits_{{v_{1} \in N_{1} (u)}} {\hat{q}_{{v_{1} ,i}} \times S_{{u,v_{1} }} } + \sum\limits_{{v_{2} \in N_{2} (u)}} {q_{{v_{2} ,i}} \times S_{{u,v_{2} }} } }}{{\sum\limits_{{v_{1} \in N_{1} (u)}} {S_{{u,v_{1} }} } + \sum\limits_{{v_{2} \in N_{2} (u)}} {S_{{u,v_{2} }} } }} $$
(7)

where \( \hat{q}_{{v_{1} ,i}} \) is the first-phase prediction of target service i for user v1, \( S_{{u,v_{1} }} \) is the similarity between target user u and user v1, \( q_{{v_{2} ,i}} \) is the observed QoS value of target service i invoked by user v2, and \( S_{{u,v_{2} }} \) is the similarity between target user u and user v2.
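Equation (7) is a similarity-weighted average over both neighbor subsets; a direct sketch (function and argument names are illustrative):

```python
import numpy as np

def second_phase_predict(pred_vals, pred_sims, valid_vals, valid_sims):
    """Similarity-weighted average of Eq. (7).

    pred_vals / pred_sims:   first-phase RBM predictions and similarities
                             for neighbors in N1(u);
    valid_vals / valid_sims: observed QoS values and similarities for
                             neighbors in N2(u).
    """
    vals = np.concatenate([pred_vals, valid_vals])
    sims = np.concatenate([pred_sims, valid_sims])
    # weight every neighbor's value by its similarity to the target user
    return float(np.dot(vals, sims) / sims.sum())
```

Note that predicted and observed neighbors are treated uniformly once their values are available; only the source of the value differs.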

5 Experiment and Evaluation

We use the public dataset WSDream to conduct the experiments [10]. This dataset has been widely used by many researchers. WSDream dataset contains 5825 services and 339 users, including two QoS attributes: response time and throughput.

5.1 Evaluation Metric and Parameter Setting

We use the mean absolute error (MAE) and normalized mean absolute error (NMAE) metrics to evaluate the prediction accuracy. MAE and NMAE are computed as

$$ MAE = \frac{1}{N}\sum\limits_{(u,i) \in TestSet} {|q_{u,i} - \hat{q}_{u,i} |} ,\quad NMAE = \frac{MAE}{{(\sum\limits_{(u,i) \in TestSet} {q_{u,i} } )/N}} $$
(8)

where \( q_{u,i} \) represents the real QoS value, \( \hat{q}_{u,i} \) represents the prediction result, and N is the number of values in test set.
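The two metrics of Eq. (8) can be computed in a few lines (the helper name is illustrative):

```python
import numpy as np

def mae_nmae(actual, predicted):
    """MAE and NMAE over the test set (Eq. 8)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mae = np.abs(actual - predicted).mean()
    # NMAE normalizes MAE by the mean of the real QoS values
    nmae = mae / actual.mean()
    return mae, nmae
```

NMAE makes errors comparable across QoS attributes with very different scales, such as response time and throughput.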

5.2 Performance Comparison

In order to reflect the real case of service invocation, we randomly select a part of the data from the original WSDream dataset as the training set, and the remaining data form the test set. In this study, we generate five training sets with different sparsities, where the sparsity degree d is 2.5%, 5%, 10%, 15%, and 20%, respectively.

To better evaluate the performance of the proposed method, we compare it with the following state-of-the-art QoS prediction methods. The experimental results are presented in Table 1. The parameters of the compared methods are set according to the default settings in their original papers.

Table 1. Accuracy comparison (a smaller value means a higher accuracy)
  1.

    RBM (restricted Boltzmann machine) [7]: This method uses the RBM-based CF algorithm to predict missing values.

  2.

    WSRec [10]: A hybrid model composed of user-based CF and item-based CF.

  3.

    LFM (latent factor model) [6]: LFM decomposes the user-service matrix by dimensionality reduction to learn implicit features and produce predictions.

  4.

    CAP (credibility-aware prediction model) [11]: A credibility-aware QoS prediction method that employs a two-phase K-means clustering algorithm.

  5.

    JLMF [12]: JLMF is an MF model based on network location information and influence of neighbors.

  6.

    LE-MF (location-enhanced matrix factorization) [13]: A matrix decomposition model that introduces location information and trust mechanism.

  7.

    U-RBM (user-oriented RBM): The model proposed in this paper.

In Table 1, MAE is the mean absolute error, NMAE is the normalized mean absolute error, and d is the sparsity of the training sets. We can make the following observations.

  1.

    The proposed method U-RBM is superior to the compared methods on both the MAE and NMAE measures.

  2.

    For all training-set densities, our proposed model achieves consistently lower errors. Especially in the case of high data sparsity (e.g., d = 2.5% and 5%), the prediction accuracy of the U-RBM model remains the highest, which indicates that U-RBM better handles the data sparsity problem.

5.3 Sensitivity Analysis of Parameters

Impact of TopKUser.

The parameter TopKUser denotes the number of similar neighbors; we vary its value from 4 to 20. Note that a higher sparsity means less available training data.

As shown in Fig. 2, MAE values initially decrease as TopKUser increases. This is because a larger neighbor set raises the probability of including the truly similar neighbors of a target user. In addition, after the RBM-based completion, users who are highly similar to the target user but share no common invocation records are no longer excluded from the neighbor set, while weakly similar users are filtered out, so the reliability of the similar neighbors is enhanced.

Fig. 2. Impact of TopKUser

6 Conclusions

In this paper, we proposed an approach for QoS prediction based on an RBM model. We first proposed a novel similarity computation method; the RBM was then used to predict missing values for invalid neighbors. We also employed network location and geographical information to further improve the quality of neighbor selection. Extensive experiments conducted on a real-world dataset verified the effectiveness of our models.