
1 Introduction

In an era of growing demand for personalized recommendations, serendipity has become an important metric for meeting that demand. Serendipitous recommender systems have been investigated and developed to generate such results for their users, and can now be found in applications such as music recommendation [21].

However, as a user-centric concept, serendipity has been understood narrowly within the Recommender System field, where previous research has defined it as “receiving an unexpected and fortuitous item recommendation” [20]. This gap in the user-centered understanding of serendipity persisted for some time. Recently, awareness of this gap led to a conceptual bridge that introduced serendipity from Information Research into Recommender Systems by proposing an Information Theory-based algorithm [36]. To investigate this algorithm further, it needs to be implemented as an end-to-end recommender system, but doing so is difficult.

The challenges of transferring this conceptual bridge into a real-world implementation are two-fold. First, it is demanding to inject the understanding appropriately, since building such a run-time system may force the implementation to deviate from the algorithm design. Second, even if the implementation can recommend serendipitous information, it is demanding to ensure an overall enhanced user experience. For example, slow system response times may compromise the user’s experience, since serendipity is a very sensitive feeling.

Thus, it is important that serendipitous systems are designed with an accurate understanding of the concept while delivering a high level of performance. Hence, we present CHESTNUT, a state-of-the-art memory-based movie recommender system that improves serendipity performance. Whereas prior research has produced many serendipitous frameworks, it has focused on applying algorithmic techniques rather than transferring a basic understanding of serendipity into system development (Sect. 3).

We have addressed the issues of developing serendipitous systems by following a user-centered understanding of serendipity (Sect. 3) and focusing on runtime failures while making predictions (Sect. 5). Furthermore, we have optimized CHESTNUT by revisiting and statistically updating significance weighting to ensure a high level of system performance.

More specifically, we have made three main contributions here:

  (1)

    CHESTNUT Movie Recommender System. CHESTNUT applies an Information Theory-based algorithm, which aims to combine three key metrics based on a user-centered understanding of serendipity: insight, unexpectedness and value [36]. With regard to these metrics, CHESTNUT has three key functional units, respectively: 1) cInsight performs “making connections” to expand a target user’s profile and collect all target-user-related items (Sect. 3.1); 2) cUnexpectedness filters out all expected items from the target-user-related items, with the help of a primitive prediction model (Sect. 3.2); and 3) cUsefulness evaluates the potential value of the candidate items through prediction and generates a list of recommendations sorted from high to low (Sect. 3.3). In addition, while developing CHESTNUT we revealed key implementation details (Sect. 4). The source code of CHESTNUT is available online at https://github.com/unnc-ucc/CHESTNUT.

  (2)

    Optimizations of CHESTNUT. Through system development, we observed that implementations following conventional methods could cause runtime failures in CHESTNUT. We have formulated this problem (Sect. 5.1) and optimized CHESTNUT in two ways: first, we adjusted the conventional designs for generating predictions in memory-based collaborative filtering techniques (Sect. 5.2); second, we revisited a conventional optimization method, significance weighting, and updated it based on statistical analysis to further improve the performance and effectiveness of CHESTNUT (Sect. 5.3).

  (3)

    Qualitative Evaluation of CHESTNUT. We conducted an experimental study to assess the performance of CHESTNUT, in both its bare-metal and optimized versions (Sect. 6). We have also benchmarked CHESTNUT against two mainstream memory-based collaborative filtering techniques, namely item-based collaborative filtering and K-Nearest-Neighbour user-based collaborative filtering from Apache Mahout. The results show that CHESTNUT is fast, scalable and extremely serendipitous.

2 Background

CHESTNUT is built on a series of works which aimed to understand serendipity, to quantify it in many use cases and to introduce this understanding into Recommender Systems (illustrated in detail below). We have also drawn inspiration from the implementation and optimization of memory-based collaborative filtering techniques to enhance system performance [7,8,9, 27].

Within the Recommender System field, serendipity has been understood as “receiving an unexpected and fortuitous item recommendation” [20]. Many efforts have been made in the development and investigation of serendipitous recommender systems [1,2,3, 5, 6, 10,11,12, 14, 15, 23, 24, 28, 29, 31,32,33]. Until recently, the main focus of such development has centered on the algorithmic techniques being deployed; however, no existing system aims to deliver an optimal serendipitous user experience by applying a user-centered approach to the development of serendipitous recommender systems.

Unlike accuracy and other metrics, serendipity, as a user-centric concept, is ill-suited to such a narrow view. Understanding serendipity has raised considerable interest and has long been investigated in multiple disciplines [18, 19, 25, 30]. For instance, a number of theoretical models have been established to study the concept [16, 17, 26]. More recently, research has highlighted “making connections” as an important point for engineering serendipity [13]. Based on these outcomes from Information Research, an Information Theory-based algorithm has been proposed to better capture serendipity in Recommender Systems [36]. Furthermore, a systematic context-based study among Chinese scholars has been conducted and demonstrates the effectiveness of the proposed algorithm [35].

This proposed conceptual bridge, based on a more comprehensive understanding of serendipity that merges insight, unexpectedness and usefulness, has been partly developed and studied in a movie scenario with early tryouts [34]. To bring these aspects together, the system is expected to work sequentially in three steps: it first expands the user’s profile by “making connections”; it then filters out expected items, according to the expanded profile and the original one; finally, it predicts ratings to estimate the value of the remaining items to the target user and makes appropriate recommendations.

However, it has remained unclear how the proposed algorithm could be developed into an end-to-end recommender system that is practical, effective and suitable to deploy in a real-world scenario. Based on previous investigations, we have implemented CHESTNUT in a movie recommendation scenario. Below, we present a comprehensive overview of the three major components that ensure and balance the three given metrics: insight, unexpectedness and usefulness (Sect. 3). In addition, we present the implementation details (Sect. 4) and the optimization choices made during the development of CHESTNUT, which aim to improve its reliability and practicality in the real world (Sect. 5).

3 CHESTNUT Overview

Before explaining the details of the implementation, we introduce the three major functional units of CHESTNUT, which were developed with due consideration of the three metrics of serendipity mentioned above: cInsight, cUnexpectedness and cUsefulness. These units operate sequentially, each ensuring its corresponding metric.

3.1 cInsight

The design of cInsight aims to simulate the “making connections” process, a serendipitous design from Information Research, in order to expand the profiles of target users.

The functional process of making connections is as follows. With the users’ profiles uploadedFootnote 1, and according to a referencing attributeFootnote 2, making connections directs target users from their own information towards the users most similar in the selected attributeFootnote 3. This whole process is denoted as a level. Repeating this process, each time starting from the output of the previous level, finally ends with an active user, or a set of active users, once the similarity between the active user and the target user reaches a threshold.

cInsight is not parameter-free: two parameters need to be set in advance. The first is the referencing attribute, the metric used for making connections, which should be related information such as a side-information categoryFootnote 4. The second is the threshold that determines when the repetition ends. As more levels are formed by making connections, the distance between active users and target users grows; the threshold ensures that active users are not too “far” from the target user. Here, the threshold can be a mathematical abstraction of similarityFootnote 5. cInsight performs the making-connections process starting from the target user’s profile. The repetition across multiple levels eventually terminates, forming a path from the target user to the active users. cInsight then re-organizes all active users’ profilesFootnote 6 for further processing. Assuming the referencing attribute is the director of movies, the following example briefly illustrates the making-connections process:

Fig. 1. An example of the connection-making process

For a target user who is to be recommended serendipitous information, cInsight analyzes his or her profile and selects the corresponding information from it as the starting point, depending on which attribute has been selected for referencing.

As Fig. 1 shows, the movie director D1, who received the most movie ratings from User A, is selected as the attribute in this example. Then, according to D1, another user, User B, is selected as a super fan of D1 who contributes the largest number of movie ratings for D1 in the whole movie database. If User A and User B satisfy the defined similarity threshold, then User B is considered the active user for recommending movies to User A. Otherwise, the algorithm continues to find another user, User C, by selecting another director, D2, on the basis of User B’s profile, and so on, until a User Z is found whose similarity to the target User A meets the threshold.
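The level-by-level loop described above can be sketched in Java (the paper’s implementation language). This is a hypothetical, self-contained rendering, not CHESTNUT’s actual code: the profile representation (user → director → number of effective ratings), the helper names and the toy similarity function are all our own assumptions.

```java
import java.util.*;
import java.util.function.BiFunction;

public class CInsightSketch {

    // Pick the director with the most effective ratings in one user's profile.
    static String mostRatedDirector(Map<String, Integer> profile) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : profile.entrySet()) {
            if (e.getValue() > bestCount) { bestCount = e.getValue(); best = e.getKey(); }
        }
        return best;
    }

    // Pick the user (other than `exclude`) with the most effective ratings for `director`.
    static String biggestFan(Map<String, Map<String, Integer>> ratings,
                             String director, String exclude) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Map<String, Integer>> e : ratings.entrySet()) {
            if (e.getKey().equals(exclude)) continue;
            int c = e.getValue().getOrDefault(director, 0);
            if (c > bestCount) { bestCount = c; best = e.getKey(); }
        }
        return best;
    }

    // Repeat "levels" of connection-making until the similarity threshold is met.
    static String makeConnections(Map<String, Map<String, Integer>> ratings,
                                  String target, double threshold, int maxLevels,
                                  BiFunction<String, String, Double> similarity) {
        String current = target;
        for (int level = 0; level < maxLevels; level++) {
            String director = mostRatedDirector(ratings.get(current));
            String next = biggestFan(ratings, director, current);
            if (next == null || next.equals(current)) break;
            if (similarity.apply(target, next) >= threshold) return next; // active user found
            current = next; // start the next level from this user
        }
        return current;
    }
}
```

With the Fig. 1 example in mind, starting from User A the loop would hop A → B → C and stop as soon as the similarity to the target crosses the threshold.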

3.2 cUnexpectedness

After cInsight, all relevant items generated by making connections are passed forward to cUnexpectedness. The design of cUnexpectedness aims to ensure that all remaining items are indeed unexpected by the target user.

The functional process of cUnexpectedness proceeds in two steps. First, it identifies which items a target user would expect, based on a broader view of the results from cInsight. Here, applying the primitive prediction model, cUnexpectedness expands the original target user’s profile into a profile of items the target user would expect. Second, based on the expected items generated in the first step, cUnexpectedness removes their intersection with the items passed from cInsightFootnote 7.

Here, we illustrate how the first step can be abstracted. The expected movie list (EXP) consists of two parts: the movies that could be expected by the user (Eu), and those produced by a primitive prediction model (PM) (e.g. movies rated very highly on average). This is described in Eq. (1).

$$\begin{aligned} EXP = Eu \cup PM \end{aligned}$$
(1)

Through cUnexpectedness, the items from cInsight are confirmed as being unexpected by the target user, which satisfies the unexpectedness metric.
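In code, Eq. (1) and the subsequent filtering reduce to two set operations: a union to build EXP, and a difference to drop expected items from the cInsight candidates. The Java sketch below is illustrative only; the types and method names are our own choices, not CHESTNUT’s API.

```java
import java.util.*;

public class CUnexpectednessSketch {

    // Eq. (1): EXP = Eu ∪ PM, the movies the target user would expect.
    static Set<String> expectedMovies(Set<String> eu, Set<String> pm) {
        Set<String> exp = new HashSet<>(eu);
        exp.addAll(pm);
        return exp;
    }

    // Keep only the candidates from cInsight that are NOT expected.
    static Set<String> filterUnexpected(Set<String> candidates, Set<String> exp) {
        Set<String> unexpected = new HashSet<>(candidates);
        unexpected.removeAll(exp);
        return unexpected;
    }
}
```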

3.3 cUsefulness

Following the guarantees of cInsight and cUnexpectedness, the final unit identifies which items are valuable to target users; cUsefulness has been developed to achieve this goal. To evaluate a potential movie’s value to the target user(s), CHESTNUT generates prediction scores via cUsefulness, which quantifies the value of each unexpected movie by predicting how the target user would rate it.

Since the development plan is collaborative-filtering based, the following conventional prediction equation is used to calculate the movie prediction score in cUsefulness.

$$\begin{aligned} P_{a,i} = \bar{r_{a}} + \frac{\sum _{u\in U}{(r_{u,i}-\bar{r_{u}})\times W_{a,u}}}{\sum _{u\in U}{|W_{a,u}|}} \end{aligned}$$
(2)

In Eq. (2), \(\bar{r_{a}}\) and \(\bar{r_{u}}\) are the average ratings for the user a and user u on all other rated items, and \(W_{a,u}\) is the weight calculated by the similarity values between the user a and user u. The summations are over all the users \(u \in U\) who have rated the item i.
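As a concrete illustration, Eq. (2) can be computed as below. This hypothetical Java sketch is not CHESTNUT’s actual code; it assumes the neighbours’ ratings, means and weights have already been gathered into parallel arrays.

```java
public class PredictionSketch {

    // Eq. (2): P_{a,i} = r̄_a + Σ_u (r_{u,i} − r̄_u) · W_{a,u} / Σ_u |W_{a,u}|.
    // Index u runs over the neighbours who have rated item i.
    static double predict(double meanA, double[] neighbourRatings,
                          double[] neighbourMeans, double[] weights) {
        double numerator = 0.0, denominator = 0.0;
        for (int u = 0; u < weights.length; u++) {
            numerator += (neighbourRatings[u] - neighbourMeans[u]) * weights[u];
            denominator += Math.abs(weights[u]);
        }
        // Fall back to the user's own mean when no weight is available.
        return denominator == 0.0 ? meanA : meanA + numerator / denominator;
    }
}
```

For example, a single neighbour with mean 4.0 who rated the item 5.0, with weight 0.8, lifts a target user whose mean is 3.0 to a prediction of 4.0.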

4 Implementation Details

After giving an overview of CHESTNUT’s architecture and exploring the functionalities of its major components, in this section we introduce some implementation details from the development of CHESTNUT which enhanced its performance and practicality. CHESTNUT was developed in approximately 6,000 lines of Java code.

4.1 Similarity Metrics

During the development of CHESTNUT, the Pearson Correlation Coefficient was selected as the similarity metric; it is described in Eq. (3).

$$\begin{aligned} W_{u,v} = \frac{\sum _{i\in I}{(r_{u,i}-\bar{r_{u}})(r_{v,i}-\bar{r_{v}})}}{\sqrt{\sum _{i\in I}{(r_{u,i}-\bar{r_{u}})^2}}\sqrt{\sum _{i\in I}{(r_{v,i}-\bar{r_{v}})^2}}} \end{aligned}$$
(3)

In Eq. (3), the \(i \in I\) summations are over the items that both users u and v have rated, \(r_{u,i}\) is the rating of u-th user on the i-th item and \(\bar{r_{u}}\) is the average rating of the co-rated items of the u-th user.
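Eq. (3) translates directly into code once the two users’ ratings of their co-rated items are aligned. The following Java sketch is illustrative (our own arrangement, not CHESTNUT’s source); note that, as in the paper, the means are taken over the co-rated items only.

```java
public class PearsonSketch {

    // Eq. (3): Pearson correlation over the items both users have co-rated.
    // ru[i] and rv[i] are the two users' ratings of the i-th co-rated item.
    static double pearson(double[] ru, double[] rv) {
        int n = ru.length;
        double meanU = 0.0, meanV = 0.0;
        for (int i = 0; i < n; i++) { meanU += ru[i]; meanV += rv[i]; }
        meanU /= n; meanV /= n;
        double num = 0.0, du = 0.0, dv = 0.0;
        for (int i = 0; i < n; i++) {
            num += (ru[i] - meanU) * (rv[i] - meanV);
            du  += (ru[i] - meanU) * (ru[i] - meanU);
            dv  += (rv[i] - meanV) * (rv[i] - meanV);
        }
        double den = Math.sqrt(du) * Math.sqrt(dv);
        return den == 0.0 ? 0.0 : num / den;  // guard against zero variance
    }
}
```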

4.2 cInsight

After collecting the target user’s profile, cInsight expands it through the connection-making process, which relies on a referencing attribute from this target user. According to the number of movies the user has rated with respect to this attribute, and the users’ effective ratings, the most related oneFootnote 8 is selected. With this selection, another profile can be generated, covering all users that have rated movies with this referencing attribute. By sorting the number of effective scores on this director across users, the user with the largest count is chosen as the next user. This process is repeated until the similarity between the target user and the selected user reaches a threshold set in advance.

In CHESTNUT, the referencing attribute is set to the director of movies, and effective scores are ratings above 4.0Footnote 9. Moreover, the threshold is set at 0.3Footnote 10. These settings are based on previous cInsight-related studies [34].

4.3 cUnexpectedness

cUnexpectedness preserves the unexpected items by excluding any expected items from all active users’ items. Generating the expected items relies on the primitive prediction model.

In CHESTNUT, through the primitive prediction model, cUnexpectedness expands the target user’s profile in two respects: first, it adds all movies of a film series if any of them appears within the target user’s profile; second, it adds the top popular movies.

Fig. 2 illustrates the workflow for generating the target-user-expected movies. In our implementation, we proceeded as follows: in the first step, cUnexpectedness determines whether a movie belongs to a film series by comparing titles; to speed up this process, we applied a dynamic programming approach. In the second step, we selected the top two hundred movies, because after sorting the rating counts in the whole data set from high to low we observed an obvious break at this number.
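The two-step expansion can be sketched as follows. The paper does not specify which dynamic-programming comparison is used, so this hypothetical Java sketch substitutes the classic edit-distance DP, and the “distance at most a quarter of the longer title” rule for deciding series membership is purely our illustrative assumption.

```java
import java.util.*;

public class PrimitiveModelSketch {

    // Classic dynamic-programming edit distance between two titles.
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int subst = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + subst);
            }
        }
        return d[a.length()][b.length()];
    }

    // Expand the profile with (a) titles close enough to look like the same
    // film series and (b) the top-N most-rated movies.
    static Set<String> expandProfile(Set<String> profile, Collection<String> allTitles,
                                     List<String> titlesByPopularity, int topN) {
        Set<String> expected = new HashSet<>(profile);
        for (String owned : profile) {
            for (String title : allTitles) {
                int limit = Math.max(owned.length(), title.length()) / 4;
                if (editDistance(owned, title) <= limit) expected.add(title);
            }
        }
        expected.addAll(titlesByPopularity.subList(0, Math.min(topN, titlesByPopularity.size())));
        return expected;
    }
}
```

In CHESTNUT, topN would be 200, matching the break observed in the rating counts.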

Fig. 2. Work flow of primitive prediction model

4.4 cUsefulness

cUsefulness is responsible for examining the potential value of all movies that have been filtered by cUnexpectedness. In the very first prototype, cUsefulness functioned in the same way as other memory-based collaborative filtering techniques: exploring the target user’s neighbours, finding the most similar ones and generating predictions according to the method in Sect. 3.3. However, through lightweight tests, we observed that this caused run-time failures, which we discuss in Sect. 5.

4.5 User Interface

For user interaction, a website has been developed as the user interface for CHESTNUT. After logging in, users can view their rated movies, as shown in Fig. 3. For each movie, the interface offers an image of the movie poster, the title, the publication year, the director and the user’s rating.

The follow-up pages, which enable users to view results and give feedback, are organized very similarly. However, when viewing the results, users can additionally access further information via IMDB links (e.g. details or trailers), submit their own ratings, answer the designed questionnaire and leave comments.

Fig. 3. The user interface

5 Optimization

In this section, we introduce some key insights underlying the optimization of CHESTNUT. Through lightweight tests, we found that CHESTNUT could only produce one or two results for almost every user. To improve the system’s overall performance and deployability, we optimized CHESTNUT by reforming the prediction mechanism and applying a new significance weighting. We first formulate the problem, then introduce each optimization in turn.

5.1 Problem Formulation

After breakdown evaluations of each component in CHESTNUT, we found that for every target user in the test set, only two to three items were predicted via cUsefulness, even when the recommendation list size was set to 1,000.

We believe this problem is two-fold. First, memory-based collaborative filtering relies on users’ existing profiles to assist prediction, conducted directly by searching for co-rated items among a user’s neighbours. With CHESTNUT, however, neighbour users are very unlikely to have co-rated items: from our observations, almost every user could not be supported by their top two hundred neighbours.

The second issue is more interesting. Owing to the characteristics of the Pearson Correlation Coefficient, the smaller the intersection between two users, the more likely the similarity value is to be high. In other words, some similarities are not trustworthy, and these indirectly led to CHESTNUT’s runtime failures.

5.2 Mechanism Adjustment

Rather than searching a target user’s neighbours from high similarity to low, cUsefulness applies a greedy approach to ensure the prediction process can proceed. Each time cUsefulness needs to make a prediction, it first selects all users who have rated the item to be predicted. Then, within this group, it cross-checks whether any of them are neighbours. If so, cUsefulness regroups them and ranks them by similarity from high to low. With these settings, cUsefulness proceeds to make predictions for as many items as possible.
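The adjusted selection step can be sketched as follows. This is a hypothetical Java fragment under our own data-structure assumptions (a set of raters per item and a similarity map for the target user’s neighbours), not CHESTNUT’s actual code.

```java
import java.util.*;

public class GreedyPredictionSketch {

    // For one item to be predicted: start from everyone who rated it,
    // keep those who are also neighbours of the target user, and rank
    // the survivors by similarity from high to low.
    static List<String> predictableNeighbours(Set<String> ratersOfItem,
                                              Map<String, Double> neighbourSims) {
        List<String> usable = new ArrayList<>();
        for (String u : ratersOfItem) {
            if (neighbourSims.containsKey(u)) usable.add(u);
        }
        usable.sort((a, b) -> Double.compare(neighbourSims.get(b), neighbourSims.get(a)));
        return usable;
    }
}
```

Because the raters of the item are fixed first, every user in the returned list is guaranteed to have the co-rated item, so the prediction of Eq. (2) can always proceed.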

This mechanism adjustment demonstrated two benefits. First, it improved overall system performance: since prediction is the most time-consuming part of CHESTNUT, the adjustment ensures that the search for predictable neighbours never reaches a dead end. Second, since co-rated items are guaranteed in advance, CHESTNUT cannot suffer runtime failures caused by prediction interruptions.

However, this mechanism intensified the problem formulated previously. Since the computing sample size became smaller, owing to the features of serendipitous recommendation, the reliability of the similarity values inevitably affects the overall recommendation quality.

5.3 Similarity Correction

We are not the first to recognize the necessity of similarity correction. Previous research has identified this kind of issue and offered a solution known as significance weighting [8]: given a threshold, every similarity value backed by fewer co-rated items than the threshold is scaled down by a correction factor, while the remaining similarity values are kept unchanged.

In previous trials, 50 was selected as the significance weighting threshold to optimize the prediction process. However, the existing literature does not explain how this number was obtained; it appears to be a threshold derived from experience. Since this threshold can be quite sensitive to the data set, we decided to analyze it from a statistical perspective. As explained above, the Pearson Correlation Coefficient can be too extreme when co-rated items are very limited (e.g. only one or two). We therefore assumed the distribution to be normal and used the confidence ratio to illustrate the problem.
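For reference, significance weighting in the spirit of [8] can be sketched as below. The proportional-scaling rule shown is the commonly cited formulation and may differ in detail from CHESTNUT’s exact correction; the method name is ours.

```java
public class SignificanceWeightingSketch {

    // A similarity backed by fewer than `threshold` co-rated items is scaled
    // down proportionally, so that similarities computed from tiny
    // intersections carry less weight; others are left unchanged.
    static double correct(double similarity, int coRated, int threshold) {
        if (coRated >= threshold) return similarity;
        return similarity * coRated / threshold;
    }
}
```

With the conventional threshold of 50, a similarity of 0.9 backed by only 5 co-rated items shrinks to 0.09, while one backed by 60 items is untouched.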

All Pearson correlation values are computed and collected. The values are then clustered and plotted, with the average co-rated movie count on the y-axis and the correlation values on the x-axis. As shown in Fig. 4, this nonlinear curve can be fitted with a GaussAmp model, which illustrates that the global Pearson correlation coefficients approximate a normal distribution.

Fig. 4. Plotted Pearson correlation vs. average co-rated number

Inspired by the confidence ratio of a normal distribution, we defined the area of the edge regions as the unlikelihood, which quantifies the unreliability of similarity values from a global view. Based on the results in Fig. 4, the reliability, or confidence ratio, can be expressed mathematically as an integral over the fitted curve. We then selected four confidence ratios for comparison with the initial value of 50. For each ratio of the complete area, we determined the corresponding height on the fitted curve and, from it, calculated the corresponding average number of co-rated items, giving Table 1:

Table 1. The average number of co-rated items under different ratios

We substituted the obtained values into significance weighting and applied this similarity correction in all related components of CHESTNUT to improve the reliability of the similarity values.

6 Experimental Study

In this section, we introduce the details of CHESTNUT’s experimental study. The HeteRec 2011 data set was selected as the source data; it contains 855,598 ratings of 10,197 movies from 2,113 users [4]. In addition, all users’ k-nearest-neighbour data were prepared in advance.

The experiment began by initializing the database and supplementing it with information about the directors of all movies via a web crawler. Bearing in mind that some movies have more than one director, and that there is no publicly recognized rule for distinguishing among them, only the first director was chosen during this process. After completing the data preparation, CHESTNUT with different correction levels was run for each user in the database in turn.

Since CHESTNUT is a memory-based collaborative filtering system, we chose mainstream memory-based collaborative filtering techniques, namely item-based and user-based collaborative filtering from Mahout, as benchmarks for examining overall performance [22].

All implementations were in Java, and all experiments were run on a Dell Precision 3620 workstation with Windows 10 Pro for Workstations, an Intel Xeon E3-1225 processor (Quad Core 3.3 GHz, 3.7 GHz Turbo, 8 MB, w/HD Graphics P530) and 32 GB of RAM (\(4\times 8\) GB, 2400 MHz, DDR4).

Our experimental study aimed to answer the following three questions:

  (1)

    How much performance improvement can be achieved with CHESTNUT, compared with mainstream memory-based collaborative filtering techniques?

  (2)

    What performance benefits are gained with CHESTNUT when different optimization levels are deployed?

  (3)

    What tradeoffs are caused if CHESTNUT is optimized with significance weighting?

6.1 Recommendation Performance

We first demonstrate that CHESTNUT can significantly improve the unexpectedness of recommendation results while maintaining scalability. For this purpose, we varied the number of items in the recommendation lists from 5 to 1,000, in increments of 5. As shown in Fig. 5, CHESTNUT achieves unexpectedness between 0.9 and 1.0, whereas item-based and user-based collaborative filtering only achieve unexpectedness within the ranges 0.75 to 0.8 and 0.43 to 0.6, respectively. This is because unexpectedness was one of the major goals set during the design and development of CHESTNUT (Fig. 7).

Fig. 5. Levels of unexpectedness in CHESTNUT and for benchmarks

Fig. 6. Levels of serendipity in CHESTNUT and for benchmarks (Color figure online)

Fig. 7. Service time of CHESTNUT and for benchmarks

Fig. 8. Levels of accuracy in CHESTNUT and for benchmarks

Fig. 9. Unexpectedness breakdown within CHESTNUT

Fig. 10. Serendipity breakdown within CHESTNUT

Figure 6 shows that CHESTNUT maintains its dominant performance in serendipity under the same experimental settings. The item-based and user-based benchmark systems achieve serendipity within the ranges 0.05 to 0.08 and 0.3 to 0.4, respectively; CHESTNUT outperforms both. This series of experiments yielded two interesting observations. One is that, although the item-based approach produces more unexpected results than the user-based approach, the user-based approach provides more serendipitous recommendations.

The other interesting observation is that serendipity performance degrades gradually when CHESTNUT is applied without optimization, while the optimized versions of CHESTNUT exhibit better scalability. This observation is discussed in more detail in Sect. 6.2.

As for time consumption, more details are provided in Fig. 7. It is worth highlighting that the item-based approach required approximately 10,000 ms on average, whereas the user-based approach achieved very good performance, consuming 17.24 ms on average. Although CHESTNUT is slightly slower than the user-based approach, it is still much faster than the item-based implementation: all versions of CHESTNUT finished serving between 59.85 and 74.34 ms on average, which supports the assertion that CHESTNUT’s performance is very competitive.

Finally, we explored the accuracy of the recommendation results across the three systems. Consistent with their design goals, the item-based and user-based approaches achieved MAEs of 0.4804 and 0.4439, respectively, which implies quite accurate results. For CHESTNUT, the results, with or without optimization, are less accurate than those of the benchmark systems.

6.2 Performance Breakdown

Based on Sect. 6.1, we saw the need for a performance breakdown analysis. We first examined the unexpectedness evaluations in detail. Unlike the previous settings, we took a closer view of unexpectedness performance by narrowing the recommendation list size from 5–1,000 to 5–200. The most interesting observation is that the unexpectedness results were unaffected by CHESTNUT’s optimization levels. As Fig. 9 shows, although there are variations in this metric, unexpectedness remains above 0.992. We found that significance weighting did not affect unexpectedness performance at all, which indicates that the optimization levels did not affect cInsight. This is because the threshold in cInsight serves as a lower boundFootnote 11, while our optimization mainly corrects extremely high similarities caused by too small an intersection between users.

However, optimization does play a role in cUsefulness. To examine this in more detail, we kept a very narrow view by setting the recommendation list size from 5 to 50. We observed that when the recommendation list size is smaller than 15, all optimized versions produce more serendipitous results than the original version, even though the original was already very serendipitous. When the size is between 15 and 50, the situation is reversed. However, combining Fig. 10 with Fig. 6, the overall scalability of unoptimized CHESTNUT is much weaker than that of the optimized versions.

This performance variation can be explained from two aspects. Since CHESTNUT can only make predictions within a small group compared to the other systems, the predictions could be artificially high when there was no optimization, leading to the obvious drift illustrated in Fig. 6 (the blue line). We believe the most important benefit of optimization is that it stabilizes serendipity performance and improves the scalability of the whole system, by improving the reliability of the similarity values.

6.3 The Tradeoff Caused by Optimization

Here, we focus on the tradeoffs caused by similarity correction, since the other optimization’s purpose is simply to make CHESTNUT runnable. There are two main tradeoffs to discuss.

First, correcting values incurs some runtime overhead. As Fig. 7 shows, all optimized versions show a slight increase in service time. As for the variations among the optimized versions, a correction rate that is too high or too low increases the computational difficulty and thus causes overhead.

Second, we observed a very interesting situation. In early investigations of significance weighting, researchers claimed that this approach could improve recommendation accuracy, and further investigation supported the effectiveness of this setting [7,8,9]. However, the optimized versions of CHESTNUT conflict with this: Fig. 8 reveals a slight trend of accuracy loss as the optimization level increases. We believe this is due to CHESTNUT’s characteristics. What this optimization improves is the trustworthiness of the similarity values, which, unlike in accuracy-oriented systems, does not equate to accuracy in serendipitous systems.

7 Discussion

Our experimental study revealed two main points for further discussion. First, CHESTNUT proves that the Information Theory-based algorithm can be deployed as an end-to-end recommender system that induces serendipitous recommendations. In particular, when the recommendation size is less than 50, CHESTNUT dominates in serendipity performance, coming close to the upper bound in our evaluations. Second, during system implementation, it became clear that CHESTNUT still needs optimization via value corrections to improve overall recommendation quality. By revisiting and updating significance weighting, CHESTNUT has been optimized to improve overall scalability and serendipitous recommendation performance, because the reliability of the similarity values has been greatly improved.

8 Conclusion and Future Work

In this paper, we have presented CHESTNUT, a state-of-the-art memory-based collaborative filtering system that aims to improve serendipitous recommendation performance in the context of movie recommendation. We implemented CHESTNUT as three main functional blocks, corresponding to the three main metrics of serendipity: insight, unexpectedness and usefulness. We optimized CHESTNUT by revisiting and updating the conventional method of “significance weighting”, which significantly enhanced its overall performance. The experimental study demonstrated that, compared with mainstream memory-based collaborative filtering systems, CHESTNUT is a fast and scalable system that can provide extremely serendipitous recommendations. To the best of our knowledge, CHESTNUT is the first collaborative filtering system rooted in a serendipitous algorithm built on the user-centered understanding from Information Research. The source code of CHESTNUT is available online at https://github.com/unnc-ucc/CHESTNUT.

Future work on CHESTNUT will focus on its extensibility. On the one hand, though CHESTNUT is not parameter-free, it would not be difficult to extend it to different usage contexts (e.g. shopping, mailing, etc.), since CHESTNUT’s parameters can be obtained from our previous implementation experience. On the other hand, as mentioned in Sect. 4, the levels of connection-making still rely on our previous experience and function as thresholds, which is the major limitation for system extension. We will further study CHESTNUT’s effectiveness and extensibility through a series of large-scale user studies and experiments.