
1 Introduction

At present, most organizations in the public or private sector own one or more websites for communication purposes [1]. The main goal of these applications is to provide the user with access to updated and relevant content [2] about the products and services of the organization that owns the site. User profiles, communication goals and content structure can change depending on the industry [3]: government websites may focus on giving the general population an understanding of administrative processes and legislation, higher education institutions may focus on providing information about their academic programs and admissions, banks on promoting a portfolio of services, and so on.

Because information driven websites are publicly available and do not necessarily have a captive audience, usability and user experience are critical factors for user retention and, in consequence, for the project's success [4]. By contrast, other web applications such as e-commerce, e-banking or intranets have registered users who have already been convinced to use the service provided by the organization that owns the application. Information driven websites are usually in the early stages of the user/customer acquisition process [5]. This means that generic approaches to user experience or usability evaluation will not address the specific nuances that define success for web applications driven by information acquisition, brand representation, persuasion [6] and trust [7].

We understand usability and user experience according to the following definitions from the ISO 9241 standard [8]:

  • Usability: The extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.

  • User Experience (UX): Extends the concept of usability (effectiveness, efficiency and satisfaction) to the perceptions and responses resulting from the use and/or anticipated use of a product, system or service.

Because information driven websites are the sum of their parts (software, content, brand identity and visual design), they need to be evaluated at the user experience level, including the following criteria: usability, content, navigation, aesthetics, performance, and the emotional response of the user after completing the task they wished to accomplish [9]. However, most of the time website development is treated as a software project that addresses only the customization or creation of a Content Management System (CMS), a computer application that allows publishing, editing, modifying, organizing and deleting content, as well as its maintenance, from a central interface [10]. This means that neither content nor emotion can be evaluated, because these aspects fall outside the bounds of the software development process. Because the extent and nature of the content defines the website's navigation [11] and the aesthetics of the visual design, this approach can lead to navigation issues and designs that are not suitable for the content they display, which in turn can result in usability issues because it is hard for the user to accomplish an information seeking task. This can lead to user frustration and damage the user's perception of the brand, since the poor user experience of the website can be translated into an attribute of the brand that owns it.

The systematic review presented in this paper seeks to recognize the state of the art in methods, tools and criteria used to evaluate user experience for information driven websites. The organization of this paper is as follows: Sect. 2 details the process performed for the systematic review (criteria and process of selection of studies), Sect. 3 shows the results obtained, and Sect. 4 the conclusions and future work.

2 Systematic Review

The systematic review presented below was executed according to the parameters provided by Kitchenham and Charters [12]. The activities carried out for the review were: definition of research questions, definition of the search string and inclusion and exclusion criteria, selection of primary studies, data extraction, and synthesis of results.

2.1 Research Questions

For the definition of the research questions we used the PICOC technique (population, intervention, comparison, outcome and context):

  • Population: Websites and portals.

  • Intervention: Methods, tools and criteria for usability and user experience evaluation.

  • Result: Identify the efficiency and effectiveness of the methods and tools used to evaluate usability and user experience from existing studies.

  • Context: Primary studies that present and evaluate the performance of new, existing, or combinations of existing methods, tools and criteria for the evaluation of usability and/or user experience in websites, microsites, portals and mobile websites. Studies must include some level of validation of the proposed methods.

The defined research questions are:

  • What methods, tools and criteria are used to evaluate user experience and usability in websites?

  • Which aspects of the user experience are considered for evaluation?

  • At what stage of development does the evaluation apply?

  • How efficient and effective were these methods: how satisfactory were the results obtained, and how much did they cost in terms of time and resources?

2.2 Search Strategy

Search terms.

For the extraction of studies the criteria of population, intervention, result and context were considered.

  • T1 = framework OR tool OR technique OR groundwork OR approach OR scheme OR plan

  • T2 = user experience OR ux OR customer experience OR cx OR usability OR User centered design OR interaction design

  • T3 = website OR websites OR site OR Web page

The databases used for gathering the studies were Web of Science, IEEE and Scopus.
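
For reference, assuming the three groups of terms were joined conjunctively (the usual practice in systematic reviews, although the exact syntax may vary per database), the resulting search string takes the following form:

  (framework OR tool OR technique OR groundwork OR approach OR scheme OR plan) AND (user experience OR ux OR customer experience OR cx OR usability OR user centered design OR interaction design) AND (website OR websites OR site OR web page)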

2.3 Study Selection

Papers that fall in the following categories were included:

  • Papers that present surveys, case studies or experiments of one or more methods or tools for the evaluation of usability or other aspects of user experience. The paper must include the description of the empirical validation process of the proposed method.

  • Empirical studies that show comparisons between two or more methods/tools or combination of them.

Documents with the following characteristics were excluded:

  • Tools and methods for mobile app usability/accessibility testing, since they are mostly focused on complex functionality/tasks.

  • Articles that condense previous knowledge, collections of best practices and recommendations that are not applied to a specific case, reflections upon existing metrics or models.

  • Tools and methods for usability/accessibility evaluation of web applications/websites whose main goal is other than informative, for example, websites focused on e-commerce or e-learning.

  • Methods focused only on the requirements generation process, since they do not validate the method results in the context of a real project.

  • Studies focused only on the optimization of search processes or form submission.

  • Studies that focus only on accessibility for a specific group, for example, blind users.

  • Papers that focus on the software development process of applications for expert analysis.

  • Works in progress that describe the data gathering process (usually web logs) but are not applied to a specific case.

  • Complementary tools proposed for the usability testing process (not fully described methods).

2.4 Data Extraction

A total of 32 results were obtained from the Web of Science database; of these, 13 were relevant according to the selection criteria. 161 results were obtained from the IEEE database; of these, 32 were relevant. 46 results were obtained from the EBSCO research database; of these, 20 were relevant. From the 239 retrieved studies, the final count of selected studies is 65. The following table shows the list of recovered studies, including the code that was assigned to each of them as part of the systematic review process (Table 1).

Table 1. List of the reviewed studies.

2.5 Synthesis Strategy

The studies were grouped according to the following criteria:

1st research question: Method and tools

  • Method: Identifies the methodological approach selected for the study. Values are: User testing, Expert evaluation, Automated, Data mining. A paper can belong to more than one category.

  • Tools: Identifies the tools that were used for the research: card sorting, questionnaire, focus group, observation or the think aloud protocol.

2nd research question: Aspects of evaluation

  • Criteria: Specific criteria applied in the study. Values are: Accessibility, Usability, User experience, Content, Aesthetics, Information architecture, Trust, Emotion.

3rd research question: Stage in the development process.

  • Development phase: Stage in the software development process in which the research was applied.

4th research question: Efficiency and efficacy

  • Satisfaction obtained from the proposed method application.

  • Cost to apply the proposed research methodology in terms of people, time and money: low, medium, high.

  • Level of technical expertise required from the evaluator: low, medium, high.

Additionally, we recorded the number of evaluated sites: some approaches are still at the proposal stage, which means they have been tested with very few websites.
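
As an illustration of how the synthesis criteria above can be captured per study, the following minimal Python sketch defines one possible coding record; the class name and the example values are hypothetical and are not taken from Table 1:

  from dataclasses import dataclass
  from typing import List

  @dataclass
  class CodedStudy:
      # One record per reviewed study, following the synthesis criteria above.
      code: str                 # identifier assigned in Table 1
      methods: List[str]        # User testing, Expert evaluation, Automated, Data mining
      tools: List[str]          # e.g. questionnaire, card sorting, think aloud
      criteria: List[str]       # e.g. Usability, Content, Aesthetics, Trust
      phase: str                # stage of the development process
      satisfaction: str         # reported satisfaction with the obtained results
      cost: str                 # low, medium or high
      expertise: str            # low, medium or high
      sites_evaluated: int = 1  # number of websites evaluated in the study

  # Hypothetical example of a coded entry (values are illustrative only).
  example = CodedStudy(
      code="S01",
      methods=["User testing"],
      tools=["Questionnaire", "Think aloud"],
      criteria=["Usability", "Content"],
      phase="Published website",
      satisfaction="High",
      cost="High",
      expertise="Medium",
  )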

3 Results

With regard to the method used, the results show that 47% of the selected studies used some form of user testing, and 27% used expert evaluation. 12% of the studies used a combination of user testing and expert evaluation, and 8% used data mining techniques that included content mining and pattern identification in links or content. The remaining 6% used other methods, including automated tools focused on the evaluation of usability and accessibility (usually a tool that implements the latest version of the Web Content Accessibility Guidelines - WCAG specification [13] and requires the website URL as input to perform a compliance analysis), analytics to identify navigation patterns, or modifications to the software development process to incorporate tasks that prevent known usability issues. Remote testing was applied in some studies that implemented user testing, to reduce costs in time and subject availability (Table 2).

Table 2. Methods used in reviewed studies.
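
To make the automated approach concrete, the short Python sketch below implements one of the simplest checks of this kind: given a URL, it flags images that lack alternative text (related to WCAG success criterion 1.1.1). It is a toy illustration, not one of the reviewed tools, and it assumes the requests and BeautifulSoup libraries are available; the URL is a placeholder.

  import requests
  from bs4 import BeautifulSoup

  def images_without_alt(url: str) -> list:
      """Return the src of every <img> element on the page that has no alt attribute."""
      html = requests.get(url, timeout=10).text
      soup = BeautifulSoup(html, "html.parser")
      return [img.get("src", "?") for img in soup.find_all("img")
              if not img.has_attr("alt")]

  if __name__ == "__main__":
      # example.com is used here only as a placeholder URL.
      for src in images_without_alt("https://example.com"):
          print("Image without alternative text:", src)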

With regard to the tools used, the results show that the most frequently used tool is the questionnaire, which is used as a guide to give structure to the user testing process while applying interviews, focus groups and the think aloud method. Questionnaires can also be applied directly to the user as a data collection tool by themselves (e.g. the System Usability Scale - SUS) [14]. Heuristic evaluation was the tool of choice for expert evaluation. The most frequently used specification was the set of heuristics proposed by Jakob Nielsen [15]. Other specifications used for the expert evaluation process were the Microsoft Usability Guidelines [16], custom measures derived from other knowledge fields such as psychology (psychometric scales), or a combination and adaptation of an existing heuristics set with new measures proposed by the researcher.
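
Since the SUS questionnaire recurs across the reviewed studies, the following minimal Python sketch shows how its 0-100 score is commonly computed from ten responses on a 1-5 scale; the example answers are invented for illustration:

  def sus_score(responses):
      """Compute the System Usability Scale score from ten responses on a 1-5 scale."""
      if len(responses) != 10:
          raise ValueError("SUS requires exactly ten responses")
      total = 0
      for i, r in enumerate(responses):
          # Odd-numbered items (1, 3, 5, 7, 9) are positively worded: contribute r - 1.
          # Even-numbered items (2, 4, 6, 8, 10) are negatively worded: contribute 5 - r.
          total += (r - 1) if i % 2 == 0 else (5 - r)
      return total * 2.5  # scale the 0-40 raw sum to a 0-100 score

  print(sus_score([5, 2, 4, 1, 4, 2, 5, 1, 4, 2]))  # 85.0 for this invented set of answers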

Interviews were used as a complementary tool, mostly to obtain information about aspects of the user experience that are difficult to measure because of their subjective nature (emotions, attitudes or trust), and also to explain user behavior in specific contexts.

The think aloud method was used in combination with direct observation and task completion. Focus groups were mostly used to discuss expectations and perceptions. Card sorting was used specifically to identify improvements in the navigation of the website, by proposing an optimized information architecture from existing terms. Web logs, data mining and clickstream analytics were used to identify patterns in user navigation. Eye tracking was used to examine fixations in existing visual designs. Benchmarking was used as a tool in the early planning stages to compare existing websites in a specific industry, with the goal of defining usability requirements for the implementation or identifying critical content. Paper and digital prototypes were used as a tool in both early planning and development phases. Other tools such as webmaster emails, word prompts and psychometric scales were used sparingly, as complements to existing methods. One study proposed software to improve the efficiency of the usability process by providing an application that contains all the information generated during the evaluation process (Table 3).

Table 3. Tools used in the reviewed studies
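
As a minimal sketch of the web log/clickstream analysis mentioned above, the Python fragment below counts page-to-page transitions per session to surface the most frequent navigation paths; the session data is invented for illustration:

  from collections import Counter
  from itertools import groupby

  # Toy clickstream: (session_id, visited_page) events, already ordered by session and time.
  events = [
      ("s1", "/home"), ("s1", "/admissions"), ("s1", "/programs"),
      ("s2", "/home"), ("s2", "/programs"),
      ("s3", "/home"), ("s3", "/admissions"), ("s3", "/programs"),
  ]

  transitions = Counter()
  for _, session in groupby(events, key=lambda e: e[0]):
      pages = [page for _, page in session]
      transitions.update(zip(pages, pages[1:]))

  # The most common transitions hint at the dominant navigation paths.
  for (src, dst), count in transitions.most_common(3):
      print(f"{src} -> {dst}: {count}")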

With regard to the criteria used for the evaluation, the results show that usability is the most commonly evaluated aspect, followed by content, aesthetics and information architecture (navigation). User experience is mentioned as a research goal, but it is always decomposed into more specific aspects, usually the above mentioned content, aesthetics and information architecture, or custom measures proposed by the researcher and conditioned by the website's industry. Task completion is also frequently measured; however, it is limited to tasks related to finding information using the proposed navigation or visual interface. Subjective criteria included in several studies are trust, emotion, engagement and persuasion. In some cases, industry specific adaptations are made to the evaluation criteria to allow a focus on specific tasks related to the website's communication goals or user profiles (Table 4).

Table 4. Criteria used to evaluate user experience and/or usability in websites.

With regard to the development phase, 82% of the selected studies were conducted when the evaluated website(s) was in the final stages of implementation or already published, 6% of the studies were conducted in the planning/requirements stage, and 8% in the graphic design stage. Only one study proposed an iterative methodology during the development phase. These proportions can be explained because the evaluation of the complete user experience of an information driven website requires evaluating the published information, the visual design and the functioning navigation as a complete system. Since these aspects are not fully formed while the website is in the process of being coded, studies that require an evaluation in the early phases of a web development project must resort to methods that evaluate existing websites in the same industry (benchmarking) or that simulate the final product (prototypes). These methods do not guarantee that new user experience problems will not appear in the finished product. Since the user testing process is expensive in time and resources, it is logical that most studies are executed when the implementation is already complete and the evaluation will yield the most complete set of information (Table 5).

Table 5. Project phases in which studies are conducted

With regard to the efficacy of the selected methods, the results show that researchers were most satisfied with the information obtained from expert evaluation, user testing, or a combination of both. Expert evaluation is rated proportionately below user testing because researchers are aware that it can generate blind spots caused by the expert's familiarity with the website's topics and structure. Data mining yields good results at low cost, but does not explain why the user acted in a specific way, and requires a high level of technical expertise.

With regard to the efficiency of the selected methods, the most efficient is expert evaluation, followed by data mining and automated tools. User testing is the most expensive method. Modifications to reduce application costs include the use of remote testing tools such as online surveys and limiting the collected data to predefined values; however, these approaches tend to negatively impact the quality of the knowledge generated by the methodology (Table 6).

Table 6. Satisfaction and cost per methodology.

4 Conclusions and Future Work

This paper presents a systematic review conducted to identify the methodologies, tools and criteria used to evaluate the user experience in information driven websites, and the efficacy and efficiency reported by the researchers after applying the selected methodology. Papers that evaluated usability were also included because they contained references to user experience evaluation. 65 studies were selected from 239. Empirical evidence was extracted from these studies, coded and aggregated.

We identified that the dominant methodologies are user testing and expert evaluation, because of the quality of the information they produce. New methods proposed by researchers include data mining and automated tools to improve data collection and processing. Evaluation criteria can be general (compatible with all types of websites) or adapted according to the industry's communication goals. After usability, content, information architecture, aesthetics and task completion are the most frequently used criteria for the evaluation. The balance between usability and aesthetics is seen as a compromise, especially since website owners require customized interactivity to differentiate themselves from other websites. Proposed methods and tools require that the evaluator is already familiar with user experience/usability and has some degree of technical competence (background in information technology, statistics or data science); however, tasks such as questionnaire application can be delegated to evaluators with less experience.

Most studies were conducted on already published websites because navigation, content and visual design are aspects that need to be included for a complete user experience evaluation. This also means that there is no established methodology for user experience evaluation during the software development process of an information driven website. This does not imply that companies do not conduct this type of research in their projects, only that this type of knowledge is not registered in academic databases.

Further research can be developed in the following topics:

  • Differences in the user experience between recurring users and new users, since the information they are interested in and their expectations of the website could differ.

  • Usability/user experience evaluation in websites developed with agile methodologies.

  • Impact on user experience of pop-up windows and areas reserved for displaying different formats of advertising.

  • Differences in user experience between users of dedicated mobile websites and users of responsive interfaces.