Background

Qualitative research can be defined as research that involves “the collection, analysis and interpretation of data that are not easily reduced to numbers; these data relate to the social world and the concepts and behaviors of people within it” [1]. Data from qualitative research can address significant questions that quantitative research methods may not answer, such as “how” and “why” a given intervention engenders its effects. Qualitative research is now widely used for a variety of purposes in healthcare, for example, to identify patients’ concerns, the manner in which people select and use healthcare services, and the circumstances under which healthcare interventions play a role in practice [2, 3].

Given these merits, qualitative research has attracted the attention of guideline developers and is gradually becoming accepted as a means of informing guideline recommendations. For example, the WHO (World Health Organization) has affirmed in its handbook for guideline development that qualitative evidence should be considered and used in the process of guideline development, and the WHO Guidelines Review Committee (GRC) website provides additional guidance on when and how to use qualitative research data to inform WHO guidelines [4]. Many scholars and researchers have also used qualitative research or evidence in projects on the development and implementation of guidelines, for example to address questions about the values and preferences of relevant stakeholders (e.g., patients, caregivers, and the public), the acceptability and feasibility of interventions, and the influence of interventions on equity and human rights [4,5,6,7,8,9]. This provides opportunities for qualitative research methodologists to be involved in developing guideline recommendations [10, 11] and in exploring facilitators of and barriers to guideline implementation [12].

As Lewin & Glenton said, qualitative research may be entering a new era of use in guideline development, which is beneficial for decision making [13]. Our aim was to further understanding of the way qualitative evidence has been used in existing guideline development processes, for example, whether qualitative evidence was retrieved and how many recommendations were supported by qualitative evidence. To achieve this we conducted a systematic search, a rigorous quality evaluation of guidelines, and comprehensive extraction of information related to qualitative evidence in guidelines. We also performed a content analysis to provide a clear view of the roles and functions of qualitative evidence in the process of guideline development.

Methods

The systematic review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [14].

Criteria for guideline selection

We included guidelines focused on improving healthcare that met the following criteria: 1) the guidelines were primarily published in Chinese or English from January 1, 2011 to February 25, 2020. In 2011, the IOM (Institute of Medicine) stated that for a clinical practice guideline (CPG) to be trustworthy it needs to “be developed via a transparent process by a group of multidisciplinary experts (including patient representatives), screened for minimal potential bias and conflicts of interest, and supported by a systematic review of the evidence” [15]. This statement, the first set of criteria for clinical practice guidelines, plays an important role in guideline development, so we chose 2011 as the start date for retrieval; 2) the guidelines met the above-mentioned IOM criteria; 3) the guidelines mainly focused on clinical questions, such as diagnosis, treatment or care for certain diseases or patients’ symptoms, to provide suggestions for healthcare staff or community health services; 4) qualitative research or qualitative evidence was used in the process of guideline development; 5) if a guideline had been updated, only the most recent version was included. Guidelines were excluded if they had the following characteristics: 1) the same guideline had been published repeatedly in multiple journals; 2) the full text of the guideline was not available.

Search strategy for guidelines

Relevant representative guidelines repositories, such as WHO, NICE (the National Institute for Health and Care Excellence), SIGN (Scottish Intercollegiate Guidelines Network), NGC (National Guideline Clearinghouse), RNAO (Registered Nurses’ Association of Ontario), and other databases, including three English databases (PubMed, Embase, Web of Science), four Chinese databases (China National Knowledge Infrastructure, CNKI; Wanfang Data; Chinese BioMedical Literature Database, CBM; and VIP Database for Chinese Technical Periodicals, VIP), were systematically searched from January 1, 2011 to February 25, 2020. The search strategy used MeSH terms, Title/Abstract and text words. Taking PubMed as an example, the retrieval strategy is shown in Fig. 1.

Fig. 1 Search strategy on PubMed

Guidelines selection and data extraction

Three authors (C.L., Y.X.S and J.Z) experienced in literature retrieval independently selected eligible guidelines. Three reviewers (D.D.L., Y.C and C.F) extracted significant information from the guidelines and completed data extraction forms by reading the text of each guideline, its references and the relevant online attachments. The detailed process of data extraction is presented in Additional file 1. The forms included: (1) the basic characteristics of the included guidelines (such as title, publication/update date, and developer); (2) how qualitative research or evidence was used in the process of guideline development (were experts proficient in qualitative research invited to be involved in the guideline development group; was qualitative research used to identify clinical questions; was qualitative evidence retrieved; was it used to support recommendations; and was it applied when considering facilitators of and barriers to the implementation of recommendations); (3) details of the methodology for qualitative research or evidence used in the guideline development process (such as the qualitative research quality assessment tool, the quality of the primary qualitative research studies used to formulate recommendations, and the grade of recommendations supported by qualitative evidence).

We hypothesized that the development of guidelines using qualitative research or evidence would be relevant to these items in the forms. The hypothesis was based on related methodological literature, the COnsolidated criteria for REporting Qualitative research (COREQ) checklist [16] and discussions between all authors and methodologists in evidence-based guideline development who were willing to engage in dialogue with us. Another researcher (Y.H.J) examined the data extraction forms to make sure no errors had occurred.

Appraisal of included guidelines

Two researchers (Y.YW and D.H) independently evaluated the quality of the guidelines using the Appraisal of Guidelines for Research and Evaluation (AGREE II) tool, which consists of 23 items under 6 domains: scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence [17]. Each item was rated from 1 to 7 points, with 1 point for “strongly disagree” and 7 points for “strongly agree”. We summed the item scores within each domain and scaled the domain total using the following formula: (obtained score - minimal possible score)/(maximal possible score - minimal possible score) × 100% [17].
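The scaling formula above can be illustrated with a minimal Python sketch. It assumes that item scores are pooled across appraisers before scaling, as the AGREE II manual describes; the function name `scaled_domain_score` and the example ratings are hypothetical and are not taken from the included guidelines.

```python
from typing import Sequence

MIN_ITEM_SCORE = 1  # "strongly disagree"
MAX_ITEM_SCORE = 7  # "strongly agree"


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the scaled score (in %) for one AGREE II domain.

    ratings[a][i] is the score that appraiser `a` gave to item `i`
    of the domain (hypothetical input layout).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every appraiser must rate every item")
        for score in row:
            if not MIN_ITEM_SCORE <= score <= MAX_ITEM_SCORE:
                raise ValueError("item scores must lie between 1 and 7")

    obtained = sum(sum(row) for row in ratings)
    minimum = MIN_ITEM_SCORE * n_items * n_appraisers
    maximum = MAX_ITEM_SCORE * n_items * n_appraisers
    # (obtained - minimal possible) / (maximal possible - minimal possible) x 100%
    return (obtained - minimum) / (maximum - minimum) * 100


# Illustrative ratings only: two appraisers, a three-item domain.
example_ratings = [[6, 6, 5], [6, 5, 6]]
example_score = scaled_domain_score(example_ratings)
print(f"{example_score:.1f}%")
```

With these illustrative ratings the obtained total is 34, the minimum possible is 6 and the maximum possible is 42, so the scaled score is 28/36, or about 77.8%.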

Statistical analyses

Descriptive statistics were computed for the scores of each AGREE II domain and are presented as medians and interquartile ranges (IQRs). Intraclass correlation coefficients (ICCs) were calculated to evaluate the agreement between the two reviewers for each domain [18, 19]. An ICC below 0.4 indicated poor consistency between raters, an ICC from 0.4 to 0.75 indicated moderate consistency, and an ICC over 0.75 indicated high consistency [20]. The data were analyzed using SPSS version 17.0 (SPSS Inc. Chicago, IL, USA) and R version 3.3.2 (R Foundation for Statistical Computing, Vienna, Austria) for Windows.
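As a rough illustration of these descriptive steps, the sketch below summarizes a set of domain scores as a median with quartiles and maps an ICC value onto the three consistency bands described above. It is not the SPSS or R procedure used in the study: the function names and the example scores are hypothetical, the quartile method ("inclusive" interpolation) is an assumption that may differ slightly from other software, and a value of exactly 0.75 is assigned to the moderate band.

```python
from statistics import quantiles
from typing import Sequence, Tuple


def median_and_iqr(scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of the scores."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, med, q3 = quantiles(scores, n=4, method="inclusive")
    return med, q1, q3


def classify_icc(icc: float) -> str:
    """Map an ICC value onto the consistency bands used in the text."""
    if icc < 0.4:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    return "high"


# Illustrative scaled domain scores (%) for five hypothetical guidelines.
example_scores = [67, 78, 83, 83, 72]
summary = median_and_iqr(example_scores)
print(summary, classify_icc(0.82))
```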

Results

Guideline identification and selection

The searches identified 10,245 discrete records, of which 449 were selected for a full-text review. Sixty-four guidelines were eventually included [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84]. The flow diagram for the guidelines is shown in Fig. 2.

Fig. 2 Flow diagram of guidelines identification and selection

Characteristics of included guidelines

As Table 1 shows, the sixty-four guidelines concentrated on different topics such as cancers, chronic pain and smoking, and were developed by NICE, SIGN, RNAO, WHO or other professional organizations. The majority of guideline developers used GRADE (the Grading of Recommendations Assessment, Development and Evaluation) criteria for grading evidence and recommendations. When formulating recommendations, they considered the quality of evidence, the risk-benefit analysis of some interventions, supporting resources and stakeholders’ values and preferences. The number of recommendations per guideline ranged from 2 to 262. The largest number of recommendations supported only by qualitative evidence in a single included guideline was 8 [68]. The largest number of recommendations supported by both qualitative and quantitative evidence in a single included guideline was 23 [70]. The majority of recommendations were supported by qualitative evidence from primary studies, and a few by qualitative evidence from systematic reviews.

Table 1 The basic characteristics of guidelines included

Quality appraisal of the guidelines

The ICC values for all six domains were over 0.75, which indicated high consistency in the assessment results between the two raters.

As Table 2 and Fig. 3 show, the final domain scores ranged between 0% (domain 6 of 6 guidelines) [75, 77, 78, 81, 82, 84] and 96% (domain 6 of 11 guidelines) [21, 22, 25,26,27, 29,30,31,32,33,34]. When comparing the total domain scores, Domain 1 (Scope and purpose) ranked highest, with a median score of 83% (IQR 78–83). Domain 2 (Stakeholder involvement) and Domain 5 (Applicability) ranked lowest, with median scores of 67% (IQR 67–78) and 67% (IQR 63–73) respectively. The median scores of Domains 3, 4 and 6 (Rigor of development, Clarity of presentation, Editorial independence) were 71% (IQR 69–74), 72% (IQR 58–78) and 79% (IQR 75–83) respectively.

Table 2 Analysis of the included N-CPGs according to AGREE II (%)

Fig. 3 The summary of scaled domain scores over all included guidelines

The process of the guidelines development using qualitative research or evidence

As Fig. 4 shows, no guideline developers invited experts proficient in qualitative research to be involved in guideline development groups. Qualitative research was used to identify clinical questions in 20% of guidelines (13/64) [68, 71, 73,74,75, 77,78,79,80,81,82,83,84]. Qualitative evidence was retrieved in 83% of guidelines (53/64) [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70, 75, 77, 81]. Qualitative evidence was used to support recommendations in 86% of guidelines (55/64) [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70, 72, 75,76,77, 81]. Finally, 19% of guidelines (12/64) applied qualitative evidence when considering facilitators of and barriers to the implementation of recommendations [55, 56, 60, 62,63,64,65,66,67,68,69,70].
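The proportions reported above follow directly from the counts and the total of 64 included guidelines. The short Python sketch below recomputes them, rounding to the nearest whole percent; the dictionary labels and the helper name `percent` are illustrative only.

```python
def percent(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


TOTAL_GUIDELINES = 64

# (count of guidelines, percentage as reported in the text)
reported = {
    "experts in qualitative research invited": (0, 0),
    "qualitative research used to identify clinical questions": (13, 20),
    "qualitative evidence retrieved": (53, 83),
    "qualitative evidence used to support recommendations": (55, 86),
    "qualitative evidence applied to facilitators and barriers": (12, 19),
}

recomputed = {
    label: percent(count, TOTAL_GUIDELINES)
    for label, (count, _) in reported.items()
}

for label, (count, stated) in reported.items():
    print(f"{label}: {count}/{TOTAL_GUIDELINES} = {recomputed[label]}% (reported {stated}%)")
```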

Fig. 4 The process of guideline development using qualitative research or evidence. a Involving experts proficient in qualitative research in the guideline development group. b Using qualitative research to identify clinical questions. c Retrieving qualitative evidence. d Using qualitative evidence to support recommendations. e Applying qualitative evidence when considering facilitators of and barriers to the implementation of recommendations

The methodology for evidence used in the guidelines development

As Table 3 shows, one guideline used qualitative research based on grounded theory and phenomenology [55]. In 52% of cases (27/52), guideline developers evaluated the quality of the primary qualitative research studies using the CASP (Critical Appraisal Skills Programme) tool or the NICE checklist for qualitative studies [35, 38, 46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70]. No guidelines (0/18) evaluated the quality of the qualitative evidence syntheses used to formulate recommendations. Of the guidelines, 17% (11/64) presented the level of qualitative research using grade criteria of evidence and recommendation in different forms, such as I, III, IV or very low [35,36,37,38,39,40, 42, 44, 73, 77, 81]. These criteria were based on JBI or GRADE, or adapted from SIGN or the framework of Pati D. [35,36,37,38,39,40,41,42,43,44,45, 85,86,87]. In 28% of guidelines (15/54), the grades of the recommendations supported by both qualitative and quantitative evidence were described in different ways, such as “strong”, “good”, “B”, “C” or “D” and “weak” [21, 22, 24, 25, 27, 28, 30,31,32,33,34, 73, 76, 77, 81]; these grades also followed JBI or GRADE, or were adapted from SIGN and (or) the framework of Pati D. However, no guidelines (0/10) described the grade of recommendations supported only by qualitative evidence.

Table 3 The methodology for qualitative research or evidence in the process of included guidelines development

Discussion

Our review shows that the majority of the included guidelines were of high quality. Qualitative evidence was mainly used to identify clinical questions, support recommendations, and consider facilitators of and barriers to the implementation of recommendations. However, the methodology still needs more attention: no experts proficient in qualitative research were involved in guideline development groups, the quality of qualitative evidence syntheses was not assessed, and the level of qualitative evidence and the grade of the corresponding recommendations were not reported in detail.

The summary findings of this review

The majority of the included guidelines introduced the overall aim of the guideline, the specific health questions, and the target population in tabulated form, in bold, or in separate paragraphs. They described the gathering and synthesis of the evidence, gave details of updating, and dealt with the language, structure, and format of the guideline recommendations. However, the guidelines still had some noticeable shortcomings. For instance, a few guidelines did not describe the methods used to formulate recommendations [74, 76, 82]; a few did not clearly introduce the different options for management of the conditions or health issues [76, 82]; and a minority did not give details of conflict of interest statements [75, 77, 78, 81, 82, 84]. In addition, although the majority of the guidelines stated that the guideline development group consisted of all relevant professional experts and clearly defined the guidelines’ target users, a number of developers did not consider the values and preferences of the target population [71, 78, 83, 84] or lacked adequate information on how they obtained the views of patients, doctors or other stakeholders. The majority of the guidelines also did not describe facilitators of and barriers to their application in detail.

The methodological quality of qualitative evidence affects the interpretation of its results. Unfortunately, while the majority of guideline developers used qualitative evidence syntheses to formulate recommendations, they did not appraise confidence in each individual review, which resulted in some difficulties in explaining relevant themes or theories formulated in different articles. In addition, only three of the grade systems used referred to single qualitative studies or syntheses of qualitative research as a level in the grade criteria of evidence and recommendation [35,36,37,38,39,40,41,42,43,44,45, 85,86,87]. The majority of guideline developers did not address the important influence of qualitative evidence on the grade criteria of evidence and recommendation.

Comparison of findings with prior research

Comparison with similar articles shows that a lack of conflict of interest statements, of details on how the views of patients, doctors or other stakeholders were obtained, and of consideration of facilitators of and barriers to guideline implementation are also common issues elsewhere, e.g. in oncology CPGs [88], inflammatory bowel disease guidelines [89], nursing CPGs [90] and guidelines for the management of cholangiocarcinoma [91]. Our review is the first to examine whether qualitative research or evidence had been used in the process of guideline development to obtain stakeholders’ values and preferences and to identify facilitators of and barriers to guideline implementation. Other researchers have also used qualitative research to explore practice gaps based on existing guidelines: Feyissa et al. used semi-structured interviews to assess contextual barriers and facilitators to the implementation of a guideline developed to reduce HIV-related stigma and discrimination (SAD) in the Ethiopian healthcare setting [92]; Lind et al. interviewed local politicians, chief medical officers and health professionals at acute care hospitals to investigate perceptions regarding guidelines for palliative care and to identify obstacles and opportunities for their implementation in acute care hospitals [93].

In addition, qualitative research is increasingly being recognised as having an important role to play in addressing questions relating to intervention or system complexity and to guideline development processes. Like us, other researchers have focused on the methodology of involving qualitative research in the guideline development process. Flemming et al. provided guidance on the choice of qualitative evidence synthesis methods in the context of guideline development for complex interventions, for example using best fit framework synthesis to address interactions between components of complex interventions, interactions of interventions with context, and multiple (health and non-health) outcomes, and using meta-ethnography to address the sociocultural acceptability of an intervention [94]. Moore et al. also put forward designs and methods for the applicability of quantitative and qualitative evidence in guidelines, covering complexity-related questions of interest in the guideline, types of synthesis used in the guideline, mixed-method review design and integration mechanisms, and observations, concerns and considerations [95].

Implications for guideline developers

The development of guidelines is a complex undertaking that requires a significant focus on methodology. Based on our findings, we put forward some proposals for guideline developers, which may help to improve the quality of their guidelines. Firstly, guideline developers can record and report details of how they reach agreement on recommendations and how they deal with possible disagreement when formulating recommendations, and present different options for the same clinical questions with information on the population characteristics or clinical situations for each option. Secondly, they can develop a series of methods to avoid potential conflicts of interest before the guideline development project begins. Thirdly, guideline developers may obtain the target population’s views by interviewing stakeholders or by extracting relevant themes from existing qualitative data on the topic of interest. Finally, before the guideline development project starts, guideline developers should formally consider how to evaluate and grade single qualitative studies or syntheses of qualitative research within the grade system for guideline development, and identify which factors influence the grade classification with the help of experts proficient in qualitative research. They should also select appropriate tools to appraise the quality of qualitative evidence, such as the CASP tool or the NICE checklist for primary studies, and GRADE-CERQual (Grading of Recommendations Assessment, Development and Evaluation-Confidence in the Evidence from Reviews of Qualitative research) for qualitative evidence syntheses. GRADE-CERQual is an approach for assessing how much confidence to place in findings from qualitative evidence syntheses in terms of four components (methodological limitations, coherence, adequacy of data, relevance) [13, 96].

Limitations and strengths

Our study has some potential limitations. Firstly, although we selected eligible guidelines by reading their text, references and relevant online attachments, we used a quick search strategy for guideline development. We also used the filter function in EndNote to manage literature retrieved from the databases. Given the size of the task, there may be selection bias because guidelines published in government documents, books or other guideline publication platforms were unavailable. Additionally, we did not specify how many guidelines were recommended, recommended with modifications, or not recommended, because the AGREE II protocol states that no overall score is calculated to determine whether a CPG is recommended and because the main focus of this article was the methodology for qualitative research or qualitative evidence used in guideline development [17]. Nonetheless, our study has several strengths. Firstly, a systematic literature search was performed to screen for eligible guidelines. Secondly, we discussed the potential effect of qualitative research or evidence on the AGREE II appraisal, and then put forward some suggestions on how to use qualitative research or evidence to improve the quality of future guidelines. Thirdly, this is the first attempt to systematically analyze the role of qualitative research or evidence in guideline development based on published guidelines.

Suggestions for ongoing research

Qualitative research or qualitative evidence will be extensively used in the guideline development process in the future. Three topics need further research. Firstly, where data are available, statistical methods such as non-parametric tests could be used to explore the potential association between AGREE II appraisal and the identification, incorporation and reporting of qualitative research, providing more reliable conclusions. Secondly, it will be interesting to compare the use of qualitative and quantitative data when formulating recommendations in guidelines, perhaps by matching guidelines on similar topics or key questions and comparing those that did and did not use qualitative evidence. Thirdly, exploring how qualitative research may be used to obtain information related to conflicts of interest will also be useful to inform guideline transparency. These topics are worthy of future exploration.

Conclusion

The majority of the included guidelines were of high quality. Qualitative evidence was mainly used to identify clinical questions, support recommendations, and consider facilitators of and barriers to the implementation of recommendations. However, more attention needs to be given to the methodology: no experts proficient in qualitative research were involved in guideline development groups, the quality of qualitative evidence syntheses was not assessed, and the level of qualitative evidence and the grade of the corresponding recommendations were not reported in detail.