Introduction

With the growing demand for medical care, health spending and pharmaceutical expenditure have increased rapidly over the past few decades.

Basic information on the included studies [37,38,39,40,41,54,55,56,57,58,59,60,61,62,63,64,74,75,76,77,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106] is presented in Table 1. These studies involved 20 drugs (14 targeted therapies, 5 biologics, and 1 chemotherapy) out of 42 negotiated oncology drugs. In terms of the number of authors, 21 studies (26.25%) had 1 to 3 authors, 36 (45.00%) had 4 to 6 authors, and 23 (28.75%) had more than 6 authors. These studies were published between 2009 and 2021. Figure 2 shows that the number of published studies grew during this period, with more publications in recent years: 15 studies (18.75%) were published between 2009 and 2015, 31 (38.75%) between 2016 and 2018, and 34 (42.5%) after 2018. Most studies were published in English in international journals (46 of 80 articles, 57.5%). Most first authors were affiliated with hospitals (62, 77.5%), and most (63.75%) were from the eastern region, followed by the western region (22.5%).

Fig. 1 Flowchart of search results and screening process

Fig. 2 Number of articles by year

Table 1 Basic information of the included studies

Most of these studies (83.75%) conducted a cost-utility analysis, and only one conducted a cost-minimization analysis. Most studies (82.5%) used modeling for their analyses; 23.75% took a health insurance perspective and 47.5% a healthcare system perspective. In addition, about two-fifths of the studies (38.75%) used a time horizon of 6 to 10 years, 20% used 1 to 5 years, and 18.75% used more than 10 years. The government was the most common source of funding (40%). Non-small cell lung cancer (46.25%), colorectal cancer (16.25%), and hepatocellular carcinoma (10%) were the most common tumor types studied.

Figure 3 shows the proportion of included studies that reported each CHEERS item fully, partially, or not at all. For several items, fewer than half of the studies obtained full points: abstract; time horizon; discount rate; estimating resources and costs; currency, price date, and conversion; choice of model; heterogeneity; and discussion. In contrast, over 90% of the studies gave a clear title and stated the setting and location, the study perspective used to estimate costs, and outcome indicators appropriate to the type of economic evaluation.

Fig. 3 Reporting quality of publications by item of the CHEERS checklist

Overall, the average CHEERS score of all articles was 17.68 ± 3.41, with a range of 9.5 to 22.5 (Supplement Table 1). After conversion to the 0-100 scale shown in Table 2, the average score of all articles was 74.63 ± 12.75 (range, 43.48–93.75). The average scores for the six main categories of the CHEERS checklist (title and abstract, introduction, methods, results, discussion, and other) were 85.00 (SD = 15.20), 80.00 (SD = 24.65), 76.97 (SD = 11.63), 69.69 (SD = 19.87), 67.50 (SD = 24.00), and 59.69 (SD = 42.93), respectively. The title and abstract category had the highest mean reporting score (85.00), followed by the introduction (80.00). In contrast, the "other" category, which covers the source of funding and conflicts of interest items, had the lowest mean score.
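The conversion from a raw CHEERS score to the 0-100 scale can be illustrated with a minimal sketch. The scoring rule below is an assumption, since this section does not state it: each item is taken to score 1 (fully reported), 0.5 (partially reported), or 0 (not reported), items judged not applicable are excluded from the denominator, and the function name `cheers_percent` and the item vectors are hypothetical.

```python
from typing import Optional, Sequence


def cheers_percent(item_scores: Sequence[Optional[float]]) -> float:
    """Convert per-item CHEERS scores to a 0-100 scale.

    Assumed rule (not confirmed by the text): items score 1, 0.5, or 0,
    and None marks an item that is not applicable and is left out of
    the denominator.
    """
    applicable = [s for s in item_scores if s is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    return 100.0 * sum(applicable) / len(applicable)


# Hypothetical study A: 24 applicable items, 21 full and 3 partial (raw 22.5).
study_a = [1.0] * 21 + [0.5] * 3

# Hypothetical study B: 23 applicable items (raw 10.0) and 1 not applicable.
study_b = [1.0] * 9 + [0.5] * 2 + [0.0] * 12 + [None]

print(cheers_percent(study_a))            # 93.75
print(round(cheers_percent(study_b), 2))  # 43.48
```

Under this assumed rule, a raw score of 22.5 on 24 applicable items gives 93.75, which agrees with the upper end of the reported range. That agreement is consistent with the assumption but does not confirm it.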

Table 2 Reporting quality scores of included studies based on CHEERS checklist

Table 3 shows the CHEERS scores of all articles by article characteristics. Articles published in Chinese scored significantly lower than those published in English (P < 0.001). There was a significant rising trend in reporting quality scores over time: 68.81 (± 12.12) between 2009 and 2015, 73.69 (± 14.15) between 2016 and 2018, and 78.06 (± 10.78) after 2018 (P for trend = 0.045). Regarding the number of authors, articles with fewer authors received lower scores than those with more than six authors (P = 0.013). Regarding the type of economic evaluation, the mean score of articles reporting a CUA was 78.43 (± 9.27), significantly higher than that of articles reporting a CEA or CMA (P < 0.001). The mean score of articles that used a modeling design was 78.47 (± 9.12), significantly higher than that of articles using a prospective study design (49.40 ± 3.00) or a retrospective study design (59.39 ± 13.14) (P < 0.001). Studies that used a longer time horizon had higher scores than those with a time horizon of less than one year (P < 0.001). In terms of funding, studies funded by the pharmaceutical industry had the highest mean score (85.16 ± 6.34), followed by those funded by the government (79.58 ± 10.11), and studies that did not mention a funding source had the lowest (63.72 ± 10.55) (P < 0.001). Mean scores also differed significantly by study perspective, and studies that did not state a perspective had the lowest scores (P < 0.001).

Table 3 Univariate analysis of reporting quality scores for included studies

Table 4 reports the factors influencing the CHEERS scores of the included studies, based on regression analysis. Higher scores were associated with publication between 2019 and 2021 (P < 0.05) and with publication in English (P < 0.01). Not disclosing the source of funding and not stating the study perspective were each significantly associated with lower scores (P < 0.05).

Table 4 Influencing factors of reporting quality scores from linear regression model

Discussion

To the best of our knowledge, this is the first systematic review to examine the reporting quality of economic evaluation studies of the negotiated oncology drugs included in China's 2020 NRDL. A drug price negotiation package should include an economic evaluation study, and transparent, clear reporting and high-quality studies are essential for supporting decision-making in this process [107]. However, the Chinese National Health Insurance Administration does not disclose drug price negotiation dossiers, including the economic evaluation evidence provided by manufacturers. We therefore reviewed the currently available publications on this topic as a proxy for the economic evaluation evidence used in negotiations and assessed their reporting quality using CHEERS, in the hope of contributing to future renegotiations. The CHEERS checklist is one of the three most widely used quality assessment tools in pharmacoeconomic systematic reviews [108]. Many systematic reviews have used this checklist for quantitative assessment of economic evaluations since its publication [108, 109].

The overall mean reporting quality score of economic evaluations in the present study was 17.68, with scores ranging from 9.5 to 22.5, which indicates less than 75% adherence to the CHEERS 2013 checklist. This score is close to the reporting quality scores of health economic evaluation research in India, Myanmar, Cambodia, and Laos, which ranged between 17 and 17.8 [110]. An earlier study of pharmacoeconomic research published in China from 2003 to 2014 reported a mean score of 18.7 using the same CHEERS checklist, 1.02 points higher than the score in our study [20]. The study by Jiehua Cheng et al. showed that the average quality score of the included studies in China from 2006 to 2015 was 56.59 ± 16.90 [25], lower than the 74.63 ± 12.75 found in our study. The reporting quality of published economic evaluations of the negotiated oncology drugs in China's 2020 NRDL may therefore have improved, but it remains lower than that reported in some other studies.

The CHEERS scores can be divided into three categories: high quality for scores over 75, medium quality for scores between 50 and 75, and low quality for scores below 50 [111, 25]. The lower scores of Chinese-language articles may be because the editors of Chinese journals did not require authors to report economic evaluations in a standardized way or to supplement missing details. For Chinese publications, authors could be required to report each part of the economic evaluation in accordance with the China Guidelines for Pharmacoeconomic Evaluations. Furthermore, a Chinese standard checklist, similar to CHEERS, could be developed to assess Chinese economic evaluation studies.
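The three quality categories described above can be written as a small classification function. This is only a sketch: the function name `quality_category` is hypothetical, and the handling of scores exactly equal to 50 or 75 (both treated as medium here) is an assumption, because the text says only "over 75", "between 50 and 75", and "below 50".

```python
def quality_category(score: float) -> str:
    """Classify a 0-100 CHEERS score as high, medium, or low quality.

    Thresholds follow the text: over 75 is high, 50 to 75 is medium,
    below 50 is low. Treating exactly 50 and exactly 75 as medium is an
    assumption made for this sketch.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must lie on the 0-100 scale")
    if score > 75:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


# Mean scores reported in this review, used as example inputs.
print(quality_category(74.63))  # overall mean -> medium
print(quality_category(85.16))  # industry-funded studies -> high
print(quality_category(49.40))  # prospective study designs -> low
```

Applied to the overall mean of 74.63, the function returns the medium category, which matches the description of reporting quality as moderate.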

This study has some limitations. Firstly, CHEERS was intended for qualitative evaluation of the reporting quality of studies and provides no specific rules for quantitative assessment [23, 119]. We may therefore have introduced a bias against publications that were not required to follow the CHEERS guideline. Secondly, some studies were published before CHEERS itself was published. In addition, the updated CHEERS 2022 was not used because it had not been published when our study was completed. Compared with CHEERS 2013, the 2022 version contains additional content on patients or service recipients, the general public, and community or stakeholder involvement and engagement; on the reporting and availability of a health economic analysis plan; and on the description of distributional effects, among others [120]. These items were also largely unreported in the included studies. Finally, this study assessed only the reporting quality of the included economic evaluation studies. Although reporting quality does not represent the quality of the economic evaluation results, it is still important to the decision-making process.

Conclusion

This study reveals moderate reporting quality of economic evaluations of the negotiated oncology drugs listed in the 2020 NRDL. The number and reporting quality of economic evaluations of negotiated oncology drugs in mainland China have improved. However, most studies, especially those published in Chinese, do not fully report the CHEERS items, which significantly reduces their transparency. The reporting quality of economic evaluations conducted in mainland China should therefore continue to improve. Chinese journals may also explore introducing a reporting standard for economic evaluations that is not based solely on the CHEERS checklist.