Introduction

Prostate cancer (PCa) is one of the most frequently diagnosed cancers in men worldwide, and its prevalence continues to increase annually [1]. Thus, it is imperative to improve the diagnostic accuracy for PCa, particularly for clinically significant PCa (csPCa), which requires curative treatment and active monitoring, so as to reduce the mortality due to malignancy [2]. As a serum marker, prostate-specific antigen (PSA) is a common clinical screening index. However, numerous trials have confirmed that this approach carries a risk of over-diagnosis [3], since PSA is organ-specific but not cancer-specific. Clinically, patients with PSA levels > 10 ng/mL are highly suspected of having PCa and are generally referred for biopsy. In contrast, whether biopsies should be carried out in patients with PSA values in the range of 4–10 ng/mL, referred to as the “gray zone,” is still debated [4]. Notably, conducting biopsies in men with PSA in the “gray zone” may lead to over-diagnosis and over-treatment, as well as other negative effects, such as bleeding, genitourinary infections, and urinary retention [5].

With its morphological and various functional imaging modalities, multiparametric magnetic resonance imaging (mpMRI) of the prostate has been applied for the detection, localization, and staging of PCa, and for patient treatment planning [6]. Numerous studies have shown that using MRI prior to biopsy can diminish the detection of indolent prostate cancer while improving the accuracy of diagnosis for csPCa, thus leading to a reduction in unnecessary prostate biopsies [7, 8].

A meta-analysis published by Sathianathen et al indicated that the pooled negative predictive values (NPVs) of MRI for csPCa diagnosis, across different definitions of negative mpMRI and of csPCa, were satisfactory, ranging from 86.8 to 97.1%; this suggests that a negative MRI is of reliable value in ruling out csPCa [9]. However, that study did not analyze the efficacy of MRI in the context of the PSA gray zone, which is a challenging issue encountered in clinical practice. The diagnostic performance of prostate MRI in individuals with PSA levels of 4–10 ng/mL has been widely studied recently, albeit with high variability among centers.

Therefore, we conducted a comprehensive analysis of existing literature and performed a meta-analysis to investigate the performance of prostate MRI in patients with PSA levels of 4–10 ng/mL and explored the potential benefits of MRI in the management of patients in the PSA gray zone.

Materials and methods

This meta-analysis and systematic review (CRD: 42023473553) were reported in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [10].

Search strategy and selection criteria

Two authors systematically searched PubMed, Embase, Cochrane Library, Medline, and Web of Science for studies published from inception to October 31, 2023, with language restricted to English. The search strategy is detailed in Supplemental Materials (S-1). The titles and abstracts of all studies obtained through the search strategies were independently screened by two reviewers. The reviewers then read the full text of the articles to determine whether they appropriately satisfied the inclusion criteria.

Inclusion and exclusion criteria

Studies were considered eligible if they met the following criteria, applying the participants, intervention, control, outcomes, and study (PICOS) format.

  1. P: men with elevated PSA levels in the range of 4–10 ng/mL.
  2. I: patients who underwent MRI for assessing csPCa or PCa.
  3. C: pathological results from radical prostatectomy or biopsy taken as the reference standard.
  4. O: outcome indicators reflecting the true positive (TP), false positive (FP), false negative (FN), and true negative (TN), or the sensitivity and specificity of MRI diagnostic performance.
  5. S: original articles.

The exclusion criteria included the following:

  1. Articles that were unrelated to the field of interest of this study.
  2. A PSA level not in the range of 4–10 ng/mL.
  3. Data insufficient to construct a 2 × 2 table.
  4. Non-original articles such as editorials, case reports, narrative reviews, meta-analyses, or conference abstracts.
  5. Languages other than English or unavailable full text.
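As an illustration of why a complete 2 × 2 table is required for inclusion, the following minimal sketch derives the four accuracy measures from the TP, FP, FN, and TN counts of a single study. The class name `TwoByTwo` and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study's 2 x 2 table (MRI result vs. pathology)."""
    tp: int  # MRI positive, pathology positive
    fp: int  # MRI positive, pathology negative
    fn: int  # MRI negative, pathology positive
    tn: int  # MRI negative, pathology negative

    def sensitivity(self) -> float:
        # Proportion of diseased patients with a positive MRI.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased patients with a negative MRI.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of positive MRI results confirmed by pathology.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of negative MRI results confirmed by pathology.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=42, fp=12, fn=8, tn=38)
print(f"sensitivity={example.sensitivity():.2f}, "
      f"specificity={example.specificity():.2f}, "
      f"PPV={example.ppv():.2f}, NPV={example.npv():.2f}")
```

If a study reports only sensitivity and specificity without the number of diseased and non-diseased patients, the four cells cannot be recovered, which is why such studies fall under exclusion criterion 3.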

Data extraction and quality assessment

The following three types of information were extracted from the included articles: (1) demographic and clinical characteristics—i.e., number of patients, number of malignancies, age, and PSA level; (2) study characteristics such as publication year, study period, country, study design, blinding, Prostate Imaging Reporting and Data System (PI-RADS) version with its cutoff value, reference standard, and analysis (patient or zone); and (3) technical characteristics of MRI, such as magnet strength, vendor, MRI sequences, number of readers, readers’ experience, and coil. The quality of the diagnostic accuracy studies was assessed by implementing the Quality Assessment of Diagnostic Accuracy Studies-2 tool (QUADAS-2) [11].

The process described above was completed by two independent reviewers (L.L.X. and E.J.G., with 6 and 2 years of experience, respectively), and disagreements were resolved through discussion or by consulting a senior reviewer (S.H.).

Data synthesis and analysis

The heterogeneity of the results of the included studies was quantified using the I² statistic [12]. A Cochran’s Q test p value < 0.1 was taken to indicate significant heterogeneity. The summary estimates of sensitivity and specificity, the combined positive predictive value (PPV) and negative predictive value (NPV), and their corresponding 95% confidence intervals (CIs) were computed using the bivariate random-effects model [13]. A hierarchical summary receiver operating characteristics (HSROC) [14] curve with a 95% confidence region and prediction region was presented graphically to illustrate our results and to show the amount of variation between the studies. The presence of publication bias was assessed with the Deeks’ funnel plot, and statistical significance was determined by the Deeks’ asymmetry test [15].
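For readers unfamiliar with the heterogeneity measures, the sketch below shows the generic inverse-variance form of Cochran’s Q and the I² statistic, applied to logit-transformed sensitivities. It is an illustration under stated assumptions, not the routine used in Stata for this analysis: the function name `cochran_q_and_i2`, the variance approximation 1/TP + 1/FN, and the study counts are all hypothetical choices made for the example.

```python
import math


def logit(p: float) -> float:
    """Log-odds transform, commonly used before pooling proportions."""
    return math.log(p / (1.0 - p))


def cochran_q_and_i2(estimates, variances):
    """Return (Q, I2 in percent) for per-study estimates with known variances."""
    weights = [1.0 / v for v in variances]
    # Fixed-effect (inverse-variance) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I2: share of total variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical (TP, FN) pairs; zero cells would need a continuity correction.
studies = [(40, 10), (30, 20), (45, 5)]
logit_sens = [logit(tp / (tp + fn)) for tp, fn in studies]
var_logit = [1.0 / tp + 1.0 / fn for tp, fn in studies]

q, i2 = cochran_q_and_i2(logit_sens, var_logit)
print(f"Q = {q:.2f} on {len(studies) - 1} df, I2 = {i2:.0f}%")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-squared distribution with k − 1 degrees of freedom, where k is the number of studies.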

The following categories were used in the subgroup analysis to investigate the sources of heterogeneity in the detection of csPCa.

  1. 1.

    PI-RADS version (PI-RADS v2.1 vs. PI-RADS v2).

  2. 2.

    Sequence (mpMRI vs. biparametric magnetic resonance imaging [bpMRI]).

  3. 3.

    Standard reference (TRUS-guided systematic biopsy [TRUS-SB] combined with cognitive MRI fusion-guided targeted biopsy [CMF-TB] vs. TRUS-SB).

  4. 4.

    Study design (prospective vs. retrospective).

The performance of MRI in detecting PCa was the secondary objective, with the sensitivities and specificities pooled. We then performed subgroup analysis and meta-regression based on whether or not the PI-RADS assessment was applied.

Statistical analysis was performed using Stata 14.0 (StataCorp LLC, College Station, TX, USA), with p < 0.05 considered statistically significant. RevMan 5.3 (Cochrane Library) was used for the quality assessment.

Results

Literature search

The detailed study selection process is presented in Fig. 1. A total of 19 studies with 3879 participants that met the inclusion criteria were chosen for the final analysis [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,22, 28, 22,23, 28, 29, 31, 32, 34], six studies used 1.5-T scanners [20, 24,25,26,27, 30], one study used both [16]; one article did not provide relevant explanations [33]. The relevant elements are described in detail in the Supplemental Materials (S-4).

Table 3 MRI characteristics of the included studies

Quality assessment

In general, the quality of the studies was moderate, with 15 studies satisfying four out of seven items in the QUADAS-2 tool. The detailed quality assessment of the enrolled studies is depicted in Fig. S1, a detailed description is presented in the Supplemental Materials (S-5), and the specific evaluation results of each study are presented in Table S2.

Diagnostic performance of MRI for detection of csPCa

The pooled sensitivity of MRI for csPCa detection was 0.84 (95%CI, 0.79–0.88) and the pooled specificity was 0.76 (95%CI, 0.65–0.84) (Fig. 2a, b). The summary PPV and NPV were 0.62 (95%CI, 0.51–0.71) and 0.91 (95%CI, 0.87–0.93), respectively (Fig. 2c, d). The area under the HSROC curve was 0.88 (95%CI, 0.85–0.90) (Fig. 3). The Deeks’ funnel plot showed no evidence of publication bias, with a p value of 0.95 for the asymmetry test (Fig. 4). Heterogeneity was observed as indicated by the Cochran’s Q test (p < 0.01), with the I² statistic denoting substantial heterogeneity in both sensitivity (I² = 90%) and specificity (I² = 71%). The HSROC curve showed a marked difference between the 95% confidence region and the 95% prediction region, further highlighting the variability among the studies.

Fig. 2

Coupled forest plot of pooled sensitivity (a) and specificity (b). Coupled forest plot of pooled PPV (c) and NPV (d). Numbers are the pooled estimates with 95%CI in parentheses. Corresponding heterogeneity statistics are provided at the bottom right corners. Horizontal lines indicate 95%CI

Fig. 3

HSROC curve of diagnostic performance of MRI for csPCa detection

Fig. 4

Deeks’ funnel plot. The likelihood of publication bias was low, with a p value of 0.95 for the slope coefficient. ESS, effective sample size
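The quantities plotted in a Deeks’ funnel plot can be sketched as follows: each study contributes the log diagnostic odds ratio (ln DOR) on one axis and 1/√ESS on the other, and the asymmetry test examines the slope of a regression of ln DOR on 1/√ESS weighted by ESS. The code below is a minimal illustration, not the Stata routine used in this analysis. The function names are hypothetical, the 0.5 continuity correction is applied to every study as a simplifying assumption, and the p value of the slope (which requires a t distribution) is not computed.

```python
import math


def deeks_point(tp, fp, fn, tn):
    """Return (1/sqrt(ESS), ln(DOR), ESS) for one study."""
    # Assumption: add 0.5 to every cell so zero cells do not break the log.
    a, b, c, d = (v + 0.5 for v in (tp, fp, fn, tn))
    ln_dor = math.log((a * d) / (b * c))
    n_diseased, n_healthy = tp + fn, fp + tn
    # Effective sample size as defined for the Deeks' test.
    ess = 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)
    return 1.0 / math.sqrt(ess), ln_dor, ess


def weighted_slope(xs, ys, ws):
    """Weighted least-squares fit y = intercept + slope * x."""
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical (TP, FP, FN, TN) counts for illustration only.
studies = [(40, 12, 10, 38), (85, 30, 15, 170), (20, 5, 6, 19), (60, 25, 12, 103)]
points = [deeks_point(*s) for s in studies]
xs, ys, ws = zip(*points)
slope, intercept = weighted_slope(xs, ys, ws)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A slope close to zero, as reflected by a non-significant p value, suggests that small and large studies report similar diagnostic odds ratios.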

Subgroup analysis and meta-regression

The results of the subgroup analysis and meta-regression are presented in Table 4. The pooled sensitivities and specificities were significantly different between bpMRI and mpMRI (0.87 vs. 0.83, p < 0.01; 0.87 vs. 0.69, p < 0.01). The pooled sensitivity for PI-RADS v2.1 was significantly higher than that for PI-RADS v2 (0.88 vs. 0.81, p = 0.02). As for the standard reference, TRUS-SB in combination with CMF-TB yielded significantly higher sensitivity than TRUS-SB alone (0.88 vs. 0.77, p = 0.02). The pooled sensitivity of prospective studies was lower than that of retrospective studies (0.83 vs. 0.85, p = 0.01).

Table 4 Subgroup analysis of the diagnostic performance of MRI for csPCa detection

Diagnostic performance of MRI for detection of PCa

The pooled sensitivity and specificity of the 13 studies [18, 19, 23,24,25,26,27,28,29,30, 32,33,34] for PCa detection were 0.82 (95%CI, 0.75–0.87) and 0.74 (95%CI, 0.65–0.82), respectively, with the area under the HSROC curve of 0.85 (95%CI, 0.82–0.88) (Figs. S2 and S3). Detailed information on secondary outcomes is depicted in the Supplemental Materials (S-6). We further conducted subgroup analysis on the diagnostic performance of MRI in the diagnosis of prostate cancer according to whether PI-RADS was used. The results are provided in Table S1.

Discussion

In this meta-analysis, we investigated the diagnostic efficacy of MRI in the detection of both csPCa and PCa among patients with PSA levels between 4 and 10 ng/mL. Generally, MRI demonstrated a favorable diagnostic performance for csPCa detection, with the area under the HSROC curve, sensitivity, and specificity of 0.88 (0.85–0.90), 0.84 (0.79–0.88), and 0.76 (0.65–0.84), respectively. The pooled NPV of prostate MRI for csPCa detection was satisfactory with the value of 0.91 (0.87–0.93). Regarding PCa detection, the pooled sensitivity and specificity were 0.82 (0.75–0.87) and 0.74 (0.65–0.82), respectively.

The pooled NPV for csPCa detection obtained in our study was excellent, indicating that there is sufficient certainty to exclude non-csPCa patients when the MRI result is negative [9], which would lead to a reduction in unnecessary biopsies. A previous study showed that in patients with a PSA < 10 ng/mL, the median NPV of MRI for overall PCa was 86.3% (IQR, 73.3–93.6%), with a median cancer prevalence of 35.4% (IQR, 27.6–42.5%) [35]; this was similar to our results. Similarly, the study published by Xu et al showed that at a median PSA value of 4.65 (0.22–86.00) ng/mL and a csPCa prevalence of 42%, the NPV of mpMRI for csPCa detection was 87.8%, whereas the NPV of bpMRI for csPCa detection was 85.0% [36]. Although the NPV varied among the studies included in our analysis (ranging from 76 to 95%, potentially as a result of the heterogeneous prevalence among them, which ranged between 18 and 67%), this indicator was generally satisfactory and supported the benefits of MRI in reducing unnecessary biopsies.
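The dependence of predictive values on prevalence follows directly from Bayes’ theorem. As a rough illustration, the sketch below holds sensitivity and specificity fixed at the pooled estimates reported in the Results (0.84 and 0.76) and varies only the csPCa prevalence across the 18–67% range seen in the included studies. Holding accuracy constant across settings is a simplifying assumption, and the function names are hypothetical.

```python
def npv(sens: float, spec: float, prev: float) -> float:
    """Negative predictive value from sensitivity, specificity, prevalence."""
    true_neg = spec * (1.0 - prev)
    false_neg = (1.0 - sens) * prev
    return true_neg / (true_neg + false_neg)


def ppv(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


# Pooled csPCa estimates from the Results; assumed constant across settings.
SENS, SPEC = 0.84, 0.76

for prev in (0.18, 0.35, 0.50, 0.67):
    print(f"prevalence={prev:.2f}  "
          f"NPV={npv(SENS, SPEC, prev):.2f}  PPV={ppv(SENS, SPEC, prev):.2f}")
```

Under these assumptions the NPV falls from about 0.96 at a prevalence of 18% to about 0.70 at 67%, while the PPV moves in the opposite direction, so a negative MRI is most reassuring in low-prevalence settings.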

Compared with the excellent pooled NPV in our study, the pooled PPV of 0.66 (0.54–0.76) was less than ideal. Because prostate MRI is a screening tool and the clinical priority is not to miss any significant cancer, we paid more attention to the NPV of MRI screening. However, the PPV is also clinically important, as it describes whether a positive mpMRI consistently supports the presence of csPCa. Given the relatively low PPV, additional clinical information may need to be considered before a patient with a positive prostate MRI proceeds to biopsy. A review by Schoots et al argued that multivariable risk prediction tools (including clinical and biochemical parameters and MRI suspicion scores) have the potential to significantly reduce the number of biopsies and the detection of clinically insignificant prostate cancer; these tools can then assist doctors and patients in making appropriate biopsy decisions [37]. Among these parameters, PSA density (PSAD) is important in guiding biopsy decisions. In a systematic review by Wang et al, a quantitative risk assessment was performed that combined different PSAD cut-offs and MRI results to predict the occurrence of csPCa [38]. PSAD demonstrated complementary performance and predictive value, especially among patients with negative MRI and PI-RADS 3 or Likert 3 lesions. However, according to the research by Cuocolo et al, the diagnostic performance of bpMRI combined with PSAD did not demonstrate statistically significant improvement in all evaluation schemes [39]. In light of these findings, the role of PSAD remains to be further investigated. We envision that incorporating prostate MRI, clinical factors, and possible biomarkers into the biopsy decision-making process will enhance diagnostic accuracy and increase the confidence to avoid unnecessary biopsies in patients in the PSA gray zone.

Considerable heterogeneity was observed among the included studies regarding the PI-RADS versions, standard reference, and study design. Compared with three studies that entailed bpMRI, eight studies with mpMRI produced relatively low pooled sensitivity and specificity. Consistent with our results, in the study by Han et al [21], the area under the curve value for bpMRI in csPCa detection (0.86) was significantly higher than that for mpMRI (0.82). In addition, several published meta-analyses comparing the diagnostic effectiveness of bpMRI and mpMRI suggested that bpMRI exhibited performance comparable to that of mpMRI when diagnosing csPCa in men with any PSA level [36, 40]. Although dynamic contrast-enhanced (DCE) imaging is part of the standard mpMRI acquisition protocol, DCE images are of limited value in prostate cancer detection according to the PI-RADS recommendations. A study reported by Messina et al [41] suggested that upgrading peripheral lesions with a diffusion-weighted imaging (DWI) score of 3 to PI-RADS 4 due to a positive DCE negatively impacted the accuracy of MRI and decreased the true csPCa detection rate of PI-RADS 4 lesions. Nevertheless, DCE remains valuable in cases when DWI and/or T2-weighted imaging (T2WI) do not reach an adequate level of quality. The ESUR/ESUI expert panel has emphasized the importance of regularly monitoring and reporting MRI quality in clinical practice [42]. The PI-QUAL scoring system is a useful tool for standardized quality assessment and reporting, with a potential impact on patient care [43]. Ponsiglione et al found that the detection efficiency of extracapsular extension was significantly improved in high-quality mpMRI scans, with diagnostic accuracy improving from 0.564 in low-quality scans to 0.849 in high-quality scans (PI-QUAL ≥ 4) [44]. A study by Brembilla et al showed that in scans of suboptimal quality, the proportion of biopsies for PI-RADS 3 MRI rose by 18% while the detection rate of csPCa declined by 35%, confirming the potential impact of MRI scan quality on the performance of mpMRI relative to biopsy results [45].

The inherent disadvantages of setting bpMRI as a default approach should also be noted, including low reproducibility in community hospitals, lack of widely accepted imaging quality standards, and the impossibility of performing loco-regional staging [46]. However, early detection of csPCa by MRI is a priority. Given the analysis above, we postulate that bpMRI could serve as a potential substitute for mpMRI to optimize the clinical workup in men with PSA levels of 4–10 ng/mL, allowing us to avoid the extra expense and scan time as well as the side effects of contrast media. Moreover, there is an urgent need for the standardization of prostate bpMRI acquisition and reporting, and more robust validations of this imaging methodology should be carried out [47].

An additional significant factor that affected heterogeneity in the sensitivity was the standard reference. Eight studies using TRUS-SB combined with CMF-TB as the standard reference produced significantly higher sensitivity than the other three studies that did not incorporate CMF-TB. A comparative study conducted by Elkhoury et al [48] revealed that the clinically significant prostate cancer detection rate (CDR) by systematic biopsy was 15.7%, while the CDR using cognitive fusion biopsy was 33.3% on a per-core basis. The reason behind our outcome may have been the greater CDR of CMF-TB. CMF-TB involves an operator who is cognitively aware of the MRI interpretation and uses anatomic landmarks to target suspicious lesions on real-time transrectal ultrasound (TRUS) [49]. The measured diagnostic performance of prostate MRI thus improves as the cancer detection rate of the reference standard increases.

In a relevant review, Li et al systematically evaluated the effectiveness of MRI and magnetic resonance spectroscopy (MRS) in detecting PCa and csPCa before biopsy in patients with PSA levels in the gray zone, as well as their applications in guiding prostate biopsy [50]. As one of the sequences of mpMRI, MRS imaging (MRSI) enables noninvasive assessment of certain metabolites in the prostate gland; however, it is not recommended in the latest version of PI-RADS. Thus, we did not include articles about MRS. In addition, our study is an adjunct to clinical practice, not only summarizing the diagnostic efficacy of prostate MRI for csPCa in patients in the PSA gray zone, but also further assessing the impact of prostate MRI on decision-making for these patients.

Our study has several limitations. First, the majority of the studies regarding csPCa detection were retrospective in design, which may have introduced some bias in the patient selection domain. Second, there was significant heterogeneity among the studies, which affects the general applicability of our summary estimates. We conducted subgroup analyses and meta-regression to explore the potential factors underlying the heterogeneity, but a portion of the heterogeneity remained unexplained. Finally, even though our study showed that MRI screening for csPCa detection in men with PSA levels of 4–10 ng/mL exhibited satisfactory diagnostic performance with good NPV and moderate PPV, prevalence is an influencing factor that should be taken into consideration when devising clinical decision strategies.

Conclusion

In this study, MRI was found to be a reliable and satisfactory tool to guide clinical decisions for patients with PSA in the “gray zone,” particularly for csPCa detection. Furthermore, the high NPV of prostate MRI for csPCa detection indicates that a negative MRI can reliably rule out csPCa, sparing patients an unnecessary biopsy.