Introduction/background

Gastric cancer (GC) is an aggressive tumor and remains the third leading cause of cancer-related mortality worldwide [1].

Although The Cancer Genome Atlas (TCGA) has recently enabled a comprehensive molecular characterization of GC, heralding potential targets for future targeted therapy, at present the therapeutic options and the prognosis of GC patients still depend on the tumor stage and on the pattern of disease spread [2, 3].

Endoscopic therapy, surgery, systemic chemotherapy, targeted therapy, radiotherapy, and loco-regional treatments are available within multimodal and multidisciplinary approaches, according to the stage and phase of the disease [4,5,6].

In particular, endoscopic mucosal resection/submucosal dissection is the preferred approach for very early, superficial cancers (T1a), whereas surgical resection is considered the treatment of choice for early-stage cancers not suitable for endoscopic resection [7].

Total or distal gastrectomy (depending on the site of the tumor) with regional D2 lymph node dissection, combined with neoadjuvant chemotherapy, represents the standard treatment for locally advanced GC (≥ T3, any N, or ≥ T2, N+) in Western countries [8,9,10,11,12,13]. Neoadjuvant therapy (either pre-operative or peri-operative) can improve the R0 surgical resection rate, reduce distant metastasis and recurrence rates, and improve patient survival through tumor downstaging [1, 8, 14, 15].

In advanced unresectable/metastatic GC (35–40% of cases at the time of first diagnosis), chemotherapy is still considered the standard treatment [1, 16]. In the last decade, the use of locoregional treatments has also been increasing: in selected cases with peritoneal carcinomatosis (PC), radical gastrectomy combined with cytoreductive surgery and hyperthermic intraperitoneal chemotherapy (HIPEC) can be performed, with significant advantages in overall survival and peritoneal recurrence rates [17, 18]. Furthermore, the introduction of new anticancer agents and the development of polychemotherapy regimens have made macroscopically complete curative resection possible in some patients whose GC was metastatic or unresectable before therapy. This type of surgery, known as “conversion” surgery, is defined as surgical treatment aiming at R0 resection in initially unresectable GC patients after response to chemotherapy [19,20,21].

It is therefore crucial to correctly define the clinical stage of the disease and the pathways of dissemination (lymph nodal, hepatic, peritoneal), including the distribution and burden of the disease, in order to choose the most effective therapeutic path (up-front surgery versus chemotherapy, whether for neoadjuvant, palliative, or conversion purposes), which strongly impacts the prognosis of these patients. Furthermore, it is important to assess the response to neoadjuvant chemotherapy in order to properly choose the timing of surgery, as well as the response to chemotherapy in metastatic or unresectable cancers when considering conversion surgery.

Most of the decision-making process described above is driven by imaging and, in particular, by CT, which represents the workhorse of the routine imaging of GC patients [22, 23].

In this scenario, and in the era of multidisciplinary and personalized medicine, the radiology report plays a key role in correctly directing the treatment pathway and remains the main means of communication with clinicians; furthermore, the need for a uniform and standardized reporting scheme and language in oncologic imaging has been endorsed by major scientific societies [24]. In the present study, the Italian Society of Medical and Interventional Radiology (SIRM) and the Italian Research Group for Gastric Cancer (GIRCG) promoted a critical shared discussion between radiologists and clinicians (surgical oncologists), both experienced in GC, by means of a multi-round, consensus-building Delphi exercise, to develop a comprehensive, focused, structured reporting template for CT of patients with GC.

The objectives of the study were as follows: (1) to develop a comprehensive, focused, structured reporting template for CT of patients with GC (including esophago-gastric junction tumors redefined as gastric cancer in the 8th edition of the TNM classification [25]), taking into account the most relevant parameters from the point of view of surgical oncologists, and to assess the agreement among experts on the proposed criteria; (2) to standardize the CT report for GC so that it can be used by the radiologist or the multidisciplinary team in high-volume and reference centers for the treatment of GC; (3) to provide radiology residents with a comprehensive and organized view of the GC report for educational purposes.

Methods and materials

Writing committee

Initially, a group of coordinators, composed of a radiologist (M.A.M.), a radiology resident (G.B.), and a student intern in radiology (I.C.), supported by a statistician (F.F.), conducted a literature search on various platforms regarding the Delphi method, structured reporting, and CT for GC.

A four-member writing committee composed of a radiologist (M.A.M.), two surgical oncologists (G.M., D.M.), and an oncologist (R.P.), all with decades of experience in the diagnosis and treatment of GC, proposed a total of 24 “Delphi items”. These items were organized according to the broad categories of a structured report suggested by the European Society of Radiology [26] (clinical referral, technique, findings, conclusion, and advice) and grouped into three “CT report sections” according to the “diagnostic phase” of the radiological assessment of the oncologic patient: (1) staging (CT performed at the time of first diagnosis and before any treatment, to obtain a clinical TNM, “cTNM”), 9 items; (2) restaging (CT performed after a non-radical therapy, including neoadjuvant therapy, to obtain a y-clinical TNM, “ycTNM”), 9 items; (3) follow-up (CT performed after a radical therapy), 6 items. This structure mirrors the new TNM staging edition, which provides separate classifications applicable to therapeutic strategy: clinical staging cTNM (prior to any treatment), pathological staging pTNM (after upfront surgery), and neoadjuvant pathological staging ypTNM (after neoadjuvant treatment followed by surgery) [25].
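For illustration only, the overall organization of the template can be sketched as a simple data structure; the Python layout below is our own schematic (the final wording of the items is the one reported in Tables 2, 3, 4, and 5), not part of the consensus output.

```python
# Schematic sketch (not the official template): three CT report sections,
# each organized under the ESR broad categories of a structured report.
ESR_CATEGORIES = ["clinical referral", "technique", "findings", "conclusion", "advice"]

CT_REPORT_SECTIONS = {
    "staging":   {"purpose": "first diagnosis, before any treatment (cTNM)", "n_items": 9},
    "restaging": {"purpose": "after non-radical therapy, incl. neoadjuvant (ycTNM)", "n_items": 9},
    "follow-up": {"purpose": "after radical therapy", "n_items": 6},
}

for section, info in CT_REPORT_SECTIONS.items():
    print(f"{section}: {info['n_items']} items - {info['purpose']}; "
          f"categories: {', '.join(ESR_CATEGORIES)}")
```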

The 24 items were discussed by the writing committee both by e-mail and through in-person and telephone meetings, until a shared agreement was reached. Points within individual statements that did not reach complete agreement were included as additional suggestions.

Panel composition

An “expert panel” was then set up. In accordance with the literature [27], the adequate number of expert panel members for the purpose of the study was set at 20, evenly split between 10 radiologists and 10 surgical oncologists, recommended by the presidents of SIRM and GIRCG (R.G. and F.R.), respectively. Invitations were individually e-mailed to the selected specialists, and anonymity was guaranteed throughout the entire Delphi process (during all three rounds). All 20 invited experts accepted, for a positive response rate of 100%. The Delphi survey consisted of 3 different rounds. The writing committee members did not participate in the Delphi survey.

Questionnaires and Delphi iterations

Three rounds of questionnaires were sent out, and after each round the anonymous responses of the expert panel were aggregated and anonymously shared with the group by the coordinators as feedback. The coordinators also set the thresholds and goals required to reach adequate consensus on the proposed items and to include additional suggestions in a statement in the following round (Table 1). Only the items that did not reach adequate consensus during a round were reformulated according to the free comments/additional suggestions proposed in the same round by the expert panel and then resubmitted for voting in the next round.

Table 1 Thresholds established to evaluate the items during the round iterations

Each round was administered through the Google Forms survey platform. Panelists were allowed a maximum of 3 days to respond, with a gap of 15 days between response collection and the next round; the coordinators had one week to send the feedback. Due to the COVID-19 emergency, the interval between the second and third rounds was extended to 90 days.

Figure 1 reports a detailed scheme of how the Delphi rounds and iterations were organized and operated.

Figure 1

Delphi Iterations. The figure reports a scheme of how the Delphi rounds were organized and operated. Y = yes; N = no

In the first round (round 1), the original statements formulated by the writing committee were evaluated by the expert panel using a Likert scale ranging from 1 to 4, whereas the additional suggestions (statements that did not reach complete agreement in the pre-round phase) were voted on using the numeric range thresholds reported in Table 1. In round 1, the members of the expert panel were also given the possibility to propose additional suggestions and add free comments.

Free comments suggested by the expert panel and collected in the first round, together with the writing committee’s additional suggestions that reached appropriate consensus during round 1 (≥ 15 out of 20 votes), were considered directly in the formulation of the round 2 items. Conversely, the additional suggestions proposed by the expert panel in round 1 and the writing committee’s additional suggestions that reached only intermediate consensus during round 1 (10–14 votes) were re-tabled for voting in round 2. In round 2 and round 3, an agreement scale ranging from 1 to 10 was adopted.
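The handling of an additional suggestion after round 1 can be summarized by the simple decision rule sketched below; this is an illustrative Python snippet based on the vote bands described above (the function name is ours, and the handling of suggestions below 10 votes is an assumption, since it is not detailed here).

```python
# Illustrative decision rule for round-1 additional suggestions (names are ours).
def triage_suggestion(approval_votes: int) -> str:
    """Classify an additional suggestion by its round-1 approval votes (out of 20)."""
    if approval_votes >= 15:          # appropriate consensus: folded into the round 2 items
        return "incorporate into round 2 items"
    if 10 <= approval_votes <= 14:    # intermediate consensus: re-tabled for voting in round 2
        return "re-table for vote in round 2"
    return "set aside"                # below 10 votes: assumed not carried forward (not specified in the text)

print(triage_suggestion(17))  # incorporate into round 2 items
print(triage_suggestion(12))  # re-table for vote in round 2
```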

Statistical analysis

The Delphi rounds were conducted using a 4-point Likert scale in round 1 and a 9-point scale in rounds 2 and 3. Data were analyzed in terms of consensus, agreement, and stability in all rounds except round 1, where the 4-point Likert scale was used to make reading easier for the experts and let them concentrate on the contents. Consensus, intended as the degree of accordance between experts, was expressed as the interquartile range (IQR), whereas agreement, intended as the degree of accordance with the statements, was expressed as the median. Finally, stability, intended as the coherence of the subjects’ responses across successive rounds, was evaluated through a Wilcoxon matched-pairs signed-rank test (p < 0.05 indicating lack of stability).
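As an illustration of how the three metrics can be computed for a single item, the Python sketch below uses NumPy/SciPy with hypothetical panel scores; it is not the authors’ analysis code, and the variable names and example values are ours.

```python
# Minimal sketch of the Delphi metrics for one item (hypothetical scores, 20 panelists).
import numpy as np
from scipy.stats import wilcoxon

round2_scores = np.array([8, 9, 7, 8, 9, 9, 8, 7, 9, 8, 8, 9, 7, 8, 9, 9, 8, 7, 9, 8])
round3_scores = np.array([8, 9, 8, 8, 9, 9, 8, 8, 9, 8, 8, 9, 8, 8, 9, 9, 8, 8, 9, 8])

agreement = np.median(round3_scores)                 # agreement: median rating
q1, q3 = np.percentile(round3_scores, [25, 75])
consensus = q3 - q1                                   # consensus: interquartile range (IQR)
_, p_value = wilcoxon(round2_scores, round3_scores)   # stability: Wilcoxon matched-pairs signed-rank test
stable = p_value >= 0.05                              # p < 0.05 would indicate lack of stability

print(f"agreement (median) = {agreement}, consensus (IQR) = {consensus}, "
      f"stable = {stable} (p = {p_value:.3f})")
```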

Results

The response rate of the expert panel in rounds 1 and 2 was 100%, with the participation of 10 out of 10 radiologists and 10 out of 10 oncological surgeons. In round 3, however, the response rate decreased to 95%, with 9 out of 10 oncological surgeons responding. Figure 2 summarizes the number of items that reached the fixed thresholds used during the Delphi iteration for each round.

Figure 2

The number of items that reached the fixed thresholds during the round iterations

Round 1

During round 1, 10 out of 24 items did not reach the threshold required to be carried unchanged into the next round (80% of the expert panel ratings equal to 4 on the Likert scale).

The “clinical referral” item in all three sections of the CT report (staging, restaging, and follow-up) did not reach the threshold of ≥ 80% Likert 4/4 in round 1 and was considered not clearly understandable by the expert panel in the free comments; for this reason, during data elaboration and item reformulation for round 2, the item was split into two sub-items in each section of the CT report: the former listing the “clinical referrals” to be included in the final version of the report, and the latter listing the “clinical information” the radiologist should know before performing the CT examination (Table 2).

Table 2 Clinical information the radiologist should know before performing the CT examination

Round 2

Since the clinical referral items of each section of the CT report (3 items: staging, restaging, and follow-up) were reformulated for round 2, it was not possible to evaluate their agreement and consensus during this round. Round 2 was therefore composed of 21 items to be voted on. Among these 21 items, only the “conclusion” item of the follow-up section of the CT report did not reach the threshold (agreement: median ≥ 8; consensus: IQR ≤ 2).

Regarding the “clinical referral” items, for each section of the CT report a large cluster of statements was proposed to the expert panel during round 2. The panel was asked to vote on the appropriateness of each statement in order to assign it either to the sub-item “clinical referrals” to be included in the final version of the report or to the sub-item “clinical information” the radiologist should know before performing the CT examination; statements that did not reach at least 15 out of 20 approval votes were excluded from the “clinical referrals”.

Round 3

Tables 2, 3, 4, and 5 report the final items to be included in the CT report and the results of the Delphi iterations in terms of agreement, consensus, and stability.

Table 3 Final version of the items and results for the staging section of the CT report
Table 4 Final version of the items and results for the restaging section of the CT report
Table 5 Final version of the items and results for the follow-up section of the CT report

Round 3 was composed of 24 items to be voted on. It was not possible to evaluate stability for 4 items (the “clinical referral” items of each section of the CT report and the “conclusion” item of the follow-up section), since they were reformulated in round 2.

Two items did not reach the fixed thresholds: (1) the “conclusion” item in the staging section did not reach the consensus goal (IQR > 2); (2) the “lymph node status” item in the restaging section did not reach sufficient stability between round 2 and round 3 (p < 0.05).

The “conclusion” item of each section of the CT report showed the lowest degree of agreement and consensus (although it was sufficient in the majority of cases); in more detail, the surgical oncologists showed a lower degree of accordance than the radiologists in each of the three sections (staging 7 ± 3 vs 8 ± 1, restaging 7 ± 3 vs 8 ± 1, follow-up 8 ± 3 vs 8 ± 1).

Discussion

GC is a severe disease, often diagnosed at an advanced stage in Western countries, which can benefit from aggressive, multimodal treatment requiring a complex decision-making process that is largely driven by imaging and, in particular, by CT [28].

At the time of staging, imaging aims to separate patients who can benefit from up-front surgery from those who need chemotherapy, whether for neoadjuvant, conversion, or palliative purposes. Imaging is also crucial at the time of restaging, after neoadjuvant or conversion therapy, in order to anticipate the yc-stage before surgery, thus making it possible to steer the patient’s subsequent treatment from a surgical approach alone toward a possible multimodality approach [28,29,30,31].

Considering the complexity of GC patient management, and in the era of personalized therapy, it is essential that the patient is referred for diagnostic and therapeutic work-up to a multidisciplinary team (MDT), through collaboration between the radiologist and the medical and surgical oncologists. A prerequisite for this is the expertise of the medical staff in the many aspects and issues of GC [4].

An international survey by the European Society of Oncologic Imaging revealed that most radiologists attend MDT meetings, but fewer than half of them review clinical images in advance because of time constraints. The time required for a radiologist to review the images of a case he or she reported differs from that needed to review a case reported by another colleague, or multiple examinations performed in different hospitals. Furthermore, radiologists are sometimes required to write an additional report, which could influence the clinical decision-making for the patient, regardless of whether their opinion agrees with the previous report [32].

For this reason, a structured radiological report, shared between radiologists and clinicians (surgical oncologists), may reduce the radiologist’s review time and could improve communication between the radiologist and the other members of the MDT.

It is also crucial to note that the written radiological report is part of the patient’s permanent health record, and, as also stated by the European Society of Radiology (ESR), “the appropriate construction, clarity, and clinical focus of a radiological report are essential to high-quality patient care” [26].

Moreover, the use of structured reporting is spreading in oncology because of its many advantages: the standardization of content allows comparison of CT examinations performed at different diagnostic times by different radiologists and hospitals, and makes it possible to produce a CT report that contains all the answers to the clinicians’ questions at the different diagnostic times (staging, restaging, and follow-up).

According to the ESR good practice for radiological reporting, the broad categories of a structured report can be summarized as (a) clinical referral, (b) technique, (c) findings, (d) conclusion, and (e) advice, each of which is adequately described in the same document [26].

Regarding the “clinical referral” category, in round 1 of our survey there was no agreement on which clinical information should be provided by clinicians to radiologists and which should be included by radiologists in each section of the CT report (staging, restaging, and follow-up); thus, the clinical information to be included in the report was separated from the clinical information to be provided to radiologists, selecting only the clinical referrals deemed necessary (essential) and promoting these as clinical referrals to be included in the report, thus making them easily accessible to all members of the MDT [32].

Regarding the “technique” category, in GC an appropriate CT acquisition technique is at least as important as a correct description of the findings. Distention and hypotonization of the gastric wall, together with a late arterial contrast-enhanced phase, high kVp and mA, and thin slice thickness, are essential technical requirements to improve the accuracy of T staging [14]. Similarly, an equilibrium phase, preferably acquired with the dual-energy technique, a technology that is rapidly emerging in oncology, can reveal PC without ascites or allow the differential diagnosis between PC and fibrosis after treatment (Fig. 3) [33,34,35,36,37,38]. Both radiologists and surgical oncologists considered it essential to include the technique in the CT report, with very good agreement in all rounds for all sections of the structured report.

Figure 3

a, b Peritoneal recurrence in a 54-year-old man with diffuse GC, who underwent cytoreductive surgery and HIPEC. Monoenergetic images at 40 keV (a) show conspicuous vascularization of a PC nodule (arrowhead) in the left external iliac region, next to an area of fibrosis (arrow) and without ascites. It is not possible to distinguish between PC and fibrosis on standard 140 kVp images (b)

The “findings” category in the staging and restaging sections was subdivided into 6 sub-categories to better organize the CT report: T-parameter, N-parameter, peritoneal carcinomatosis, liver metastases, other metastases, and useful information for surgeons (see the checklist sketched below). Conversely, the conclusion and advice categories were merged into a single category, which did not initially reach agreement (Tables 3, 4, and 5). In particular, in round 1, the surgeons did not accept that radiologists give suggestions about the patient’s subsequent management or therapeutic approach. The debate on this issue stems from the fact that, if a suggestion about the type of treatment is made explicit in the CT report and the surgeon makes a different choice (for example, dictated by the patient’s comorbidities or poor compliance not fully known to the radiologist), the disagreement can lead to medico-legal problems in case of a negative outcome for that patient. It is therefore desirable that the main decisions are always taken by the MDT. The controversy was resolved in round 2: “the radiologist should promote/recommend the discussion of the clinical case within the MDT” is the shared advice for the staging and restaging sections of the CT report, whereas in the follow-up section the radiologist should suggest the most feasible type and site of biopsy when disease relapse is suspected [39,40,41,42].
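As a reading aid, the six “findings” sub-categories agreed for the staging and restaging sections can be laid out as a simple checklist; the Python sketch below is illustrative only, and the agreed wording of each item is the one reported in Tables 3, 4, and 5.

```python
# Illustrative checklist of the "findings" sub-categories for the staging and
# restaging sections of the CT report (final wording in Tables 3 and 4).
FINDINGS_SUBCATEGORIES = [
    "T-parameter",
    "N-parameter",
    "peritoneal carcinomatosis",
    "liver metastases",
    "other metastases",
    "useful information for surgeons",
]

def findings_checklist(section):
    """Return the findings checklist for a CT report section (staging/restaging only here)."""
    if section in ("staging", "restaging"):
        return list(FINDINGS_SUBCATEGORIES)
    raise ValueError("this sketch defines the checklist only for staging and restaging")

print(findings_checklist("staging"))
```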

Some reflections on the sustainability of the CT report were carried out by the coordinators in the initial phase of the study. In particular, the proper balance between scientific completeness and practicability of the report was carefully considered while drafting the CT report itself. Nevertheless, the formulation of a structured report for GC patients, shared by surgical and medical oncologists and radiologists, is extremely useful in order to obtain an appropriate, clearer, and more focused CT report, essential to high-quality patient care. Moreover, such a report provides a dedicated checklist of findings to be mentioned at the different diagnostic times (staging, restaging, and follow-up) of GC assessment. This dedicated checklist is extremely useful both to experts and to inexperienced radiologists/residents approaching this disease, helping them write a more appropriate CT report and avoid omitting key radiological information useful for multidisciplinary decision-making.

In conclusion, the complex decision-making process that underlies treatment choices in GC, largely driven by CT, lends itself well as a model for a CT report shared among experts. In this sense, and in view of personalized medicine, there are advantages in pushing toward a more uniform style and content of the radiological report, with consequent benefits for the patients and the physicians involved in their treatment, as well as easier audit, research, and teaching.