Background

Intensity-modulated radiation therapy (IMRT) can improve overall survival and long-term quality of life in patients with nasopharyngeal carcinoma (NPC) [1, 2]. Patient-specific pre-treatment quality assurance (QA) is necessary for the implementation of IMRT [3], and there is a consensus in the research community that patient-specific QA can be performed either by film dosimetry combined with ionization chamber measurements [4,5,6] or by measurements with 2D/3D detector arrays in a phantom, to compare and validate the dose accuracy of the treatment [7,8,9,10]. Most of these pre-treatment QA procedures analyze the results with the ‘γ evaluation method’, which is a composite analysis of distance-to-agreement (DTA) and dose difference (DD) [11,12,13].

The phantom measurement-based γ evaluation method provides a quantitative analysis of the degree of agreement between the measured and calculated dose distributions. It can be used in patient-specific QA to confirm whether the treatment plan was delivered with sufficient accuracy. The AAPM TG-119 report [14] recommended action levels of 88% and 90% for composite and per-field gamma passing rate (GP (%)) analysis, respectively. However, the gamma passing rate only reports the proportion of points within tolerance; it gives no information about the spatial location of the points where the dose deviated from the original plan, nor about the volumetric dose deviation for the planning target volumes (PTVs) and organs at risk (OARs) of the patient [15]. Recent research showed that the γ passing rates of per-beam planar IMRT QA did not predict clinically relevant dose errors [16], owing to a lack of correlation between the GP (%) and the volumetric dose errors in the anatomic regions of interest [17, 18]. This raises the question of whether the patient's OARs are safe and the PTVs are covered by the prescribed dose even when a high passing rate is achieved.

Recently, a 3D dose reconstruction method was introduced into IMRT QA; this method reconstructs the delivered 3D dose distribution on the patient CT image from the doses measured per beam. Olch [19] validated the software called 3DVH for 3D dose analysis of IMRT verification. In this study, 3DVH was used to retrospectively analyze the 3D dose distribution of a group of NPC cases treated with IMRT at our center. Each treatment plan had been validated with the pre-treatment 2D phantom QA and had passed the 3 mm/3% GP (%) examination. The correlation between the 2D/3D GP (%) and the deviation in the reconstructed dose-volume histograms (DVHs) was assessed as well.

Methods

Clinical data

The treatment plans of 30 NPC patients who had completed their IMRT treatment courses were randomly selected from our database and fully anonymized for this retrospective analysis. Of these cases, 16 were male and 14 were female, a sex ratio of 1.1:1. According to the UICC 2009 staging criteria, there were 2, 19, 8 and 1 cases with stage II, III, IVa and IVc disease, respectively.

IMRT planning

All the studied cases were treated with 9-field static IMRT using a linear accelerator (Synergy, Elekta AB, Stockholm, Sweden) with 1-cm MLC and a 6 MV photon beam. The primary gross target volumes (GTVnx), nodal gross target volumes (GTVnd), and clinical target volumes (CTV1 and CTV2) were delineated manually by radiation oncologists, and the corresponding planning target volumes (PTVnx, PTVnd, PTV1 and PTV2) were generated by adding a set-up margin to these volumes in all directions according to the immobilization and localization uncertainties [20, 21]. The prescribed doses were 70 Gy to PTVnx, 60–66 Gy to PTVnd, 60 Gy to PTV1, and 54 Gy to PTV2, delivered 5 times per week in a total of 30 fractions. The dose constraint for all PTVs was that over 95% of the PTV be covered by the prescribed dose. The main constrained OARs included the spinal cord, brainstem, parotid gland, temporal lobes, and larynx. All planned dose distributions were optimized and calculated with an inverse treatment planning system (TPS) (Monaco V3.0, Elekta AB, Stockholm, Sweden) using the Monte Carlo (MC) algorithm. The calculation grid was 3 mm, and a statistical uncertainty of 3% was used.

Pre-treatment QA

All 30 IMRT plans were validated with a 2D diode detector array (Mapcheck2, Sun Nuclear Corporation, Melbourne, FL). A QA plan was generated from the fractional treatment plan, and the dose distribution was recalculated in the QA phantom. The delivery of the QA plan was verified by a measurement with the diode array, and a 3 mm/3% GP (%) greater than 90% was accepted for composite dose verification.

Review of 3D dose reconstruction

3DVH system

A commercial 3D dose reconstruction system (3DVH, Sun Nuclear Corporation, Melbourne, FL) was used in this study. It reconstructs the 3D dose distribution on the patient's CT images from the 2D dose distributions measured in the pre-treatment QA, using a planned dose perturbation (PDP) algorithm [22]. For each beam, the 3DVH software uses the dose differences between the 2D array measurement and the TPS dose calculation to produce PDP files, and then projects these differences back into the TPS-calculated 3D dose distribution to reconstruct the delivered dose. To compare the measurement with the original plan dose calculated by the TPS, the dose between the diode detectors of the 2D array must be interpolated. A so-called “Smarterpolation” method built into the 3DVH software interpolates the measured dose to the same resolution and voxel size as the TPS calculation. Smarterpolation estimates the dose changes in the neighborhood of every detector from the high-spatial-resolution dose distribution calculated by the TPS and uses these changes to interpolate the measurement data [23, 24]. After the patient CT sets, RT plan, RT dose, and RT structures have been imported, the 3DVH system applies the PDP files directly to perturb the planned 3D dose, producing a new 3D dose distribution on the patient's CT images, and evaluates clinically relevant dose discrepancies for each OAR and PTV.

Reconstructed 3D dose analysis

Using the reconstructed 3D dose distribution, the following dosimetry-related parameters were analyzed.

Gamma pass rate comparison

In this study, the 2D GP (%) was retrieved from the recorded patient QA data. A 3D dose verification review of each plan was performed with the PDP algorithm described above, and the delivered dose distributions were reconstructed on the patient CT images. The global and organ-specific 3D GP (%) between the reconstructed dose distribution and the original treatment plan were calculated using the 3DVH software. Three different criteria were used for analysis: 3 mm/3%, 2 mm/2%, and 1 mm/1%. The percentage dose differences were normalized to the global maximum dose. The GP (%) was calculated for all dose points above a threshold of 10% of the maximum dose, so that points whose values fell between 0 and 10% of the maximum were excluded from the statistics.
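As a rough illustration of how a gamma pass rate with global normalization and a 10% low-dose threshold can be computed, the following minimal sketch works on 1D dose profiles with a brute-force search. It is not the algorithm implemented in 3DVH; the function name `gamma_pass_rate`, its parameters, and the sample profiles are hypothetical.

```python
import math

def gamma_pass_rate(ref_pos, ref_dose, eval_pos, eval_dose,
                    dta_mm=3.0, dd_percent=3.0, threshold_percent=10.0):
    """Global gamma pass rate (%) for 1D dose profiles (illustrative only).

    ref_*  : reference points, e.g. the reconstructed dose (positions in mm)
    eval_* : evaluated points, e.g. the planned dose, searched by brute force
    """
    max_dose = max(ref_dose)
    dd_abs = dd_percent / 100.0 * max_dose         # global normalization
    cutoff = threshold_percent / 100.0 * max_dose  # low-dose threshold
    passed = total = 0
    for rp, rd in zip(ref_pos, ref_dose):
        if rd < cutoff:
            continue  # points below the threshold are excluded
        # gamma is the minimum combined distance/dose metric over all
        # evaluated points; a point passes when gamma <= 1
        gamma = min(
            math.sqrt(((ep - rp) / dta_mm) ** 2 + ((ed - rd) / dd_abs) ** 2)
            for ep, ed in zip(eval_pos, eval_dose)
        )
        total += 1
        passed += gamma <= 1.0
    return 100.0 * passed / total if total else float("nan")

# Hypothetical profiles: one reconstructed point is about 4% too high.
pos = [0.0, 1.0, 2.0, 3.0, 4.0]
plan = [5.0, 50.0, 100.0, 50.0, 5.0]
recon = [5.0, 50.0, 104.0, 50.0, 5.0]
rate = gamma_pass_rate(pos, recon, pos, plan)  # 2 of 3 counted points pass
```

In this toy case the two 5.0 points fall below the 10% threshold and are ignored, so only three points enter the statistic. A clinical implementation would work on 2D or 3D grids and interpolate the evaluated dose between grid points.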

DVH parameters comparison

To evaluate the actual delivered dose distribution and the DVH deviation in patients, the reconstructed and original planned DVH parameters were compared for each of the PTVs and OARs, including: (1) dose coverage for PTVs: the percentage of the target volume receiving at least 100% and 95% of the prescription dose, V100% (%) and V95% (%); the minimum dose covering 98% and 95% of the target volume, D98% and D95%; and the mean dose in the target volume, Dmean; (2) dose for OARs: D1cc of the spinal cord, brainstem, and temporal lobe (the maximum dose covering a 1 cm3 volume of the organ); V60Gy (%) of the brainstem (the percentage volume receiving at least 60 Gy); V30Gy (%) and Dmean of the parotid gland (the percentage volume receiving at least 30 Gy and the mean dose of the parotids); and Dmean of the larynx.
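The DVH parameters listed above can be sketched in a few lines, assuming the dose to a structure is available as a flat list of voxel doses with equal voxel volumes. This is a simplified illustration rather than the 3DVH implementation; all function names, the voxel size, and the sample doses are hypothetical.

```python
import math

def volume_at_dose(doses, level_gy):
    """V_x: percentage of the structure volume receiving at least level_gy."""
    return 100.0 * sum(d >= level_gy for d in doses) / len(doses)

def dose_at_volume(doses, volume_percent):
    """D_x%: minimum dose within the hottest volume_percent of the voxels."""
    ranked = sorted(doses, reverse=True)
    n = max(1, math.ceil(volume_percent * len(ranked) / 100.0))
    return ranked[n - 1]

def dose_at_cc(doses, voxel_cc, volume_cc=1.0):
    """D_1cc: minimum dose within the hottest volume_cc of the structure."""
    ranked = sorted(doses, reverse=True)
    n = min(len(ranked), max(1, math.ceil(volume_cc / voxel_cc)))
    return ranked[n - 1]

def mean_dose(doses):
    """Dmean: arithmetic mean of the voxel doses."""
    return sum(doses) / len(doses)

# Hypothetical structure of ten 0.5 cm3 voxels, doses in Gy.
doses = [72.0, 71.0, 71.0, 70.0, 70.0, 70.0, 70.0, 69.0, 68.0, 66.0]
v70 = volume_at_dose(doses, 70.0)   # percentage volume receiving >= 70 Gy
d95 = dose_at_volume(doses, 95)     # dose covering 95% of the volume
d1cc = dose_at_cc(doses, 0.5)       # dose to the hottest 1 cm3
dmean = mean_dose(doses)
```

Real DVH tools usually account for partial voxels at the structure boundary and interpolate the cumulative histogram, so values from this sketch would differ slightly from clinical software.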

The percentage deviations (%) of the absolute dose and of the DVH volume parameters were calculated using the following equations:

$$ \Delta D\left(\%\right)=\frac{D_{3DVH}-D_{plan}}{D_{plan}}\cdot 100\% \tag{1} $$
$$ \Delta V\left(\%\right)=V_{3DVH}\left(\%\right)-V_{plan}\left(\%\right) \tag{2} $$
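As a worked check of Eqs. (1) and (2), the short sketch below applies them to the spinal cord D1cc values quoted in the Discussion (planned 47.019 Gy, reconstructed 50.149 Gy). The function names are hypothetical, and the volume values in the second example are invented purely for illustration.

```python
def dose_deviation_percent(d_3dvh, d_plan):
    """Eq. (1): relative deviation of a dose metric, in percent."""
    return (d_3dvh - d_plan) / d_plan * 100.0

def volume_deviation_percent(v_3dvh, v_plan):
    """Eq. (2): difference between two volume percentages."""
    return v_3dvh - v_plan

# Spinal cord D1cc from the Discussion: about 6.66%, as reported for this case.
delta_d = dose_deviation_percent(50.149, 47.019)

# Hypothetical V100% values: 90% reconstructed versus 95% planned.
delta_v = volume_deviation_percent(90.0, 95.0)
```

Note that Eq. (1) is a relative change, whereas Eq. (2) is a difference in percentage points, so a ΔV of − 5% means five percentage points less coverage.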

Correlation analysis of DVH deviation with gamma pass rate

The statistical correlation between the DVH deviation (absolute value) and the GP (%) was studied using Pearson’s coefficient (r), calculated with SPSS (version 19). A Pearson’s coefficient with an absolute value greater than 0.8 was considered to indicate a strong correlation.
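For readers without access to SPSS, Pearson's coefficient can be reproduced with a few lines of code. The sketch below is a plain implementation of the textbook formula with the 0.8 cut-off applied to the magnitude of r; the function names and the sample data are hypothetical, and the p-value is not computed here (a library routine such as `scipy.stats.pearsonr` returns both r and p).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def is_strong(r, cutoff=0.8):
    """Apply the cut-off to the magnitude, so negative correlations count."""
    return abs(r) > cutoff

# Hypothetical data: organ-specific GP (%) versus absolute DVH deviation (%).
gp = [99.0, 97.0, 95.0, 90.0, 86.0]
deviation = [0.5, 1.0, 2.0, 3.5, 5.0]
r = pearson_r(gp, deviation)  # strongly negative for these made-up values
```

A negative r here means that a higher pass rate goes together with a smaller deviation, which is the direction one would hope for if GP (%) were a useful predictor.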

Results

Gamma pass rate comparison

For all studied cases, the GP (%) under three different criteria was evaluated for the 2D, global 3D, and organ-specific analyses. Table 1 shows the average GP (%) of the 30 NPC cases; the maximum and minimum GP (%) values are also reported. For both the 3 mm/3% and the 1 mm/1% criteria, the GP (%) of the 2D planar phantom dose verification differed significantly from that of the global 3D reconstructed dose verification, based on the paired-samples t-test. Compared with the global 3D GP (%), the mean GP (%) was relatively lower in the PTVs but relatively higher in the main OARs for the 3 mm/3% criterion. However, the GP (%) decreased markedly in both the PTVs and some OARs when a stricter criterion (1 mm/1%) was used.

Table 1 Comparison of the 2D, global 3D, and organ-specific GP (%) for 30 NPC cases with different gamma criteria

Reconstructed DVH

The average relative difference in the volumetric dose (DV) and dose volume (VD) between the 3D dose reconstruction and the planned dose ranged from − 2.93% to 0.02% for the PTVs, and from − 1.66% to 1.17% for the OARs (Table 2). Although the average deviations were slight, clinically significant deviations were found in some individual cases. As shown in Table 3, eight of the 30 cases were under-dosed, with a decrease of 5% or more in V70 Gy (V100%) of the PTVnx. One of the 30 cases received a dose more than 5% higher than planned in the D1cc of the spinal cord, and another in the mean dose of the larynx. Fig. 1 shows the two cases with the highest dose deviations in the PTV and the OARs: one with the largest negative deviation (− 15.66%) in V100% of the PTVnx, and another with a significant positive deviation (6.66%) in D1cc of the spinal cord.

Table 2 Relative deviations in DV and VD in PTVs and main OARs between the 3D reconstructed dose and planned dose
Table 3 Percentage of cases with clinically significant dose deviation (more than 5% decrease in the prescribed dose coverage of PTVs or increase in the planned dose of OAR)
Fig. 1 Examples of two cases with clinically significant dose deviation: (a) underdose of − 15.66% in V100% of the PTVnx; (b) increase of 6.66% in D1cc of the spinal cord

Correlation analysis of DVH deviation and gamma pass rate

The statistical correlations between DV, VD, and GP (%), described by Pearson’s coefficient (r), are shown in Table 4. No strong correlations (defined as meeting both r > 0.8 and p < 0.05) were found between any of the DVH metrics and the global GP (%) obtained from either the 2D QA measurement or the 3D reconstructed dose. In the measurement-based 3D dose verification, only the reconstructed D2% and Dmean of the PTVnx showed a significant (p < 0.01) strong correlation with the organ-specific GP (%) for the PTVnx, when a Pearson’s coefficient value of 0.8 was used for the correlation evaluation. The correlation plots with their R² values are available in the additional figure files [see Additional file 1: Figures S1 to S10].

Table 4 Pearson correlation coefficients between the three types of gamma pass rate and DV, VD

Discussion

Phantom measurement with global GP (%) evaluation is widely accepted in the radiation therapy (RT) community as a routine IMRT QA procedure. The AAPM TG-119 report suggests the 3 mm/3% criterion for this kind of verification. In this study, an average GP (%) of 96.4%, ranging from 89.1% to 99.7%, was achieved using the AAPM-suggested criterion. However, the GP (%) decreased significantly when a stricter acceptance criterion was used, similar to the report of Benjamin E, et al. [16], although this decrease did not reflect a volumetric dose deviation in the PTVs and OARs.

The results of the correlation analysis showed that all the coefficient values (r) were much lower than 0.8 for the correlations between the global GP (%) and DV or VD for each of the PTVs and OARs. This indicated either no correlation or only a very weak correlation between the global GP (%) and the deviation of the DVH parameters. M. Stasi et al. [17] reported similar results in their study of two groups of IMRT cases (prostate and pelvic IMRT, and head and neck IMRT), in which all coefficient values were smaller than 0.8, indicating a weak correlation between the GP (%) and the dose deviation.

In the organ-specific GP (%) analysis, the GP (%) under all three criteria showed a strong negative correlation with the deviation of the mean dose in the PTVnx-specific evaluation. A coefficient magnitude larger than 0.8 indicated that the higher the GP (%) in the PTVnx, the smaller the deviation in the mean dose of its volume. In addition, the correlation coefficients (r) of the organ-specific GP (%) were stronger than those of the global GP (%). These results are consistent with the findings of M. Cozzolino et al [18]. In their study of a group of RapidArc treatment plans for the prostate, in which the COMPASS system (IBA Dosimetry, Germany) was used to reconstruct the delivered dose distribution, the dose deviation correlated more strongly with the organ-specific GP (%) than with the global GP (%).

A high global GP (%) did not always mean a high organ-specific GP (%) (e.g. a target volume-specific GP (%)), and conversely, a low global GP (%) did not always indicate a low GP (%) in specific organ volumes. As depicted in Fig. 2, the case on the left showed a high global GP (%) that met the QA criteria but did not ensure that the clinically relevant dose errors were within tolerance. In fact, a significant low-dose area was located in the PTVnx, leading to a large reduction (12.8%) in the V70Gy, which might reduce local control. The case on the right showed a relatively low global GP (%), but the dose errors were all located outside the gross tumor, high-risk, and critical structure areas.

Fig. 2 Examples of two reviewed cases. The left one (a) had a high global GP (%) of 99.2%, but a low-dose region inside the target area (in blue) yielded a poor PTVnx-specific GP (%) of 85.7%. The right one (b) had a lower global GP (%) of 90.4% but a PTVnx-specific GP (%) of 100%, with a region of increased dose in the lower neck (in red). The gamma pass rate for all the cases was calculated using the 3 mm/3% criterion

M. Stasi et al. [17], analyzing a group of prostate and head and neck cancer cases with the same 3DVH system, observed that the measurement-based reconstructed doses delivered to the PTVs all showed negative discrepancies. M. Cozzolino et al [18] reported discrepancies between the measurement-guided dose reconstruction of a 3D QA system (COMPASS, IBA Dosimetry, Schwarzenbruck, Germany) and the original plan, in which the actual dose could be 5% greater than the planned value in some cases. In our review, the deviation of the reconstructed DVH from the planned values ranged between 6.66% and − 15.66%. In 27% (8/30) of the cases, the coverage of the prescribed dose in the gross tumor volume (V70 Gy of the PTVnx) decreased by 5% or more, implying a potential effect on local control that was concealed during the pre-treatment 2D phantom verification. In addition, there were two cases with a dose increase of more than 5% in critical structures compared with the planned doses, one in the D1cc of the spinal cord and one in the mean dose of the larynx. In the case with the largest dose increase, in the spinal cord, the planned D1cc was 47.019 Gy and the reconstructed D1cc was 50.149 Gy, which was beyond our clinically tolerated dose. Such a large discrepancy should be noticed before treatment and carefully re-evaluated, especially in cases where the planned dose is close to the tolerated dose. Apart from the above-mentioned cases, all other OARs showed very small deviations in their DVHs. A careful review of the DVHs of the PTVs and OARs revealed that these kinds of dose deviations could be overlooked when only the global GP (%) evaluation is used in pre-treatment QA. Hence, volumetric dose verification and evaluation by means of 3D dose reconstruction based on delivery measurement might be needed in clinical practice.

In this study, the gamma pass rate (GP (%)) evaluation was based on percentage dose differences normalized to the global maximum dose. This is appropriate for the high-dose regions close to the target. However, for some organs at risk located in lower-dose regions, this normalization might underestimate the real difference in dose, and a local dose difference might help in understanding the sensitivity of the GP (%) in some cases. For this reason, we also analyzed the GP (%) using local dose normalization and found that it was lower than that obtained with global maximum dose normalization. Nevertheless, the GP (%) with global maximum dose normalization and with local dose normalization gave similar results in the DVH correlation analysis: neither had a significant strong correlation with the DVH errors, except between the PTVnx-specific GP (%) and the DVH error (detailed data are available in an additional table file [see Additional file 2: Table S1-S3]).
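The effect of the two normalizations can be seen with a small numerical example. The sketch below compares the percentage dose difference at a single low-dose point under global and local normalization; the function name and all dose values are hypothetical and chosen only to illustrate the arithmetic.

```python
def dose_difference_percent(reconstructed, planned, max_dose, local=False):
    """Percentage dose difference at one point.

    global (default): normalized to the global maximum dose
    local           : normalized to the planned dose at that point
    """
    norm = planned if local else max_dose
    return (reconstructed - planned) / norm * 100.0

# Hypothetical OAR point: 20 Gy planned, 21 Gy reconstructed, 70 Gy maximum.
global_dd = dose_difference_percent(21.0, 20.0, 70.0)              # about 1.4%
local_dd = dose_difference_percent(21.0, 20.0, 70.0, local=True)   # about 5%
```

With a 3% dose-difference criterion, this point would be within tolerance under global normalization but outside it under local normalization, which is consistent with the lower pass rates observed when local normalization was used.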

Although the measurement-guided 3D dose reconstruction method can be used to predict the delivered dose distribution in the patient before IMRT treatment, the dose distribution actually delivered over the whole treatment course, which extends over a long period of time, may be affected by many factors, such as changes in multi-leaf collimator (MLC) position accuracy, beam energy fluctuation, gross machine monitor unit (MU) errors, tumor shrinkage, and anatomical changes. The accumulated delivered dose distribution in the patient will be a subject of our future work.

Conclusions

Traditional 2D phantom QA with global GP (%) evaluation is not sufficient to ensure clinically accurate volumetric dose in IMRT treatment, as there is no strong correlation between the global GP (%) and the percentage deviation in the DVHs of either the PTVs or the OARs, even when a strict 1%/1 mm gamma criterion is used. According to the results of our study, 3D dose verification with organ-specific GP (%) evaluation is a more effective QA method, and the PTVnx-specific GP (%) has a strong negative correlation with the deviation in the mean dose of the PTVnx. Even when an IMRT treatment plan has passed a 2D phantom-based dosimetry QA with GP (%) evaluation, there is still a potential risk of volumetric dose deviation, such as insufficient dose coverage of the target or an overdose in the OARs. Three-dimensional dose reconstruction based on measurement, together with DVH verification, is recommended for IMRT QA, rather than GP (%) evaluation alone.