Background

Differentiating benign from malignant superficial lymph nodes (LNs) is of utmost importance and remains a diagnostic challenge. To date, no specific criteria have been firmly established with conventional ultrasound (US) that provide a reliable discriminative feature [1].

There is growing interest in the literature in ultrasound elastography, which assesses tissue elasticity on the premise that malignant lesions are harder than benign ones. Stiffness can be assessed either by measuring the displacement of the examined tissue in response to freehand compression, referred to as strain elastography (SE), or by measuring the propagation velocity of shear waves generated by high-intensity focused beams within a region of interest (shear wave elastography with acoustic radiation force impulse (ARFI) imaging; Siemens Healthineers, Erlangen, Germany) [2]. ARFI has proved useful in liver and breast disease; however, its use in the assessment of cervical lymph nodes is not widely practiced [3].

Many investigators have studied the usefulness of non-invasive sonoelastography for assessing peripheral lymph nodes in adult or pediatric groups in order to avoid unwarranted biopsies [2, 4].

We herein examine the diagnostic performance of ARFI compared with SE in identifying benign superficial lymph nodes with equivocal or worrisome features at B-mode ultrasound.

Patients

Study population

This prospective case-control study included all consecutive patients referred to the radiology department of our university hospital for ultrasound examination of palpable superficial LNs between January 2021 and January 2022.

Methods

The ethics committee of our university hospital approved the study. We initially enrolled patients with LNs carrying at least one worrisome feature at B-mode ultrasound (short axis ≥ 8 mm, long/short axis ratio < 2, absent or eccentric hilum, or abnormal vascularity [either isolated peripheral or mixed peripheral and hilar]). Included LNs were additionally examined with elastography (SE and measurement of the mean shear wave velocity (SWV, in m/sec) using ARFI). At the same session, the LN of interest underwent ultrasound-guided core needle biopsy. When multiple LNs were identified in a patient, the largest node that was most accessible for needle biopsy (as judged by the operator) was chosen as the representative node for the study. Nodes were excluded when sonoelastography was not feasible (i.e., the LN lay in a location that did not allow adequate compression for a correct SE, or was smaller than the ARFI measurement box) or when histopathological data for the biopsied node were lacking. The final cohort comprised 115 patients (78 men and 37 women) with a mean age of 37.5 ± 4.3 years (range 13 to 65 years) and 115 LNs. All patients gave written informed consent prior to the biopsy procedure, including approval for the possible anonymous use of their data for research purposes.

Ultrasound imaging

Patients lay supine with the neck extended. All ultrasound examinations were performed by a single radiologist with more than 15 years' experience in B-mode sonography and guided biopsies and 7 years' experience in elastography. A Siemens ACCUSON S2000 ultrasound system (Siemens Medical Solution, Mountain View, CA, USA) equipped with a linear transducer (9L4 multi-D probe) with a bandwidth of 9 to 12 MHz was used. SE was performed using eSie Touch elasticity imaging, with a region of interest chosen to include the node and the surrounding fat, which served as a reference. The elastogram (displayed as a color-coded image overlaying the B-mode background) and the native B-mode US image were displayed simultaneously. Light freehand vertical compression was applied with the same ultrasound probe while the patient was asked to hold their breath and refrain from swallowing.

During compression, a numeric digital count was displayed on the screen. When it reached a value of 60 or more (as advised by the US system manufacturer), the compression level was considered adequate. The corresponding color-coded image frame was then chosen for further interpretation.

SWV measurements with ARFI were obtained by placing a rectangular region of interest (ROI) with fixed predefined dimensions (0.8 × 0.6 cm) in different portions of the node, avoiding the fatty hilum and any necrotic parts. When the button labeled “Update” was pressed, SWV was measured in meters per second (m/sec) using the available Virtual Touch Quantification (VTQ) software (Siemens Medical Solution, Mountain View, CA, USA).

Image evaluation

Image assessment and data collection were performed by another radiologist with more than 10 years' experience who was blinded to the clinical data and histopathological results at the time of data collection.

On conventional US examination, the following data were recorded for every LN: short axis, diameter ratio (long/short axis), hilum (either normal (central) or abnormal (eccentric or absent)), and the vascular pattern at Doppler study (either normal (central) or abnormal (central and peripheral, or absent)).

On SE, the findings were interpreted by visual assessment of the color-coded image. On the color scale of the ultrasound machine used, red represented maximum stiffness, green the softer tissues, and blue tissue of intermediate stiffness. We adopted the 4-point scoring system proposed by Bhatia et al. [5] as follows: scale 1 is predominantly green (red < 10% of the colored area); scale 2, the total red area is between 10 and 50%; scale 3, the red area is between 50 and 90%; and scale 4 is predominantly red (> 90%). Scales 1–2 represented soft and moderately soft tissue and were considered benign, whereas scales 3–4 represented moderately hard and hard tissue and were considered malignant [6].

For ARFI, three measurements were obtained at different cross-sectional levels of the examined lymph node, and the mean of the three readings was recorded.
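The scoring rule above can be expressed as a short sketch. This is an illustration only, not the software used in the study: the function names are hypothetical, the red-area percentage is assumed to be supplied by the reader's visual estimate, and the handling of values falling exactly on 50% or 90% is an assumption because the text does not specify it.

```python
from statistics import mean


def se_scale(red_percent: float) -> int:
    """Map the percentage of red (stiff) area to the 4-point elasticity scale."""
    if not 0 <= red_percent <= 100:
        raise ValueError("red_percent must lie between 0 and 100")
    if red_percent < 10:
        return 1  # predominantly green
    if red_percent <= 50:  # assumption: exactly 50% is assigned to scale 2
        return 2
    if red_percent <= 90:  # assumption: exactly 90% is assigned to scale 3
        return 3
    return 4  # predominantly red


def se_label(scale: int) -> str:
    """Scales 1-2 are read as benign, scales 3-4 as malignant."""
    if scale not in (1, 2, 3, 4):
        raise ValueError("scale must be 1, 2, 3 or 4")
    return "benign" if scale <= 2 else "malignant"


def mean_swv(readings: list[float]) -> float:
    """Average the three ARFI shear wave velocity readings (m/sec) of one node."""
    if len(readings) != 3:
        raise ValueError("expected exactly three SWV readings")
    return mean(readings)
```

For example, a node with an estimated red area of 30% would be assigned scale 2 and read as benign under this rule.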

Biopsy and histopathological analysis

The reference standard in our study was the histopathological analysis of the biopsied LNs. At least five core biopsies were obtained using a 14-gauge semi-automatic core biopsy needle (Geotek, Geotek Medical Ltd., Ankara, Turkey). Samples were preserved in 1% formalin solution and sent for histopathological assessment.

Statistical analysis

Statistical analysis was carried out with SPSS (Statistical Package for the Social Sciences), version 20 (SPSS Inc., Chicago, USA). Data are expressed as mean ± standard deviation. Categorical variables were compared using the Chi-square or Fisher exact test. A p-value < 0.05 was considered statistically significant. Diagnostic performance was evaluated by calculating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy. The ARFI cut-off value that produced the maximum sensitivity and specificity for detecting malignancy was determined from the receiver operating characteristic (ROC) curve.
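One common way to choose such a cut-off from ROC data is to maximize Youden's index (sensitivity + specificity − 1). The text does not state which criterion was applied in SPSS, so the sketch below is an assumption about the selection rule, and the function name and the toy values are hypothetical and are not study data.

```python
def best_cutoff(values, is_malignant):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A node is called malignant when its value is >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(1 for m in is_malignant if m)
    n_neg = len(is_malignant) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both malignant and benign cases are required")

    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, m in zip(values, is_malignant) if m and v >= cutoff)
        tn = sum(1 for v, m in zip(values, is_malignant) if not m and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical toy SWV values in m/sec (NOT data from this study).
toy_swv = [1.8, 2.1, 2.4, 2.9, 2.6, 3.1, 3.5, 4.2]
toy_malignant = [False, False, False, False, True, True, True, True]
cutoff, sens, spec = best_cutoff(toy_swv, toy_malignant)
print(cutoff, sens, spec)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can yield a different threshold from the same data.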

Results

Based on the histopathological data of the 115 enrolled LNs, 78 (67.8%) were malignant and 37 (32.2%) were benign. The malignant group comprised 56 metastatic LNs (due to squamous cell carcinoma (n = 29), adenocarcinoma (n = 20), papillary carcinoma (n = 4) and malignant melanoma (n = 3)) and 22 lymphomas (non-Hodgkin’s (n = 14) and Hodgkin’s (n = 8)). Of the 37 benign LNs, 28 showed reactive adenitis and 9 were due to tuberculosis.

The most frequently affected LN group was cervical (n = 81), followed by axillary (n = 28) and inguinal (n = 6).

Conventional ultrasound (B-mode and Doppler assessment)

At conventional ultrasound scanning, the short-axis diameter of LNs ranged from 6 to 16 mm (mean 9.9 ± 3.35 mm) in the benign group and from 5 to 20 mm (mean 11 ± 5.3 mm) in the malignant group. The short axis and vascularity were the only factors that differed significantly between benign and malignant LNs (p < 0.001 for each). The full results are summarized in Table 1.

Table 1 Diagnostic performance of conventional ultrasound

US elastography

At SE, a significant relationship was found between the elasticity scale and the histopathological diagnosis (p < 0.001). Scale 1–2 was obtained in 75.7% [28/37] of the benign group (reactive and tuberculous LNs) and scale 3–4 in 74.4% [58/78] of the malignant group (metastatic and lymphomatous LNs). Scale 4 was not recorded in any benign LN, and scale 1 was not recorded in any malignant LN (Table 2).

Table 2 Correlation between strain elastography scores and final histopathological diagnosis

SE yielded a correct diagnosis in 89.3% [25/28] of the reactive LNs (scale 1–2) and in 92.2% [52/56] of the metastatic LNs (scale 3–4). Nevertheless, the elasticity scale disagreed with the final histopathological diagnosis in 30 of 115 LNs (26.1%), showing scale 1–2 (benign feature) in 72.7% [16/22] of lymphomas and 7.8% [4/56] of metastatic LNs, and scale 3–4 (malignant feature) in 66.7% [6/9] of the tuberculous LNs and 10.7% [3/28] of the reactive LNs.

Overall, SE showed 74.4% sensitivity, 73% specificity, 85.3% PPV, 57.4% NPV and an overall accuracy of 73.9%.

With ARFI, SWV ranged from 1.56 to 2.76 m/sec (mean 2.49 ± 0.54 m/sec) in the benign group and from 1.91 to 6.5 m/sec (mean 3.83 ± 0.87 m/sec) in the malignant group. Mean ARFI measurements were 2.33 ± 0.43 m/sec in reactive LNs, 2.77 ± 0.613 m/sec in tuberculous LNs, 2.75 ± 0.43 m/sec in lymphomas and 3.43 ± 0.58 m/sec in metastatic LNs. On the ROC curve, a cut-off value of ≥ 2.70 m/sec for predicting malignancy (lymphoma and metastasis) gave the best sensitivity (88.5%) and specificity (89.2%) (area under the curve (AUC): 0.819; 95% confidence interval (CI): 0.744 to 0.894), and the difference between the two groups was significant (p < 0.001). The PPV, NPV and accuracy were 94.5%, 78.6% and 88.7%, respectively (Figs. 1, 2, 3 and 4).
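As a worked check of how these five figures relate to one another, the sketch below computes them from a 2 × 2 table. The counts are not reported explicitly in the text: they are back-calculated here, as an assumption, from the reported percentages and the group sizes (78 malignant and 37 benign LNs), and the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard diagnostic performance measures from a 2 x 2 table."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodes correctly called
        "specificity": tn / (tn + fp),  # benign nodes correctly called
        "ppv": tp / (tp + fp),          # malignant calls that were correct
        "npv": tn / (tn + fn),          # benign calls that were correct
        "accuracy": (tp + tn) / total,
    }


# Assumed counts for ARFI at the 2.70 m/sec cut-off, back-calculated from
# the reported percentages (78 malignant and 37 benign LNs).
arfi = diagnostic_metrics(tp=69, fn=9, tn=33, fp=4)
for name, value in arfi.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, the computed values round to the reported 88.5%, 89.2%, 94.5%, 78.6% and 88.7%.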

Of the 30 LNs misclassified by SE, ARFI (at a cut-off value of 2.7 m/sec) suggested the correct diagnosis in 17 LNs (56.7%) [5 benign and 12 malignant], while 13 LNs retained the same incorrect diagnosis (7 lymphomas, 2 metastases, 3 tuberculous LNs and 1 reactive LN) (Figs. 5, 6).

Fig. 1

Receiver operating characteristic (ROC) curve for ARFI measurements showing a best cut-off value of ≥ 2.70 m/sec to identify malignant LNs (area under the curve: 0.819; 95% confidence interval (CI): 0.744 to 0.894)

Fig. 2

Benign cervical lymph node correctly diagnosed by SE and ARFI. A Grayscale US image of the LN showing absent echogenic hilum. B Corresponding SE image showing a total red area of less than 50%, consistent with elasticity scale 2. C Acoustic radiation force impulse (ARFI) measurement showing an SWV of 1.52 m/sec. D Low-power view of reactive follicular hyperplasia in a sentinel LN showing preserved nodal architecture and lymphoid follicles that are highly variable in size and shape. The margins of the follicles are sharply defined, surrounded by a mantle layer of small lymphocytes and separated by abundant interfollicular tissue (H&E, × 40)

Fig. 3

Tuberculous LN correctly diagnosed by SE and ARFI as being benign. A Grayscale US image of cervical LN showing heterogeneous texture, long/short axis < 2 and lost hilum. B SE image total red area 10–50% consistent with elasticity scale 2. C ARFI showed SWV of 2.05 m/sec

Fig. 4

Metastatic LN correctly diagnosed by SE and ARFI as being malignant. A Grayscale US image of small cervical LN with long/short axis < 2 and lost hilum. B SE image total red area more than 90% consistent with elasticity scale 4. C ARFI showed SWV of 2.92 m/sec

Fig. 5

Metastatic LN erroneously diagnosed as benign by SE and correctly diagnosed as malignant by ARFI. A Grayscale US image of small axillary LN showing hypoechoic texture, lost fatty hilum with long/short axis < 2. B SE image total red area 10–50% consistent with elasticity scale 2. C ARFI showed SWV of 3.17 m/sec

Fig. 6

Hodgkin’s lymphoma erroneously diagnosed as benign by SE and correctly diagnosed as malignant by ARFI. A Grayscale US image of a cervical lymph node with homogeneously hypoechoic texture, reticulation and absent echogenic hilum. B SE image showing a total red area of 10–50%, consistent with elasticity scale 2. C ARFI measurement showing a velocity of 3.31 m/sec. D Non-Hodgkin’s lymphoma (follicular lymphoma) in an LN showing effacement of the nodal architecture and replacement by nodules that are similar in size and shape, poorly defined and crowded (H&E, × 40)

Compared with SE, ARFI increased the diagnostic accuracy in reactive LNs from 89.3 to 96.4% and in the lymphoma group from 27.3 to 68.2% (Table 3).

Table 3 Correlation of diagnosis obtained by SE and ARFI and the final histopathological result

Overall, ARFI outperformed both SE and the best B-mode feature in depicting non-diseased LNs (true negatives), with a specificity and NPV of 89.2% and 78.6%, versus 73% and 57.4% for SE and 78.4% and 65.9% for the best conventional B-mode feature, respectively (Table 4).

Table 4 Diagnostic performance of SE scales and ARFI at cut-off value of 2.7 m/sec

Discussion

In the present study, we evaluated the diagnostic usefulness of SE and of SWV measured with ARFI for identifying benign superficial LNs. Our main purpose was to avoid unwarranted biopsies. With conventional ultrasound, short-axis diameter was the most reliable predictive feature in our study, with sensitivity, specificity and accuracy reaching 80.7%, 78.4% and 80%, respectively. Nevertheless, it has been demonstrated that size cannot be an absolute discriminator, as many reactive or inflammatory LNs can be large and, conversely, some malignant nodes with micro-metastasis may be small [18]. In our study, the enrollment of LNs of differing histopathological nature reduced the specificity of SE to 73%, compared with 95.6% in another series that enrolled only reactive and metastatic LNs [19].

From the above analysis, SE can be expected to differentiate between reactive and metastatic LNs; however, its usefulness in excluding lymphoma or TB is arguable. Thus, SE alone would not be sufficient to consistently forgo a biopsy of an incidentally discovered LN with worrisome features.

Quantitative assessment of tissue stiffness by measuring SWV with ARFI offers an objective, operator-independent way to overcome the main drawbacks of SE [2]. The mean SWV differed significantly between the benign and malignant groups in our series, and we attained the best sensitivity (88.5%) and specificity (89.2%) at a cut-off value of ≥ 2.70 m/sec for predicting malignancy. ARFI corrected the diagnosis in 56.7% of the LNs misclassified by SE, notably in the lymphoma group.

The overall performance of SWV measured with ARFI is very encouraging. In the current study, ARFI attained its best sensitivity (88.5%) and specificity (89.2%) at a cut-off value of ≥ 2.70 m/sec. In the largest meta-analysis available to date, including 18 articles with a total of 1666 LNs, ARFI showed an overall sensitivity and specificity of 87% and 90%, respectively, for depicting malignant superficial LNs at a cut-off value of 2.85 m/sec. The same authors concluded that ARFI can be used to select suspicious nodes for biopsy and to stage tumors, and that it is useful in follow-up [20].

Different cut-off values have been proposed in the literature to discriminate between benign and malignant LNs, ranging from 1.16 m/sec [21] to 4.64 m/sec [22]. Our proposed cut-off value is close to that of a large series of 166 LNs, which suggested a cut-off value of 2.68 m/sec with a specificity and sensitivity of 81% and 81.6%, respectively [23].

ARFI improved the diagnostic accuracy of B-mode ultrasound in the series of Chanda et al. [24], raising specificity from 69% with B-mode alone to 99.7% when ARFI was added. Other investigators examined 374 cervical nodes ex vivo and developed nomograms that integrate B-mode US features with ARFI. They showed an increased specificity (99.2%) and negative predictive value (95%) for identifying metastatic cervical nodes in oral cancers [25].

In the current series, ARFI improved the diagnosis of lymphomas wrongly considered reactive by SE. Nevertheless, lymphomas can sometimes be soft and be mistaken for reactive nodes. Vinayagamani et al. [26], using a cut-off value of 2.8 m/sec, failed to diagnose 5 lymphomas in their series and concluded that ARFI alone may not be useful for differentiating benign nodes from lymphomas. Similarly, tuberculous nodes pose a diagnostic challenge: they can be stiff and indistinguishable from metastasis. Fifty percent of the tuberculous LNs in the series of Cheng et al. [27] were wrongly diagnosed by ARFI. Chen et al. [22] proposed a higher cut-off value of 2.97 m/sec to separate reactive from tuberculous nodes, resulting in a sensitivity of 92.9% and a specificity of 100%. Furthermore, they tried to discriminate lymphomas from metastasis at a cut-off value of 7.3 m/sec, with a sensitivity and specificity of 88.5% and 81.5%, respectively.

We did not seek a cut-off value for this purpose, as we believe that oncologists will require a tissue biopsy in both cases, so a diagnostic distinction with ARFI would have no implication for the management plan. What mattered to us was to diagnose reactive LNs with confidence in order to avoid an unnecessary invasive biopsy. ARFI brought us closer to this target, with better specificity and NPV than both SE and B-mode ultrasound.

SWV performed better than SE in our series; nevertheless, several recent publications have concluded that both SE and SWV are valuable for depicting malignancy and that combining them would boost their diagnostic accuracy [28,29,30].

Limitations

Our research has several limitations. First, all ultrasound procedures were performed by a single operator, which might expose our interpretations and results to individual error. Second, LNs from different locations were enrolled, which could introduce some variability in the measured elasticity depending on the surrounding tissue. Third, we did not include LNs lacking any worrisome feature at conventional ultrasound. This leaves an important question unanswered: whether sonoelastography can depict early malignancy before any architectural change occurs. Substantial further research in this domain is therefore needed to establish the role of visco-elastic imaging as a whole.

Conclusions

In conclusion, SE could be sufficient to discriminate benign from metastatic LNs; however, its usefulness in identifying lymphoma and TB is arguable. An ARFI SWV below the cut-off value of 2.7 m/sec offers even better specificity and NPV for identifying benign reactive LNs, reduces the likelihood of missing lymphoma and could consequently help to lessen the need for unnecessary biopsies. Larger multicenter studies are still needed to validate these data and to suggest more standardized cut-off values.