Introduction

The spring ligament (SL), also called the plantar calcaneonavicular ligament, is the main stabilizer of the internal arch of the foot and has a strong influence on hindfoot function1. Secondary structures such as the plantar fascia, the superficial deltoid ligament and other plantar ligaments act together with the SL to maintain the internal arch of the foot. The SL extends from the calcaneus to the navicular bone, forming a sling that stabilizes the hindfoot2.

SL injuries appear to be very common and are related to rupture of the ligament, which leads to plantar-flexion displacement of the talus and valgus hindfoot deformity and may produce adult acquired pes planus. Surgery may therefore be required for this rupture after trauma or other factors such as degenerative disease, iatrogenic injury, infection or tumors of the hindfoot, all of which require a full understanding of SL anatomy. In addition, SL reconstruction provides good correction of the internal arch of the foot, with important implications for rehabilitation. Accordingly, anatomic landmarks of the SL may be essential to minimize surgery-associated lesions and deformities of the hindfoot3,4. Several studies have evaluated this anatomic structure by ultrasound imaging (US) and caliper measurements in cadavers5,6,7,8, although no prior investigation has analyzed the reliability of caliper and US measurements of SL dimensions in order to determine their anatomic correlation in cadaveric feet. Because the existing literature already relates ultrasound measurements to clinical decisions and to other imaging modalities5,6,7,8, the absolute accuracy of these measurements needs to be established. No study has addressed the reliability of, and correlation between, caliper and US measurements of SL width, thickness and length in cadaveric feet, information that would improve the accuracy of these evaluations and of ultrasound-guided procedures9. Both intra- and inter-rater reliability need to be reported in order to determine the absolute accuracy and repeatability of these measurements within the same evaluator and between evaluators, for both US and caliper10,11,12,13. Separately, both tools have shown appropriate reliability for SL dimensions measured by experienced raters; nevertheless, US and caliper measures have not been compared, and SL width has not yet been measured by US5,6,7,8.

We hypothesized that caliper and US would show excellent reliability for measuring the anatomic dimensions of the SL in cadavers. Thus, the purpose of the study was to evaluate the intra- and inter-rater reliability of US and caliper measures of SL width, thickness and length in cadaveric feet, both within and between sessions.

Materials and Methods

Study design

A reliability study was carried out to determine the intra- and inter-rater reliability of US and caliper measures of SL width, thickness and length in cadaveric feet, both within and between sessions. The Updated List of Essential Items for Reporting Diagnostic Accuracy Studies (STARD 2015) criteria were followed14.

Sample size calculation

The minimum number of specimens required was calculated for reliability testing based on the ICC, with a target ICC value of 0.8 and a 95% CI width of 0.2. Using Bonett’s approximation15, the required sample size was 36 specimens.
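
The calculation can be illustrated with a minimal sketch. It assumes the commonly cited form of Bonett's approximation, n = 8z²(1 − ICC)²(1 + (k − 1)ICC)² / (k(k − 1)w²) + 1, where w is the desired CI width and k is the number of ratings per specimen. The text does not state k; the function name and the value k = 3 below are illustrative assumptions, chosen only because that value happens to return the reported 36 specimens.

```python
import math
from statistics import NormalDist


def bonett_sample_size(icc: float, width: float, k: int, conf: float = 0.95) -> int:
    """Approximate number of specimens needed so that the CI of an ICC has the given width.

    icc   : planning (target) value of the ICC
    width : desired total width of the confidence interval
    k     : number of ratings per specimen (assumption: not stated in the text)
    """
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n = (8 * z**2 * (1 - icc) ** 2 * (1 + (k - 1) * icc) ** 2) / (k * (k - 1) * width**2) + 1
    return math.ceil(n)


# Target ICC of 0.8 and CI width of 0.2, as described in the text; k = 3 is hypothetical
print(bonett_sample_size(0.8, 0.2, k=3))
```

Under the same assumptions, k = 2 would require a larger sample, so the result depends strongly on the number of ratings per specimen.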

Ethical aspects

This research was approved by the local Research Ethical Committee in the University of Rey Juan Carlos (URJC), in the town of Móstoles, province of Madrid (Spain), with internal code 0801201800618. In addition, all methods were performed in accordance with the relevant guidelines and regulations. Consent for this study was previously obtained from the anatomy department.

Cadavers and embalming method

Sixty-two feet from formaldehyde-embalmed human cadavers (8 males and 26 females) without any type of trauma were included in our research protocol9. The mean (SD) age was 76.46 (6.46) years, with a range from 66 to 89 years. The human cadaveric feet comprised 30 right and 32 left feet. The adult cadavers came from the Scientific Anatomy Center, S.L., in the town of Valencia (Spain), which included informed consent as part of the cadaver donation process.

The cadavers were embalmed by perfusion through the femoral artery with a blend of formaldehyde, ethanol, methanol, phenol and glycerine, which improves the longevity of the body and tissues and reduces the infection risk16.

Ultrasound measurements

US images were recorded with a Mindray Z6 Digital Ultrasonic Diagnostic System (Shenzhen Mindray Bio-Medical Electronics Co., Ltd, Shenzhen, China) using a linear transducer type L4-P with a frequency bandwidth of 5–10 MHz.

All human cadaver feet were placed in the same immobilized neutral position. Then, two independent and experienced musculoskeletal podiatrists (with at least 5 years of musculoskeletal US experience) collected the US measurements to determine the width, thickness and length (cm) of the SL in cadaveric feet (Fig. 1).

Figure 1

Ultrasound measurements of the Spring Ligament for length (A), thickness (B) and width (C) dimensions in cadaver foot. Abbreviations: N, Navicular; ST, Sustentaculum Tali; T, Talus; TP, Tibial Posterior tendon. Green arrows show the bone references of the Navicular and Sustentaculum Tali for length measurements. Red arrows show the Spring Ligament references for thickness and width measurements.

Caliper measurements

Thereafter, each cadaveric foot was dissected to expose the SL for measurement with a digital LCD caliper (BURG-WÄCHTER KG, Wetter, Germany), with the subtalar joint in neutral position. Two podiatrists recorded the width, thickness and length (cm) of the SL in cadaveric feet with this device (Fig. 2).

Figure 2

Caliper measurements of the Spring Ligament for length (A), thickness (B) and width (C) dimensions in cadaver foot. Red arrows show the Spring Ligament references for thickness and width measurements.

Reliability study protocol

After two days, the protocol was repeated in a manner identical to the first measurement session. The values from the 1st and 2nd sessions, as well as from the 1st and 2nd observers, were used to analyze the intra- and inter-rater reliability within and between sessions. The podiatrists did not have access to the records of the 1st session until the values of the 2nd session had been registered.

Statistical analyses

Statistical analyses were carried out with SPSS 19.0 software for Windows (SPSS Inc., Chicago, USA). First, the Kolmogorov-Smirnov test was used to assess normality. All variables were treated as parametric data because they showed a normal distribution (P-value > 0.05 in the Kolmogorov-Smirnov test). Second, mean ± standard deviation (SD) as well as the upper and lower limits of the 95% confidence interval (CI) were used to describe all data. Finally, differences between two measurement values were analyzed by the Student t test for paired samples.

Reliability between two measurement values was determined by the Intraclass Correlation Coefficient (ICC) and Pearson's correlation coefficient (r). ICC values were interpreted as poor (ICC < 0.40), fair (ICC = 0.40–0.59), good (ICC = 0.60–0.74), and excellent (ICC = 0.75–1.0)17. In addition, r coefficient values were categorized as weak (r = 0.00–0.40), moderate (r = 0.41–0.69), and strong (r = 0.70–1.00)18.
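
As an illustration of how these coefficients can be obtained and interpreted, the following sketch assumes that the ICC(1-1) reported in the Results is the one-way random-effects, single-measure coefficient, computed from the between-specimen and within-specimen mean squares. The function names and sample values are hypothetical, and applying the r categories to the absolute value of r is an assumption made here for negative correlations.

```python
from statistics import mean


def icc_1_1(ratings):
    """One-way random-effects, single-measure ICC.

    ratings: list of rows, one row per specimen, each row holding k repeated measurements.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = mean([value for row in ratings for value in row])
    row_means = [mean(row) for row in ratings]
    # Between-specimen and within-specimen mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2 for row, m in zip(ratings, row_means) for value in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def interpret_icc(icc):
    """Categories for the ICC as listed in the text."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


def interpret_r(r):
    """Categories for Pearson's r as listed in the text (assumption: applied to |r|)."""
    magnitude = abs(r)
    if magnitude <= 0.40:
        return "weak"
    if magnitude < 0.70:
        return "moderate"
    return "strong"


# Hypothetical lengths (cm) of three specimens measured in two sessions
sessions = [[1.0, 1.2], [2.0, 2.1], [3.0, 2.9]]
value = icc_1_1(sessions)
print(round(value, 3), interpret_icc(value))
```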

The 95% limits of agreement (LoA) between sessions and between devices expressed the degree of error relative to the mean of the measurements, and these statistics were calculated using the methods described by Bland and Altman11. When the measurements tended to agree, the differences between them were close to zero.

Standard errors of measurement (SEM) were calculated to quantify the range of error of each parameter. The SEM was calculated from the ICCs and SDs for each of the three measurements according to the formula \(SEM = SD\times \sqrt{1-ICC}\). The minimum detectable change (MDC) at a 95% CI was calculated from the SEM values by the formula \(MDC = \sqrt{2}\times 1.96\times SEM\), and reflected the magnitude of change necessary to be confident that a change was not the result of random variation or measurement error. Both SEM and MDC were analyzed according to Bland and Altman12. Furthermore, values of normality (VN) of the sample for all outcome measurements were obtained by the formula \(VN = Mean\pm 1.96\times SD\).
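
The three formulas above can be sketched in a few lines. The function names and the input values are hypothetical and are not taken from the study tables; the example only shows how the quantities relate to each other.

```python
import math


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem_value):
    """Minimum detectable change at a 95% CI: sqrt(2) x 1.96 x SEM."""
    return math.sqrt(2) * 1.96 * sem_value


def normality_values(mean_value, sd):
    """Values of normality: mean -/+ 1.96 x SD, returned as (lower, upper)."""
    return (mean_value - 1.96 * sd, mean_value + 1.96 * sd)


# Hypothetical example: SD = 0.62 cm, ICC = 0.99, mean = 1.57 cm
example_sem = sem(0.62, 0.99)
print(round(example_sem, 3), round(mdc95(example_sem), 3))
print(normality_values(1.57, 0.62))
```

Note that an ICC of exactly 1.00 gives an SEM, and therefore an MDC, of 0 under these formulas.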

Finally, Bland-Altman plots11,12 were constructed to display the agreement between US and caliper. These plots showed the difference between each pair of measurements on the y-axis against the mean of each pair of measurements on the x-axis. Statistical significance was set at a P-value < 0.05 with a 95% CI.
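
A minimal sketch of the quantities behind a Bland-Altman analysis is given here, assuming the usual definition of the 95% LoA as the mean difference ± 1.96 × SD of the differences. The function name and the paired values are hypothetical and do not come from the study data.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return plot coordinates, bias and 95% limits of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]         # y-axis values
    means = [(a + b) / 2 for a, b in zip(first, second)]   # x-axis values
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the differences
    limits = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)
    return {"means": means, "diffs": diffs, "bias": bias, "loa": limits}


# Hypothetical SL lengths (cm) for four specimens
us_cm = [1.50, 1.60, 1.40, 1.55]
caliper_cm = [1.52, 1.58, 1.43, 1.55]
result = bland_altman(us_cm, caliper_cm)
print(round(result["bias"], 4), [round(v, 4) for v in result["loa"]])
```

Plotting `diffs` against `means`, with horizontal lines at the bias and at both limits, gives the plot described above.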

Results

Analysis of reliability of the SL morphology by US between the first and second session by first observer (Table 1) showed excellent intra-rater (ICC(1-1) = 0.992–1.00) and inter-rater reliability (ICC(1-1) = 0.997–0.999) with a strong correlation (r = 0.994–0.998; P < 0.05) for length and thickness measurements. Nevertheless, fair to good intra-rater reliability (ICC(1-1) = 0.545–0.612) and poor inter-rater reliability (ICC(1-1) = 0.279) with a weak non-significant correlation (r = −0.124; P > 0.05) were shown for width measurements. In addition, there were no statistically significant differences (P > 0.05) between sessions.

Table 1 Analysis of reliability of the Spring Ligament dimensions by ultrasound between the first and second session by first observer and normalized values.

Analysis of reliability of the SL dimensions by caliper between the first and second session by first observer (Table 2) showed excellent intra-rater (ICC(1-1) = 0.875–1.00) and inter-rater reliability (ICC(1-1) = 0.958–0.996) with a strong correlation (r = 0.922–0.992; P < 0.001) for length, thickness and width measurements. In addition, there were no statistically significant differences (P > 0.05) between sessions.

Table 2 Analysis of reliability of the Spring Ligament dimensions by caliper between the first and second session by first observer and normalized values.

Analysis of reliability of the SL dimensions by US between the first and second session by second observer (Table 3) showed excellent intra-rater (ICC(1-1) = 0.987–0.999) and inter-rater reliability (ICC(1-1) = 0.995–0.998) with a strong correlation (r = 0.991–0.996; P < 0.001) for length and thickness measurements. Nevertheless, poor to fair intra-rater reliability (ICC(1-1) = 0.276–0.540) and poor inter-rater reliability (ICC(1-1) = 0.213) with a weak non-significant correlation (r = 0.124; P > 0.05) were shown for width measurements. In addition, there were no statistically significant differences (P > 0.05) between sessions.

Table 3 Analysis of reliability of the Spring Ligament dimensions by ultrasound between the first and second session by second observer and normalized values.

Analysis of reliability of the SL dimensions by caliper between the first and second session by second observer (Table 4) showed excellent intra-rater (ICC(1-1) = 0.877–1.00) and inter-rater reliability (ICC(1-1) = 0.996) with a strong correlation (r = 0.936–0.993; P < 0.001) for length, thickness and width measurements. In addition, there were statistically significant differences (P < 0.05) between sessions for length, but not for thickness or width measurements (P > 0.05).

Table 4 Analysis of reliability of the Spring Ligament dimensions by caliper between the first and second session by second observer and normalized values.

Analysis of reliability of the SL dimensions by first observer between US and caliper measurements (Table 5) showed excellent intra-rater reliability (ICC(1-1) = 0.877–0.978) with a strong correlation (r = 0.805–0.957; P < 0.001) for length and thickness measurements. Nevertheless, poor intra-rater reliability (ICC(1-1) = 0.207) with a weak non-significant correlation (r = 0.127; P > 0.05) was shown for width measurements. In addition, there were statistically significant inter-session differences (P < 0.05) between US and caliper measurements for thickness and width, but not for length (P > 0.05).

Table 5 Analysis of reliability of the Spring Ligament dimensions by first observer between ultrasound and caliper measurements and normalized values.

Analysis of reliability of the SL dimensions by second observer between US and caliper measurements (Table 6) showed excellent intra-rater reliability (ICC(1-1) = 0.862–0.996) with a strong correlation (r = 0.781–0.993; P < 0.001) for length and thickness measurements. Nevertheless, poor intra-rater reliability (ICC(1-1) = 0.232) with a weak non-significant correlation (r = −0.104; P > 0.05) was shown for width measurements. In addition, there were statistically significant inter-session differences (P < 0.05) between US and caliper measurements for thickness, but not for length or width (P > 0.05).

Table 6 Analysis of reliability of the Spring Ligament dimensions by second observer between ultrasound and caliper measurements and normalized values.

Analysis of inter-session reliability of the SL dimensions by US between first and second observer (Table 7) showed excellent inter-rater reliability (ICC(1-1) = 0.938–0.994) with a strong correlation (r = 0.893–0.989; P < 0.001) for length, thickness and width measurements. Nevertheless, there were statistically significant inter-rater differences (P < 0.05) between first and second observer for width measurements, but not for length or thickness measurements (P > 0.05).

Table 7 Analysis of reliability of the Spring Ligament dimensions by ultrasound between inter-session first and second observer and normalized values.

Analysis of inter-session reliability of the SL dimensions by caliper between first and second observer (Table 8) showed excellent inter-rater reliability (ICC(1-1) = 0.825–0.998) with a strong correlation (r = 0.725–0.998; P < 0.001) for length, thickness and width measurements. In addition, there were no statistically significant inter-rater differences (P > 0.05) between first and second observer for length, thickness or width measurements.

Table 8 Analysis of reliability of the Spring Ligament dimensions by caliper between inter-session first and second observer and normalized values.

Analysis of inter-session reliability and correlation of the SL dimensions between US and caliper measurements for both observers (Table 9) showed excellent inter-rater reliability (ICC(1-1) = 0.911–0.966) with a strong correlation (r = 0.852–0.937; P < 0.001) for length, thickness and width measurements. In addition, there were no statistically significant inter-session differences (P > 0.05) between US and caliper measurements for length, thickness or width.

Table 9 Analyses of reliability and correlation of the SL dimensions between inter-session US and caliper measurements for both observers.

The LoA (95% CI) of the measurements using both devices, US and caliper, showed values for all dimensions that tended towards almost perfect agreement, showing no variability. Figures 3–5 show the Bland-Altman plots for length, thickness and width dimensions, respectively, between US and caliper measurements. For each variable and almost every specimen, the difference between the devices fell within the 95% CI of all measurements.

Figure 3

Bland-Altman plot comparing ultrasound and caliper devices to measure length of Spring Ligament in each foot specimen.

Figure 4

Bland-Altman plot comparing ultrasound and caliper devices to measure thickness of Spring Ligament in each foot specimen.

Discussion

Several investigations of SL dimensions have used magnetic resonance imaging (MRI) to evaluate the anatomy of this structure in cadaveric feet6,19,20; in particular, Mengiardi et al. accurately described the SL complex in asymptomatic cadaveric feet.

Although both US and caliper measurements of SL dimensions have previously been carried out in cadavers5,6,7,8, our study may be considered the first to show the following for all SL dimensions in cadaveric feet: excellent inter-session and inter-rater reliability (ICC: US = 0.825–0.990; caliper = 0.825–0.998; US vs caliper = 0.911–0.966)17; absolute accuracy with adequate SEM (US = 0–0.025 cm; caliper = 0–0.030 cm), MDC (US = 0–0.069 cm; caliper = 0–0.083 cm) and VN (US = 0.44 [0.17] to 1.57 [0.62] cm; caliper = 0.41 [0.19] to 1.58 [0.62] cm) values12; almost perfect agreement according to the 95% CI LoA (US = −0.01 [−0.12–0.10] to 0.03 [−0.16–0.23]; caliper = −0.003 [−0.02–0.01] to 0.05 [−0.50–0.60]; US vs caliper = 0.03 [−0.15–0.10] to 0.04 [−0.24–0.31]) and the distribution of the Bland-Altman plots (Figs 3–5)11,12; and strong correlations (r: US = 0.893–0.989; caliper = 0.725–0.998; US vs caliper = 0.852–0.937)18 between caliper and US.

Figure 5

Bland-Altman plot comparing ultrasound and caliper devices to measure width of Spring Ligament in each foot specimen.

According to the repeatability analyses10,11,12,13, our measurements showed good repeatability (P-value > 0.05) between the inter-session values of the first and second observers for the SL dimensions measured by US (Table 7), by caliper (Table 8) and in the comparison between both tools (Table 9), except for the SL width dimension measured with US (P-value = 0.019). Although SL width dimensions should be considered with caution because of these differences in US repeatability, to the authors’ knowledge our study may be considered the first to provide reliability, absolute accuracy, correlation and repeatability data for the SL width dimension measured by US, since prior US reliability studies mainly focused on SL length and thickness5,6,7,8.

In addition, the MDC values for SL length (US = 0.069 cm versus caliper = 0.083 cm), thickness (US = 0 cm versus caliper = 0.021 cm) and width (US = 0.013 cm versus caliper = 0 cm) showed that US measurements presented a higher absolute accuracy, with lower MDC values, than caliper measures for SL length and thickness, while the caliper displayed greater absolute accuracy, with a lower MDC, for SL width. Because the MDC represents the magnitude of change necessary to be confident that a difference is not the result of random variation or measurement error12, these MDCs may be considered as cut-off reference values for detecting changes in SL dimensions secondary to anatomic abnormalities5,6,7,8, ultrasound-guided invasive procedures9, and the course of ligament injuries after treatment21,22.

In accordance with our findings suggesting that these two techniques may be accurate for determining SL dimensions in human cadaveric feet, Harish et al. showed that US may be an effective imaging tool to evaluate SL abnormalities in patients with symptomatic posterior tibial tendon conditions, using MRI as the gold standard23. In addition, Crim24 stated that MRI may be considered the first-line procedure for the assessment of SL conditions. Nevertheless, our study did not evaluate US and caliper measurements in the presence of SL conditions, whereas US and MRI have already been compared with excellent findings23. As a future line of research, we propose that the reliability of both US and caliper should be studied in the presence of SL pathologies.

The present study provides a technical ultrasound evaluation of SL dimensions, compared with caliper measures as the gold standard, which may be used as a reference for ultrasound-guided procedures in formaldehyde-embalmed human cadavers9. Future studies should consider these procedures in fresh-frozen cadavers as well as in vivo, in healthy subjects and in patients with SL injuries21,22.

Several limitations should be recognized regarding our approach to anatomical dissection and US procedures. We could not determine the morphology and anatomic variations of the whole SL complex, and further investigation is needed in this field. First, only 2 observers were compared in the present study, and future studies should include more observers for better accuracy. Second, echogenicity changes could have affected the ability to perform the ultrasound measurements of ligament morphology, especially for the width dimension, which showed worse accuracy in the present study. The tissues had been infused with formalin for preservation, and this procedure can lead to asymmetric contraction of the tissue secondary to its anisotropic nature9.

Conclusion

Both US and caliper could be recommended for evaluating all SL dimensions in cadavers because of their excellent reliability and strong correlation, although width dimensions should be interpreted with caution because of differences in US repeatability.