Introduction

Osteoarthritis (OA) is a highly prevalent, disabling and costly disease which may occur in smaller joints such as those of the hands, but most often manifests in the larger weight-bearing joints including knees and hips [1, 2]. Knee OA is characterized by structural changes in cartilage (detectable either on X-ray as joint space narrowing or, with greater specificity, on magnetic resonance imaging [MRI]), bone (e.g., osteophytes) and/or menisci; symptoms and signs may include joint pain, stiffness, tenderness, crepitus, limitation of movement and effusion [3]. Diagnosis can be made on the basis of radiographic evidence of structural defects without symptoms (radiographic OA), radiographic evidence with pain or other symptoms (symptomatic radiographic OA), or, often, clinical examination [4]. Based on nationally representative data from the 2015 Medical Expenditure Panel Survey, the self-reported prevalence of OA among noninstitutionalized U.S. adults was estimated at 10.5% (25.6 million) [1]. OA is more prevalent at older ages and among people with obesity, and thus constitutes an increasing public health burden. In the U.S., the adjusted incremental annual healthcare cost and wage loss among adults with self-reported OA vs. those without OA resulted in an estimated national excess cost totaling $46.7 billion in 2015 [1].

In addition to radiographic assessments of structural knee OA such as the coarse-grained Kellgren-Lawrence (KL) grade [5], previously the de facto standard, or modern MRI measures of structural OA such as the fine-grained OA-COM scoring system [6], various clinical signs are assessed when determining OA. Clinical signs of OA include (among others) medial and lateral tibiofemoral (MTF/LTF) tenderness and patellofemoral (PF) grind [7,8,9]. Tenderness (pain on palpation of a specific part of the joint) and PF grind are not well understood with respect to structural defects [10]. Structural defects can coincide with, or in some cases precede by years, the onset of physical symptoms such as tenderness or grind. Developing a better understanding of how structural changes due to OA are associated with present or future symptoms could help guide preemptive strategies aimed at reducing the prevalence or incidence of those symptoms, and therefore presents a valuable research question. While the literature is somewhat deeper on the question of structural correlates of pain on movement in knee OA, there have been relatively few studies investigating structural correlates of knee joint tenderness (on palpation) or PF grind, and the structural measures in these studies were generally aggregated, for example as “tibiofemoral OA” or “patellofemoral OA” [7]. However, pertinent structural defects related to OA may occur in various specific structures (e.g., cartilage/osteophytes/meniscus) and within various specific compartments (medial/lateral, anterior/posterior, tibiofemoral/patellofemoral/trochlear groove). The data utilized in our study include such specificity. Improving our understanding of the structure-symptom relationship by analyzing these lesser-studied symptoms versus highly specific structural variables would be immediately beneficial to OA researchers and, in future, could also help guide clinical interventions aimed at reducing symptomatic disease [10].

The purpose of this study is two-fold. First, we seek to elucidate which (if any) MRI knee joint scores on cartilage, osteophytes and/or meniscal damage within a multitude of subregions are associated cross-sectionally with prevalent MTF/LTF knee joint tenderness and/or PF grind. Secondly, we explore structural predictors of 3-year incident (new onset) tenderness and grind.

Materials and methods

Ethics approval

This study was conducted in accordance with the declaration of Helsinki and was approved by the Clinical Research Ethics Board of the University of British Columbia. All participants gave written informed consent at all three time points.

Data collection

Source data came from a longitudinal study conducted in Vancouver, Canada, [11] a population-based cohort of individuals aged 40 to 79 with knee pain “on most days of the month at any time in the past and any pain in the past 12 months.” Data collection has been previously described [12, 13]. The clinical examination was performed by an experienced rheumatologist (JC). We have previously reported in this cohort that, based on MRI cartilage damage and X-ray findings, 13% had no OA (KL < 2 and no cartilage damage), 49% had pre-radiographic OA (cartilage damage but KL < 2), and 38% had radiographic OA (KL ≥ 2) [13]. This cohort enrolled 255 individuals, stratified by age decade and sex in roughly equal group sizes to ensure adequate sample size across the age-sex spectrum [14]. Baseline visits occurred between 2002 and 2005. In addition to the baseline cycle, two follow-up cycles were undertaken, at stratum-sampling-weighted mean 3.3 (SD 0.6) and 7.5 (SD 0.6) years. The present study uses the baseline sample (N = 255) for the cross-sectional modelling, as well as the intersection of the first and second follow-up cycles (N = 108 × 2 = 216) for longitudinal modelling.

The study knee was the more painful knee at baseline. X-rays were obtained using a weight-bearing fixed-flexion posteroanterior view with the SynaFlexer (BioClinica Inc., Newark, CA, USA) positioning frame, and a skyline view in the supine position [15]. Radiographs were read blinded to clinical information by two independent readers for KL 0–4 grading [5]. Previous studies using these data have demonstrated good interrater reliability (ICC = 0.79) [12]. Differences in readings were adjudicated by consensus between both readers. MRIs were acquired on a GE 1.5 T magnet at a single centre using a transmitter–receiver extremity knee coil. The imaging protocol included four MRI sequences, as previously described [13, 14]. MRIs were scored by a board-certified musculoskeletal radiologist (AG) who was blinded to clinical, radiographic, and time sequence information. Cartilage was scored in 6 subregions: lateral and medial femur, lateral and medial tibia, patella and trochlear groove. The trochlear groove was delineated from the weight-bearing surfaces of the femur by oblique lines, tangent to the anterior tips of the anterior horns of the medial and lateral menisci [14]. Cartilage was graded on a 0–4 semi-quantitative scale based on the following definitions, previously described by Disler et al. [16]: 0: normal, 1: abnormal signal without cartilage contour defect, 2: contour defect of < 50% cartilage thickness, 3: contour defect of 50–99% cartilage thickness, 4: 100% cartilage contour defect with subjacent bone signal abnormality. Grades 0 and 1 were collapsed because grade 1 represents signal hyperintensity on T2-weighted images of indeterminate significance; hence the analysis variables ranged from 0 to 3. Osteophytes (defined as osteo-cartilaginous protrusions growing at the margins of osteoarthritic joints from a process that involves endochondral ossification) were scored using criteria described in Hunter et al. [17].
Osteophytes (0: absent, 1: small, 2: moderate, 3: large) were scored in 8 regions: lateral and medial femur, lateral and medial tibia, and lateral, medial, superior and inferior patella. Meniscal damage was scored as: 0: normal, 1: intra-substance signal, 2: tear. Scores 0 and 1 were collapsed; hence the analysis variables ranged from 0 to 1. Meniscal damage was scored in the following 6 regions: lateral anterior, lateral body, lateral posterior, medial anterior, medial body and medial posterior. Intra-rater reliability analyses were previously performed on the scoring of each surface within each feature. The ranges of intraclass correlation coefficients (ICCs) across regions were: cartilage 0.84–1.00, osteophytes 0.77–0.89, meniscus 0.60–0.83 [8].
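The collapsing of MRI scores described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the exact recoded values are an assumption: the text states only that the lowest two categories were merged and gives the resulting ranges (0 to 3 for cartilage, 0 to 1 for meniscus), so the sketch simply shifts the remaining grades down by one.

```python
def collapse_cartilage(grade: int) -> int:
    """Merge cartilage grades 0 and 1, giving an analysis score from 0 to 3.

    Assumption: grades 2-4 are shifted down by one (the source states only the range).
    """
    if grade not in range(5):
        raise ValueError(f"cartilage grade must be 0-4, got {grade}")
    return max(grade - 1, 0)


def collapse_meniscus(score: int) -> int:
    """Merge meniscal scores 0 and 1, giving a binary variable (1 = tear)."""
    if score not in range(3):
        raise ValueError(f"meniscal score must be 0-2, got {score}")
    return 1 if score == 2 else 0


# Recode every possible raw value to show the full mapping.
cartilage_analysis = [collapse_cartilage(g) for g in range(5)]
meniscus_analysis = [collapse_meniscus(s) for s in range(3)]
print(cartilage_analysis)  # [0, 0, 1, 2, 3]
print(meniscus_analysis)   # [0, 0, 1]
```

Under this recoding, a one-unit increase in the cartilage analysis variable corresponds to one step among the categories normal or signal only, < 50% defect, 50–99% defect, and full-thickness defect.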

MTF tenderness (ICC = 0.94) and LTF tenderness (ICC = 0.85) were assessed by palpating the medial or lateral tibiofemoral joint line while the participant sat with legs hanging over the edge of the examination bed. PF grind (ICC = 0.94) was assessed with the participant lying on the examination bed with legs extended; the participant was asked to contract the quadriceps muscle while the examiner applied downward and inferior pressure on the patella. In both tests, pain (not grinding) was the positive finding.

Statistical methods

To obtain population-representative results, a baseline sample weight was developed as the ratio of the knee-pain population age-sex distribution to the baseline knee-pain sample distribution; cross-sectional (prevalence) models were weighted with this baseline sample weight. A sample weight for the longitudinal sample was developed as the baseline sample weight multiplied by the ratio of the baseline sample proportion in a given age-sex cell to the longitudinal sample proportion in that cell; longitudinal (incidence) models were weighted with this longitudinal sample weight.
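The weight construction described above can be sketched in a few lines of Python. All cell proportions below are hypothetical illustration values, not study data, and the two broad age bands and the function names are assumptions made for brevity (the cohort itself was stratified by age decade and sex).

```python
def baseline_weights(pop_prop, base_prop):
    """Weight per age-sex cell: population proportion over baseline sample proportion."""
    return {cell: pop_prop[cell] / base_prop[cell] for cell in pop_prop}


def longitudinal_weights(base_w, base_prop, long_prop):
    """Baseline weight times (baseline sample proportion / longitudinal sample proportion)."""
    return {cell: base_w[cell] * base_prop[cell] / long_prop[cell] for cell in base_w}


# Hypothetical age-sex cell proportions (each dictionary sums to 1).
population = {("40-59", "F"): 0.30, ("40-59", "M"): 0.28,
              ("60-79", "F"): 0.24, ("60-79", "M"): 0.18}
baseline_sample = {cell: 0.25 for cell in population}  # roughly equal strata
longitudinal_sample = {("40-59", "F"): 0.30, ("40-59", "M"): 0.25,
                       ("60-79", "F"): 0.25, ("60-79", "M"): 0.20}

w_base = baseline_weights(population, baseline_sample)
w_long = longitudinal_weights(w_base, baseline_sample, longitudinal_sample)

for cell in population:
    print(cell, round(w_base[cell], 3), round(w_long[cell], 3))
```

Algebraically, the baseline sample proportion cancels in the longitudinal weight, so under this construction each longitudinal weight equals the population proportion divided by the longitudinal sample proportion for that cell. In other words, the adjustment re-aligns the retained sample with the same population age-sex distribution.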

For our first objective (modelling cross-sectional association), we used baseline data to fit age-sex-BMI adjusted logistic models predicting prevalent MTF and LTF knee joint tenderness as well as PF grind versus each relevant cartilage/osteophyte/meniscus (COM) predictor in separate models. Relevant COM predictors for MTF tenderness included medial femoral cartilage (MFC), medial tibial cartilage (MTC), medial femoral osteophytes (MFO), medial tibial osteophytes (MTO), medial anterior meniscus (MAM), medial meniscal body (MBM), and medial posterior meniscus (MPM). The relevant COM predictors for LTF tenderness included the lateral equivalents of those medial predictors. The relevant COM predictors for PF grind included patellar cartilage (PC), trochlear groove cartilage (TC), medial patellar osteophytes (MPO), lateral patellar osteophytes (LPO), superior patellar osteophytes (SPO) and inferior patellar osteophytes (IPO). In addition to the age-sex-BMI adjusted single COM predictor models, we also fit fully adjusted multivariable models including as predictors all relevant COM predictors together, plus age, sex and BMI.

For our second objective (modelling 3-year incidence), we combined baseline to 3-year follow-up with 3-year to 7-year follow-up (two records per subject), and fit binary generalized estimating equations (GEE) models predicting 3-year incident MTF and LTF knee joint tenderness as well as 3-year incident PF grind versus the relevant COM predictors at “baseline” for each cycle (i.e., either actual baseline or 3 years depending on the cycle represented on a given record). Incidence models were also fit as age-sex-BMI adjusted single COM predictor models, as well as fully adjusted models including as predictors all relevant COM predictors together, plus age, sex and BMI. All longitudinal models were also adjusted for individual follow-up time between cycles.

Model fit was assessed via the Hosmer and Lemeshow goodness of fit test [18]. The predictive utility of each model was assessed via the area under the receiver operating characteristic (ROC) curve (AUC).
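The two-records-per-subject layout used for the incidence models can be sketched as follows. This is only an illustration of the data structure under stated assumptions: the `Visit` class, the field names and the example values are hypothetical, and because the text does not state how subjects already symptomatic at the start of an interval were handled, the sketch flags their outcome as undefined (`None`) rather than excluding or recoding them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visit:
    years: float   # years since baseline at this visit
    symptom: bool  # e.g., MTF tenderness present at this visit
    score: int     # one MRI predictor at this visit (e.g., MFC analysis score)


def stack_intervals(subject_id: int, visits: List[Visit]) -> List[dict]:
    """Build one record per consecutive pair of visits (cycle start -> cycle end)."""
    records = []
    for cycle, (start, end) in enumerate(zip(visits, visits[1:]), start=1):
        # New onset is defined only when the symptom was absent at cycle start.
        incident: Optional[int] = None if start.symptom else int(end.symptom)
        records.append({
            "id": subject_id,
            "cycle": cycle,
            "score_at_start": start.score,  # predictor taken at the cycle's own start
            "followup_years": round(end.years - start.years, 2),
            "incident": incident,
        })
    return records


# Hypothetical subject seen at baseline, 3.3 and 7.5 years.
visits = [Visit(0.0, False, 1), Visit(3.3, True, 2), Visit(7.5, True, 2)]
records = stack_intervals(1, visits)
for record in records:
    print(record)
```

Records sharing the same `id` are the correlated observations that a GEE model accounts for, and `followup_years` corresponds to the adjustment for individual follow-up time between cycles.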

Due to theoretical considerations (lack of a plausible biological pathway), in the primary models we did not regress PF grind versus medial or lateral tibiofemoral MRI scores. However, in a sensitivity analysis, we fit models predicting PF grind versus each of the medial and lateral predictor sets listed above.

Analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).

Results

Table 1 describes the baseline predictors and covariates for cross-sectional and longitudinal analyses, respectively weighted N = 255.0 and 217.3 (2 times the weighted intersection between cycles 2 and 3). In the cross-sectional sample, weighted n = 143.6 (56.3%) were female, weighted mean age was 56.7 (standard deviation [SD] = 10.4) and mean BMI was 26.5 (SD = 4.9). Radiographic OA (defined as KL grade ≥ 2) was observed in 98.0 (38.4%) subjects. Abnormal grades (> 0) were observed for MFC at 66.5%, MTC at 50.5%, MFO at 61.7%, MTO at 79.0%, MAM at 4.3%, MBM at 25.4%, MPM at 33.5%, LFC at 57.8%, LTC at 25.6%, LFO at 51.0%, LTO at 69.2%, LAM at 8.5%, LBM at 8.3%, LPM at 5.9%, PC at 61.5%, TC at 52.4%, MPO at 47.6%, LPO at 73.1%, SPO at 86.2% and IPO at 38.6%. In the longitudinal sample, 119.8 (55.1%) were female, mean age was 57.2 (SD = 9.0) and mean BMI was 26.1 (SD = 4.2). Radiographic OA was observed in 99.2 (45.7%) subjects. Abnormal grades (> 0) were observed for MFC at 62.7%, MTC at 44.8%, MFO at 64.9%, MTO at 65.6%, MAM at 9.5%, MBM at 28.7%, MPM at 38.3%, LFC at 33.8%, LTC at 37.5%, LFO at 91.8%, LTO at 75.4%, LAM at 8.7%, LBM at 14.4%, LPM at 11.4%, PC at 60.8%, TC at 51.4%, MPO at 73.7%, LPO at 86.4%, SPO at 88.6% and IPO at 56.1%.

Table 1 Baseline predictors and covariates for cross-sectional and longitudinal analyses, n (%) or mean (SD)

Table 2 lists the prevalence of medial/lateral tibiofemoral tenderness and patellofemoral grind at baseline, 3 and 7 years, respectively weighted N = 255.0, 108.6 and 108.6. At baseline, medial tibiofemoral tenderness was observed in 143.7 (56.4%) subjects, lateral tibiofemoral tenderness in 77.8 (30.5%) subjects, and patellofemoral grind in 54.5 (21.4%) subjects. At 3 years, medial tibiofemoral tenderness was observed in 47.2 (43.5%) subjects, lateral tibiofemoral tenderness in 27.2 (25.1%) subjects, and patellofemoral grind in 15.2 (14.0%) subjects. At 7 years, medial tibiofemoral tenderness was observed in 44.7 (41.1%) subjects, lateral tibiofemoral tenderness in 17.2 (15.8%) subjects, and patellofemoral grind in 14.8 (13.6%) subjects.

Table 2 Medial/lateral tibiofemoral tenderness and patellofemoral grind at baseline, 3 and 7 years, n (%)

Table 3 lists the cross-sectional (prevalence) model odds ratios with 95% confidence intervals (CIs), both age-sex-BMI adjusted as well as fully adjusted. In the fully adjusted model, significant predictors of MTF tenderness included MFC (fully adjusted odds ratio [aOR] 1.84; 95% CI 1.11, 3.05), female sex (aOR = 3.05; 1.67, 5.58) and BMI (aOR = 1.53 per 5 units BMI; 1.10, 2.11). The AUC of the fully adjusted MTF prevalence model was 0.689 (95% CI 0.625, 0.754). Significant cross-sectional predictors of prevalent LTF tenderness included only female sex (aOR = 2.18; 1.22, 3.90). The AUC of the fully adjusted LTF prevalence model was 0.637 (95% CI 0.563, 0.711). There were no significant cross-sectional predictors of prevalent PF grind in the fully adjusted model. However, MPO was predictive of PF grind in age-sex-BMI adjusted models, and it remained borderline significant in the fully adjusted model. The AUC of the fully adjusted PF prevalence model was 0.625 (95% CI 0.538, 0.712).

Table 3 Cross-sectional (prevalence) model odds ratios with 95% confidence intervals

Table 4 lists the longitudinal (incidence) model odds ratios with 95% CIs, both age-sex-BMI adjusted as well as fully adjusted. There were no significant predictors of 3-year incident MTF tenderness. The AUC of the fully adjusted MTF incidence model was 0.731 (95% CI 0.613, 0.849). Significant predictors of 3-year incident LTF tenderness included only female sex (aOR = 3.83; 1.25, 11.77). The AUC of the fully adjusted LTF incidence model was 0.756 (95% CI 0.641, 0.871). Significant predictors of 3-year incident PF grind included only LPO in the fully adjusted model (aOR = 4.82; 1.69, 13.77). However, in the age-sex-BMI adjusted model, PC was also a predictor. The AUC of the fully adjusted PF incidence model was 0.815 (95% CI 0.715, 0.915).

Table 4 Longitudinal (incidence) model odds ratios with 95% confidence intervals

Finally, in sensitivity analyses, we found a significant cross-sectional positive association between PF grind and LFO (aOR = 2.41; 1.42, 4.10), but this was offset by a negative association with LTO in the same model (aOR = 0.51; 0.29, 0.90), indicating potential problems with fit.

Discussion

We have explored potential MRI predictors (cartilage, osteophytes and meniscus) of prevalent and 3-year incident medial and lateral tibiofemoral knee joint tenderness and patellofemoral grind. Significant predictors of prevalent MTF tenderness included medial femoral cartilage, female sex and BMI. Predictors of prevalent LTF tenderness included female sex. There were no predictors of prevalent PF grind in the fully adjusted model. However, medial patellar osteophytes were predictive in the age-sex-BMI adjusted model. There were no predictors of 3-year incident MTF tenderness. Predictors of 3-year incident LTF tenderness included female sex. Predictors of 3-year incident PF grind included lateral patellar osteophytes. In the age-sex-BMI adjusted model, patellar cartilage was also a predictor.

There have been relatively few studies investigating structural correlates of knee joint tenderness (on palpation) or PF grind, and the structural measures in these studies were generally aggregated. Parsons et al., for example, published one such (cross-sectional) study, but considered only the aggregated structural measures “tibiofemoral OA” or “patellofemoral OA” evaluated on X-ray [7]. However, pertinent structural defects related to OA may occur in various specific structures (e.g., cartilage/osteophytes/meniscus) and within various specific subregions (medial/lateral, anterior/posterior, tibiofemoral/patellofemoral). The data utilized in our study include such specificity. Despite these differences, some comparisons may still be made to previous literature in this area. Parsons et al. investigated tibiofemoral tenderness and crepitus. They found a positive association between tibiofemoral OA and tenderness (OR = 7.8). While their reported effect is ostensibly higher than our estimated association between medial femoral cartilage grade and medial tibiofemoral tenderness (aOR = 1.84), our reported effect is per grade (on a 4-point scale). Applying our model to 3- and 4-point increases in medial femoral cartilage grade (obtained by raising the per-grade aOR to the corresponding power) results in estimates similar to that of Parsons et al. On the lateral side, however, we did not observe a significant adjusted association between MRI scores and lateral tibiofemoral tenderness. As Parsons et al. did not differentiate between medial and lateral symptoms or structural defects, it is unclear whether our lateral results contradict those data. Although Parsons et al. did not model patellofemoral grind per se, they did model crepitus. They found a significant association between crepitus and tibiofemoral OA (OR = 3.9), but no significant association with patellofemoral OA.
Similarly, in our study, we did not find an association between PF grind and MRI measures of patellar or trochlear groove cartilage or osteophytes, and due to theoretical considerations (lack of a plausible biological pathway), we did not regress PF grind versus tibiofemoral MRI scores. However, in sensitivity analyses, we did find a significant cross-sectional association between PF grind and LFO (aOR = 2.41), albeit offset by a negative association with LTO in the same model (aOR = 0.51), indicating potential problems with fit. In another cross-sectional study of structural predictors of knee joint tenderness, Saitu et al. utilized ultrasound to assess structural defects, and fit logistic regression models to assess association with joint line tenderness (JLT) [19]. They found significant associations between JLT and female sex (OR = 11.87) as well as thinner cartilage (OR = 0.12 per mm of cartilage thickness). Female sex was also significantly predictive in both our cross-sectional tenderness models. Inverting their OR for cartilage thickness produces an OR = 8.33, which appears higher than our corresponding OR of 1.84 for MTF tenderness vs. MFC. However, Saitu et al. measured cartilage thickness in mm, with an interquartile range (IQR) of just 0.5 mm, and a between-group difference in medians of just 0.2 mm, yet they reported their OR on a per-mm scale. As such, a more comparable estimate to our per-grade OR could be obtained by taking the fourth root of their reported OR, which yields an OR = 1.70, very close to ours. As in our study, Saitu et al. did not find an association between JLT and osteophytes in their overall sample (though they did observe an association in a selected pre-radiographic subsample). In another study, Wang et al. considered the same two outcome variables as our study, namely joint line tenderness and patellofemoral grind, albeit without differentiating medial and lateral tenderness [9].
They found that PF grind (but not tenderness) was associated with cartilage volume loss (assessed on MRI) cross-sectionally, which is the opposite of our finding in which cartilage defects predict tenderness but not PF grind.
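The odds ratio rescalings used in the comparisons above (raising a per-grade OR to a power, inverting an OR, and taking a fourth root) all follow from the same rule: if the log-odds are linear in the predictor, the OR for a change of k units is the per-unit OR raised to the power k. The following Python sketch reproduces that arithmetic from the published values quoted in the text; the helper name `rescale_or` is hypothetical.

```python
import math


def rescale_or(odds_ratio: float, units: float) -> float:
    """OR for a change of `units`, given the OR per one unit (log-odds assumed linear)."""
    return math.exp(units * math.log(odds_ratio))


or_per_grade = 1.84                       # MTF tenderness vs. MFC, per cartilage grade
or_3_grades = rescale_or(or_per_grade, 3)
or_4_grades = rescale_or(or_per_grade, 4)

or_per_mm = 0.12                          # Saitu et al., per mm of cartilage thickness
or_inverted = rescale_or(or_per_mm, -1)   # per mm of cartilage thinning
or_fourth_root = rescale_or(or_inverted, 0.25)

print(f"3 grades: {or_3_grades:.2f}")        # 3 grades: 6.23
print(f"4 grades: {or_4_grades:.2f}")        # 4 grades: 11.46
print(f"inverted: {or_inverted:.2f}")        # inverted: 8.33
print(f"fourth root: {or_fourth_root:.2f}")  # fourth root: 1.70
```

The OR of 7.8 reported by Parsons et al. falls between the 3-grade and 4-grade values, and the fourth root of the inverted Saitu et al. estimate corresponds to a 0.25 mm difference in cartilage thickness.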

Longitudinal studies predicting new-onset knee joint tenderness or PF grind are fewer still in the existing literature; in fact, we have not identified any such studies. In the closest such study (also cited above), Wang et al. studied “fluctuating” (present at only one of two time points) and “persistent” (present at each of two time points) patterns of knee joint tenderness and PF grind, versus changes in structural defects [9]. It is worth noting that the structural defects considered in that study were aggregated: “cartilage volume loss” (on MRI) and “worsening of radiographic osteoarthritis” (which they used to describe worsening osteophytes on X-ray). Wang et al. found no association between pattern of knee joint tenderness and change in structural defects (not exactly the same research question as ours in either the dependent or independent variables, but nonetheless a similarly null finding). Wang et al. did find an association between pattern of PF grind and rate of cartilage loss over time, but again this is not identical to our research question.

The strengths and limitations of our study deserve comment. While the population-based design is a strength, the target population is not the general population, but people with baseline knee pain, aged 40–79 at baseline, who were followed up over an average of 7.5 years. However, considering that our objective was to explore potential MRI predictors (cartilage, osteophytes and meniscus) of prevalent and 3-year incident medial and lateral tibiofemoral knee joint tenderness and patellofemoral grind, symptoms of an inherently painful disease (OA), this restriction should have limited impact. Furthermore, our inclusion of mild but persistent knee pain without diagnosed OA may present opportunities to develop preemptive strategies aimed at reducing tenderness and grind both in the present and future, in those at risk of such symptoms (e.g., those with pre-radiographic OA). Another limitation of the models explored herein is that their application would require an MRI, which can be expensive. However, an associated strength of this study is precisely the fact that it is based on MRI data, and as such covers a wide range of specific structures (e.g., cartilage/osteophytes/meniscus), various specific compartments (medial/lateral, anterior/posterior, tibiofemoral/patellofemoral/trochlear groove), and up to four scoring levels per compartment. Another important limitation is the relatively small sample size utilized in this study: the baseline sample (N = 255) for cross-sectional modelling, and the intersection of the first and second follow-up cycles (N = 108 × 2 = 216) for longitudinal modelling. Future validation studies with larger data sets will be needed to confirm our findings.

We have explored potential MRI predictors (cartilage, osteophytes and meniscus) of prevalent and 3-year incident medial and lateral tibiofemoral knee joint tenderness and patellofemoral grind. Cross-sectionally, predictors of MTF tenderness included MFC, female sex and BMI, but nothing predicted 3-year incidence. Female sex alone predicted cross-sectional and longitudinal LTF tenderness. While nothing predicted PF grind in fully adjusted cross-sectional models, LPO was a predictor of 3-year incidence. These findings could potentially help to guide preemptive strategies aimed at reducing tenderness and grind both in the present and future (3-year incidence), two common symptoms of osteoarthritis.