Background

Osteoarthritis (OA) is a highly prevalent chronic condition with marked implications for affected individuals and public health care [1, 2]. Current treatment approaches focus on controlling symptoms since there are no interventions that have yet been approved for modifying the course of the disease or improving structural alterations in affected joint tissues [3]. Non-pharmacological and non-surgical methods such as education and self-management, exercise, weight loss if overweight or obese, and walking aids as indicated, are widely recommended and seen as first-line treatment [4].

OA is understood today as the clinical and pathologic outcome of a range of disorders that result in structural and functional failure of synovial joints. Joint imaging, particularly magnetic resonance imaging (MRI), has evolved rapidly in recent years due to technical advances and their application to clinical research, which has led to abundant evidence regarding the natural history of the disease [5]. While radiography depicts structural bony tissue changes only in advanced stages of OA, MRI is able to visualize all involved joint tissues, even in the earliest stages of disease when radiographs appear still normal [6]. Recent data suggest that non-cartilaginous tissue changes in particular play an important role in the onset and progression of OA [7, 8]. MRI-based semi-quantitative (SQ) scoring of knee OA is a valuable method for performing multi-tissue joint assessment in observational cross-sectional and longitudinal studies of OA including clinical trials [9]. SQ scoring enables evaluation of the whole knee joint using MRI acquisition techniques that are commonly applied in a clinical environment [10]. SQ scoring has expanded our understanding of disease onset and progression and plays an increasing role regarding clinical trial design [11, 12].

In the Foundation for the National Institutes of Health (FNIH) Osteoarthritis Biomarkers Consortium study, a nested case–control study based within the larger Osteoarthritis Initiative (OAI) study, the presence and amount of baseline structural tissue damage and worsening of several MRI features from baseline to 24 months were associated with increased odds of progression as defined by pre-determined radiographic, clinical or combined outcomes [8].

The IMI-APPROACH (Innovative Medicines Initiative - Applied Public–Private Research enabling OsteoArthritis Clinical Headway, https://www.approachproject.eu) study is an exploratory, European, 5-centre, 2-year prospective follow-up cohort project [13]. Although currently available cohort studies, like the Dutch CHECK [14] and the US OAI with the FNIH subcohort [8], have increased our knowledge of the disease, these attempts still have not resulted in clearly distinctive phenotypes/endotypes with predictive biomarkers. IMI-APPROACH was designed to prospectively describe pre-identified progressor phenotypes of patients with symptomatic and/or structural knee OA by use of conventional and novel clinical, imaging, and biochemical biomarkers, and to validate and refine a predictive model for progressor phenotypes based on these markers. The recruitment for IMI-APPROACH was based on rankings produced by machine-learning models that were trained using data from existing cohorts to estimate, from demographic data, pain scores, and radiographic features, the likelihood of joint space width loss (so-called s-score) and/or increased or sustained knee pain (p-score) over the course of the study [13, 15]. In addition to this unique selection of participants, the IMI-APPROACH cohort combines a broad spectrum of conventional and novel, explorative, imaging, biochemical, clinical and demographic markers. Modern data science techniques suitable to analyze such extensive datasets will help identify and predict phenotypes/endotypes of OA that share distinct underlying pathobiological mechanisms with their structural and functional consequences, relevant for clinical practice and targeted clinical trials.

Here, we describe the scoring methodology and baseline cross-sectional frequencies of structural joint tissue damage in IMI-APPROACH participants based on SQ MRI assessment. Furthermore, we describe the extent of changes over a 24-month period for the different MRI features in the overall sample and compare subgroups with and without radiographic knee OA, which may serve as a potential reference for future studies focusing on MRI features and progression over similar observational periods. Finally, we report cross-sectional and longitudinal reliability of MRI assessments.

Methods

Study characteristics

IMI-APPROACH is an observational, longitudinal study that enrolled 297 OA patients at five clinical centers in Europe [13, 15]. The study participants were recruited from five existing observational OA cohorts (CHECK (Utrecht, The Netherlands) [14], HOSTAS (Leiden, The Netherlands) [16], MUST (Oslo, Norway) [17], PROCCOAC (A Coruña, Spain) [18], and DIGICOD (Paris, France) [19]) or from outpatient departments if not enough participants could be recruited from these existing cohorts. Recruitment from these cohorts relied on machine-learning models that were trained using data from the CHECK cohort and the OAI to predict either the probability of increased or sustained knee pain or the probability of structural progression over the next 2 years. These models were then applied to X-ray-based measures and to demographic and clinical data collected at the screening visit to select OA patients with the highest likelihood of having pain and/or structural progression over the course of the study [20]. Beyond the machine learning-based rankings, additional inclusion criteria were the ability to walk unassisted, predominantly tibiofemoral knee OA, and satisfying the clinical American College of Rheumatology (ACR) classification criteria for knee OA. Exclusion criteria included participation in a trial of local therapeutic intervention for index knee OA or potential systemic disease-modifying OA drugs (DMOADs) at the time of inclusion, within six months before inclusion, and/or anticipated during two years of follow-up; surgery of the index knee in the six months before inclusion and/or scheduled or expected surgery of the index knee during follow-up; current pregnancy or planned pregnancy during follow-up; predominantly patellofemoral knee OA; and others such as alternative/additional causes of joint pain, for example, rheumatic symptoms due to malignancies, primary osteochondromatosis or osteonecrosis [13].
After the inclusion and exclusion criteria were checked, an index knee was selected based on ACR criteria. If both knees fulfilled the criteria, patients indicated their own index knee based on severity of complaints; if both knees were affected equally, the right knee was selected as the index knee. Demographic and clinical data, blood and urine samples, and imaging data were collected. Imaging comprised X-ray of the index knee at screening and at 6, 12, and 24 months (X-ray of the contralateral knee only at enrollment and 24 months), MRI of the index knee at enrollment and at the 6-, 12-, and 24-month follow-up visits, and CT of the index knee (to extract bone shape, bone mineral density and texture parameters of subchondral bone architecture; only at enrollment and 24 months).

IMI-APPROACH was conducted in compliance with the protocol, Good Clinical Practice (GCP), the Declaration of Helsinki, and the applicable ethical and legal regulatory requirements (for all countries involved), and is registered under clinicaltrials.gov identifier: NCT03883568 (first submitted date 21/03/2019). All participants have received oral and written information and provided written informed consent.

MRI acquisition and evaluation

MRI of the index knee was acquired at the five clinical centers, with two of the centers using 1.5 T systems (A Coruña: Ingenia CX, Philips Medical Systems, Netherlands; Oslo: Aera, Siemens Healthcare, Germany) and the other centers using 3 T systems (Utrecht: Ingenia or Achieva, Philips Medical Systems, Netherlands; Leiden: Ingenia, Philips Medical Systems, Netherlands; Paris: Skyra, Siemens Healthcare, Germany). The clinical pulse sequence protocol included axial, sagittal and coronal intermediate-weighted fat-suppressed sequences and a T1-weighted coronal turbo spin echo sequence, all of which were used for SQ evaluation. Details of the pulse sequence protocol are presented in Appendix 1. In addition, a sagittal 3D spoiled gradient echo or volume-interpolated gradient echo sequence with selective water excitation or fat suppression was acquired for the quantitative cartilage analysis. This sequence was also used for SQ assessment. After site qualification, an inter-site comparison was performed with three volunteers who had both knees imaged at 4 of the 5 sites (1.5 T MRI: A Coruña and Oslo; 3.0 T: Leiden and Utrecht). For the analysis of test–retest precision, each site asked study participants at the baseline visit whether they would volunteer for one additional MRI acquisition performed at both the baseline and the month 24 visit. Altogether, 26 test–retest MRIs were acquired with repositioning of the knee between scans (patients were allowed but not required to leave the scanner) [21].

One musculoskeletal radiologist (FWR) with 17 years’ experience of SQ assessment of knee OA at the time of reading, blinded to all clinical data and predicted progressor status, read the MRIs according to the MRI Osteoarthritis Knee Score (MOAKS) instrument [22] with knowledge of the chronological order of the scans. The following joint structures were assessed: cartilage morphology, subchondral bone marrow lesions (BMLs), osteophytes, meniscal structural damage and meniscal extrusion, Hoffa-synovitis and effusion-synovitis. For the current study only the baseline and 24-months MRIs were considered.

In addition, within-grade changes were coded that fulfill the definition of a definite visual change but do not fulfill the definition of a full-grade change on the ordinal scales applied [23]. Within-grade changes were applied for cartilage and BML assessment. For cartilage, within-grade changes were coded for the area-extent dimension and the full-thickness dimension of the MOAKS scale, separately.

Reliability

For reliability assessment, 20 MRIs were randomly selected to represent the spectrum of study sites and disease severity. Four knees were chosen from each site. The knees represented the spectrum of baseline disease severity from Kellgren-Lawrence (KL) grade 0 to 4 (four knees for each KL grade). The MRIs were assessed for the purpose of intra-reader reliability by the primary reader (FWR) in random order four weeks after the last study participant completed the MRI acquisitions and at least 6 weeks after the primary readings had been performed. Reliability assessment was performed in chronological order with time point known to the reader for baseline and 24-months including within-grade changes. A second reader with 19 years of experience of SQ MRI assessment of knee OA at the time of reading (AG) read baseline and 24-months follow-up MRI in identical fashion for assessment of inter-observer reliability.

Features assessed and change over time

Cartilage ― MOAKS uses a two-digit score for cartilage assessment (each digit 0–3) that incorporates both area size per subregion (referred to in the following as the “area extent” dimension) and percentage of subregion that is affected by full-thickness cartilage loss (referred to in this analysis as the “full-thickness” dimension). Figure 1 depicts the subregional division for cartilage (and BML) assessment for the femur, tibia and the patella. Frequencies are presented for maximum MOAKS score and number of subregions affected by any cartilage damage on a knee level. In addition, the area extent and full-thickness results are presented separately for the whole knee and on a compartmental level. The number of subregions with worsening (i.e., a higher score at 24 months vs. baseline) was defined for the total MOAKS score and separately for area extent and full-thickness. Change over time on a knee and compartmental level was defined as increase in number of subregions showing any MOAKS cartilage worsening, both including and excluding within-grade changes. Within-grade scoring for cartilage refers to any within-grade change in area or thickness for the total MOAKS score evaluations, and was considered separately for the area extent and full-thickness dimensions. In addition, change was categorized into none vs. any change.
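The knee-level cartilage summaries described above can be sketched in a few lines of Python. This is a minimal illustration only and not the study's analysis code (the analyses were run in SPSS); the function names, the dictionary layout and the example scores are hypothetical, and within-grade changes are not modelled.

```python
from typing import Dict, Tuple

# Hypothetical layout: subregion name -> (area-extent grade, full-thickness grade), each 0-3.
Scores = Dict[str, Tuple[int, int]]


def summarize_cartilage(knee: Scores) -> dict:
    """Baseline summaries for one knee (or for the subregions of one compartment)."""
    return {
        "max_area_extent": max(area for area, _ in knee.values()),
        "max_full_thickness": max(full for _, full in knee.values()),
        # Assumption: "any cartilage damage" means an area-extent grade above 0.
        "n_subregions_damaged": sum(1 for area, _ in knee.values() if area > 0),
    }


def count_worsening(baseline: Scores, followup: Scores) -> dict:
    """Count subregions whose full-grade score is higher at follow-up than at baseline."""
    n_area = n_full = n_any = 0
    for name, (a0, f0) in baseline.items():
        a1, f1 = followup[name]
        n_area += int(a1 > a0)
        n_full += int(f1 > f0)
        n_any += int(a1 > a0 or f1 > f0)
    return {
        "area_extent": n_area,
        "full_thickness": n_full,
        "total_score": n_any,
        "any_change": n_any > 0,  # dichotomized none vs. any change
    }


# Made-up scores for three of the 14 subregions.
baseline = {"cmF": (2, 1), "cmT": (1, 0), "mP": (0, 0)}
followup = {"cmF": (2, 2), "cmT": (2, 0), "mP": (0, 0)}
print(summarize_cartilage(baseline))
print(count_worsening(baseline, followup))
```

Restricting the input dictionary to the subregions of one compartment gives the corresponding compartment-level summaries.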

Fig. 1

Subregional division for cartilage and bone marrow lesion assessment using the MOAKS instrument. Both features are assessed in 14 articular subregions. A. Axial intermediate-weighted fat suppressed image shows subregional division of the patella into the medial (mP) and lateral patella (lP). Note that the patella apex is part of the medial patella. B. Sagittal intermediate-weighted fat suppressed image of the medial compartment shows the three femoral and three tibial subregions. The femur is subdivided into the anterior (amF), central (cmF) and posterior (pmF) subregions. The tibia is subdivided into the anterior (amT), central (cmT) and posterior (pmT) subregions. The lateral compartment is subdivided in corresponding fashion in the sagittal plane (not shown). C. Coronal intermediate-weighted fat suppressed image shows the central femoral and tibial subregions. The tibial S region (subspinous – adjacent to the tibial spines) is not considered for BML and cartilage evaluation

BMLs ― MOAKS assesses BMLs in three dimensions: % of subregion affected by any (ill-defined and/or cystic) BML (0–3), % of subregion that is cystic vs. ill-defined BML (0–3) and number of BMLs per subregion. Here we report only the size component, as this aspect of BML assessment incorporates both the ill-defined and cystic parts of the lesion and is clinically most relevant [24]. Subchondral cysts are only marginally associated with symptoms [25]. BMLs are assessed in the same 14 articular subregions as cartilage, with the exception that the tibial subspinous subregion is assessed in addition for BMLs. That subregion, however, was not considered here because it is not covered by cartilage and lesions in this region are not considered subchondral. Number of subregions affected by any BML and maximum BML score are presented on a knee and compartmental level. Change in overall number of subregions affected by any BML was defined as the difference between the number of subregions affected by any BML at 24 months (size > 0) and the number of subregions affected by any BML at baseline. This was further categorized into improvement, no change, worsening in one subregion, and worsening in two or more subregions. Further, the maximum increase in BML score from baseline to 24 months was determined on a knee and compartmental level. Finally, the number of subregions with worsening and the number of subregions with improvement were determined for full-grade changes only and for full-grade and within-grade changes combined. We classified these measures into any subregions with worsening and any subregions with improvement on a knee and compartmental level.
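The BML change definitions above can be illustrated with a short Python sketch. It is hypothetical throughout (function name, data layout and example grades are not taken from the study) and covers full-grade changes only; clipping the maximum increase at zero for knees without any worsening is an assumption made here.

```python
from typing import Dict


def bml_change(baseline: Dict[str, int], followup: Dict[str, int]) -> dict:
    """Summarize BML size-score change (0-3 per subregion) between two time points."""
    n_base = sum(1 for grade in baseline.values() if grade > 0)
    n_follow = sum(1 for grade in followup.values() if grade > 0)
    diff = n_follow - n_base  # change in number of subregions affected by any BML

    if diff < 0:
        category = "improvement"
    elif diff == 0:
        category = "no change"
    elif diff == 1:
        category = "worsening in one subregion"
    else:
        category = "worsening in two or more subregions"

    deltas = [followup[name] - baseline[name] for name in baseline]
    return {
        "diff_n_subregions": diff,
        "category": category,
        # Assumption: reported as 0 when no subregion worsened.
        "max_increase": max(0, max(deltas)),
        "n_subregions_worsening": sum(1 for d in deltas if d > 0),
        "n_subregions_improving": sum(1 for d in deltas if d < 0),
    }


# Made-up example: one lesion resolves, one appears, one grows.
baseline = {"cmF": 2, "cmT": 0, "mP": 1, "lP": 0}
followup = {"cmF": 3, "cmT": 1, "mP": 0, "lP": 0}
print(bml_change(baseline, followup))
```

In this made-up knee the net number of affected subregions is unchanged although two subregions worsened and one improved, which is why the per-subregion counts are reported alongside the net difference.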

Osteophytes ― MOAKS assesses osteophytes at 12 possible marginal locations of the joint on a scale from 0 to 3. For baseline, number of locations with any osteophytes and the maximum osteophyte score are described for the knee and compartmental level. The change in number of locations affected by any osteophyte was defined as the difference between the number of locations affected by any osteophyte at 24 months (grade > 0) and the number of locations affected by any osteophyte at baseline. This change was classified as no change or any worsening, and by the number of locations affected by change. In addition, change in maximum osteophyte score was defined as the greatest amount of worsening of all affected locations per knee or compartment. This was further dichotomized into any vs. no change in maximum score.

Meniscus ― MOAKS scores meniscus damage from 0 to 8 with grade 1 representing intrameniscal signal but no tear or maceration. Grades 2–5 represent different tear types and grades 6–8 reflect maceration, i.e. meniscal substance loss. In addition, meniscal root tears are considered separately as these are considered detrimental for joint health [26, 27]. Furthermore, meniscal extrusion was scored in the anterior and mid-joint locations from 0–3. We assessed whether there was worsening in meniscal morphology from baseline to 24 months in each of the three medial or lateral meniscal subregions. These were evaluated separately. We defined worsening as an increase in grade in at least one subregion. We further categorized worsening in meniscal morphology into number of subregions with any worsening and categorical change (i.e. from normal to tear, normal to maceration or tear to maceration). We assessed changes in meniscal extrusion and root tears separately in the medial and lateral compartments as any change vs. no change.
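A minimal Python sketch of the meniscal categories and change definitions follows. It is illustrative only: the function names and example grades are hypothetical, and grouping grade 1 (intrameniscal signal without tear) with "normal" is an assumption that the text does not state explicitly.

```python
from typing import Sequence

# Ordered categories used for categorical change.
CATEGORY_RANK = {"normal": 0, "tear": 1, "maceration": 2}


def meniscus_category(grade: int) -> str:
    """Collapse a MOAKS meniscal grade (0-8) into normal / tear / maceration."""
    if not 0 <= grade <= 8:
        raise ValueError(f"grade out of range: {grade}")
    if grade <= 1:
        # Assumption: grade 1 (signal only, no tear) is grouped with normal.
        return "normal"
    return "tear" if grade <= 5 else "maceration"


def meniscus_worsening(baseline: Sequence[int], followup: Sequence[int]) -> dict:
    """Compare the three subregions of one meniscus between two time points."""
    pairs = list(zip(baseline, followup))
    steps = [
        CATEGORY_RANK[meniscus_category(f)] - CATEGORY_RANK[meniscus_category(b)]
        for b, f in pairs
    ]
    return {
        "n_subregions_worse": sum(1 for b, f in pairs if f > b),
        "any_worsening": any(f > b for b, f in pairs),
        # 1 = normal to tear or tear to maceration, 2 = normal to maceration
        "max_category_steps": max(0, max(steps)),
    }


# Made-up grades for the three subregions of a medial meniscus.
print(meniscus_worsening([0, 1, 3], [0, 6, 3]))
```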

Hoffa-Synovitis and Effusion-synovitis ― As MRI markers of inflammation, so-called effusion- and Hoffa-synovitis are evaluated in MOAKS. Hoffa-synovitis is a term used for signal changes in Hoffa’s fat pad that are commonly used as a surrogate for synovitis on non-contrast-enhanced MRI [22]. Effusion-synovitis is scored from 0 to 3 according to the distention of the joint capsule as 1 = small, 2 = moderate and 3 = large. Hoffa-synovitis is scored based on the amount of hyperintensity signal in Hoffa’s fat pad on sagittal fat suppressed intermediate-weighted sequences as 1 = mild, 2 = moderate and 3 = severe. Frequencies of baseline Hoffa- and effusion synovitis are presented. 24-months changes in Hoffa-synovitis and effusion-synovitis are assessed separately and categorized as improvement, no change, or worsening.

Analytic approach ― Descriptive statistics are used to report frequencies for the different features and parameters for baseline and change over time. Data are presented for the entire sample and for those knees with and those without radiographic OA. The Mann–Whitney U test was applied to describe differences between knees without radiographic OA (i.e. KL 0 and 1) vs. those with radiographic OA (i.e. KL 2–4). For some features, raw distributions were grouped into categories as described above. In these instances, descriptive statistics are presented for both raw and categorical versions of features. For the longitudinal analyses, only those knees with complete and available baseline and 24-months data for the respective feature were included. Weighted kappa statistics were applied to determine inter- and intra-observer reliability for baseline and change over time. All analyses were conducted using SPSS 27 (IBM Corporation, Armonk, NY).
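For readers unfamiliar with the weighted kappa statistic, the following Python sketch shows how it can be computed for two sets of ordinal readings. It is not the study's implementation (SPSS was used), and the choice of linear weights is an assumption, since the weighting scheme is not specified here; the function name and example grades are hypothetical.

```python
from typing import Sequence


def weighted_kappa(reader1: Sequence[int], reader2: Sequence[int], n_categories: int) -> float:
    """Linearly weighted kappa for ordinal grades coded 0 .. n_categories - 1."""
    n = len(reader1)
    if n == 0 or n != len(reader2) or n_categories < 2:
        raise ValueError("need two equally long, non-empty rating lists and >= 2 categories")
    k = n_categories

    # Joint distribution of the two readings as proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(reader1, reader2):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weights: 0 on the diagonal, 1 for maximal disagreement.
    def weight(i: int, j: int) -> float:
        return abs(i - j) / (k - 1)

    obs_dis = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp_dis = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp_dis == 0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return 1.0 - obs_dis / exp_dis


# Made-up grades (0-3) from two readings of four knees.
print(weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], n_categories=4))
```

With weighting, a disagreement by one grade is penalized less than a disagreement by two or three grades, which suits ordinal scales such as MOAKS.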

Results

Demographics

Of the 297 IMI-APPROACH participants, 289 had a readable baseline scan and at least one feature assessable (cartilage: n = 286, BML: n = 289, osteophytes: n = 285, meniscus: n = 278, inflammation: n = 287). There were 223 women (77.2%). Participants were on average 66.6 ± 7.1 years old and had a body mass index (BMI) of 28.1 ± 5.3 kg/m². Mean knee injury and osteoarthritis outcome score (KOOS) symptom score was 69.5 ± 17.2, mean KOOS pain score was 66.4 ± 18.8 and mean KOOS function score was 69.1 ± 19.9. Mean numeric rating scale (NRS) pain score was 4.6 ± 2.7. A considerable proportion of the knees had no definite radiographic OA (44.6%; KL 0: n = 52, KL 1: n = 77), but the majority (55.3%) of the knees had definite signs of radiographic OA (KL 2: n = 65, KL 3: n = 84, KL 4: n = 11). Medial joint space narrowing (JSN) was more frequent (47.8%) than lateral JSN (16.3%). Additional baseline characteristics of the cohort are presented in Table 1 and have been reported in detail previously [13].

Table 1 Demographic Characteristics of the Study Sample

Reliability

Summarizing the intra- and inter-reader results for the baseline assessment, all of the measures showed at least substantial agreement ranging between 0.71 for maximum cartilage area extent on a knee level (intra-reader) and 1.00 for several features. Change was relatively rare and the reliability results of longitudinal data showed larger variation. Tables 2 and 3 give a detailed overview of the cross-sectional and longitudinal reliability results. Appendix 2 reports the frequencies of change for the reliability readings.

Table 2 Intra- and Inter-reader Reliability APPROACH MOAKS Assessment (Baseline)
Table 3 Intra- and Inter-reader Reliability APPROACH MOAKS Assessment (Change)

Cartilage

Regarding baseline frequencies for cartilage, 4.9% of knees had a maximum baseline cartilage score (area extent) of 1, 52.1% of 2 and 40.6% of 3. Only 2.4% did not have any cartilage damage in any of the three compartments. In the radiographic OA (ROA) subgroup, markedly more knees showed higher-grade cartilage damage compared to those knees without ROA (p < 0.0001). Regarding the full-thickness component of the MOAKS cartilage score, the respective numbers were 24.8% (grade 0), 16.4% (grade 1), 41.3% (grade 2) and 17.5% (grade 3). Details including number of subregions per knee and compartment affected by cartilage damage at baseline and differences between the ROA and no ROA subgroups are presented in Tables 4 and 5 and Appendix 3. Any change in total cartilage MOAKS score was seen in 53.1% of the entire sample considering only full-grade changes and in 73.9% including full-grade and within-grade changes. Any medial cartilage progression was seen in 23.9% and any lateral progression in 22.1%, while any change in the patellofemoral joint (PFJ) was observed in 25.7%. Detailed results of change in cartilage for the number of subregions showing any increase (full-grade increase and full-grade plus within-grade increase) in total MOAKS score are presented in Table 6, and separately for the area-extent and full-thickness dimensions in Appendices 4 and 5.

Table 4 Baseline cartilage damage (area extent score)
Table 5 Baseline cartilage damage (full thickness score)
Table 6 Cartilage damage change – any MOAKS worsening (baseline to 24 months)

BMLs

BMLs were observed in 77.5% of all knees at baseline, with 31.5% having a maximum score of 1, 27.0% a maximum score of 2 and 19% a maximum score of 3. BMLs were more commonly observed medially (35.3%) compared to the lateral compartment (23.5%), but were most prevalent in the patellofemoral joint (57.4%). The proportion of knees with any BMLs was markedly higher in the ROA (89.4%) compared to the no ROA subgroup (62.8%). The detailed results for baseline BMLs including the comparison between ROA and no ROA knees are shown in Table 7. The change in number of subregions affected by BMLs over time ranged from −4 to +4, reflecting the fluctuation of BMLs, with the majority of knees having the same number of subregions affected by BMLs at baseline and follow-up (60.3%). While for the medial and lateral compartments numbers of subregions with improvement and worsening were similar (9.9% and 9.5% medial, 7.3% and 5.6% lateral), for the PFJ more improvement was observed compared to worsening (15.5% vs. 9.0%), albeit not statistically significant. Including within-grade changes, the number of knees showing BML worsening increased from 42.2% to 55.6%. More details on BML change are presented in Table 8 and Appendices 6, 7 and 8.

Table 7 Baseline frequencies of bone marrow lesions
Table 8 BML change overview baseline to 24 months follow-up

Osteophytes

The large majority of knees exhibited osteophytes at baseline, with 45.3% having a maximum grade of 1, 24.2% a maximum grade of 2 and 20.4% a maximum grade of 3, as shown in more detail in Table 9. Osteophyte worsening was relatively infrequent, with 20% showing an increase in number of locations affected by any osteophyte. An increase in osteophyte size by one grade was seen in 19.1% of knees and by two grades in 0.9% (Appendix 9).

Table 9 Baseline Frequencies of Osteophytes

Meniscus

Regarding baseline meniscal pathology (Table 10), 51.1% of knees had any damage in the medial compartment, and 23.4% had damage in the lateral compartment. Meniscal tears were seen in 17.6% medially and 10.4% laterally, and any meniscal maceration was seen in 33.5% medially and 12.9% laterally. Root tears were rare (3.2% medially and 0.7% laterally). Meniscal extrusion grade 2 or 3 was detected in 33.8% medially and 9% laterally. Change in meniscal damage was rare with 3.2% showing change in one category (from normal to tear or tear to maceration) and 0.9% in two categories (normal to maceration). Any increase in extrusion was seen in 10.9% medially and 2.3% laterally (Appendix 10).

Table 10 Baseline Frequencies of Meniscus Pathology

Inflammation

Concerning inflammatory features of OA at baseline, 33.4% had no Hoffa-synovitis, 49.8% had grade 1 Hoffa-synovitis and 16.7% had grade 2 or 3 Hoffa-synovitis. Effusion-synovitis was seen in 30.3% (grade 1), 11.5% (grade 2) and 4.2% (grade 3), while 54.0% did not have any effusion-synovitis at baseline. Regarding change in inflammation, 7% showed improvement and 7.8% worsening of Hoffa-synovitis, while for effusion-synovitis these numbers were 10% and 22%, respectively. Details of baseline and change characteristics of inflammatory features of OA are presented in Table 11.

Table 11 Hoffa- and effusion-synovitis – baseline and change over 24 months

Discussion

We presented baseline data and change over 24-months follow-up of SQ-assessed MRI features including cartilage, BMLs, osteophytes, meniscal pathology and inflammatory features of OA in the IMI-APPROACH study. We found a wide range of structural pathologies at baseline and substantial and varying change of features over the two-year follow-up period. Knees with established radiographic OA showed more baseline pathologies and more worsening of structural tissue damage over time compared to knees without radiographic OA.

The IMI-APPROACH cohort was specifically designed to include patients with knees likely to show structural or symptomatic progression over a two-year follow-up period. Participants were recruited primarily from existing cohorts and machine-learning models were applied to estimate risk of progression. This study is a descriptive overview of structural tissue pathology and longitudinal change. We did not attempt to show superiority of SQ MRI assessment over other methods. IMI-APPROACH employs a multitude of imaging methods including but not limited to radiographic parameters of knee OA severity, quantitative MRI parameters for cartilage including thickness and volume, SQ MRI scoring of cartilaginous and non-cartilaginous tissues, advanced radiographic parameters such as bone shape analyses and subchondral bone architecture, and high-resolution CT characterizing OA-related bone and trabecular adaptations [13]. The aim of IMI-APPROACH was not to compare these different methods but rather to use their complementary information to help reach the overarching aim of defining different subtypes of OA, which hopefully will result in a more targeted, personalized treatment approach in the future.

When comparing the IMI-APPROACH data to the FNIH study, another cohort designed to analyze different biomarkers (including imaging) predicting structural or symptomatic progression, we found that in IMI-APPROACH fewer knees showed worsening in BMLs but a higher number of knees showed progression in the cartilage full-thickness dimension [28]. In IMI-APPROACH 42% of knees showed worsening of BMLs in any subregion (59% including within-grade changes) while this number was 73% for the cases and 66% in the control group in the FNIH study (also including within-grade changes). Regarding cartilage damage worsening in FNIH, 59% of subjects had at least one subregion with worsening in the area extent dimension of MOAKS including within-grade changes (52% controls vs. 73% cases), while 42% of subjects (24% controls vs. 58% cases) had at least one area with worsening in thickness (considering full-grade changes only). In IMI-APPROACH these numbers were 46% (29% no radiographic OA vs. 62% radiographic OA cases) for area extent (including within-grade changes) and similarly 46% (28% no radiographic OA vs. 72% radiographic OA cases) for full-thickness changes (full-grade changes only). Regarding inflammatory features of OA, in FNIH 10% of subjects experienced worsening of Hoffa-synovitis with more cases experiencing worsening than controls (17% vs. 6%). In IMI-APPROACH this number was similar with 8% overall and 6% for the no radiographic OA subgroup and 10% for radiographic OA cases. In FNIH, the effusion-synovitis score worsened in 41% of cases compared to 18% of controls. In IMI-APPROACH, this number was 22% for the entire sample (19% for the no radiographic OA vs. 26% for the radiographic OA subgroups) [28]. The other analyzed parameters (e.g. meniscus and osteophytes) showed little change in both cohorts. Of note, including within-grade assessment increased the number of knees showing change in cartilage and BML parameters.
While the clinical validity of within-grade assessment has been shown previously, it was recently shown that knees with within-grade changes have larger quantitative cartilage loss compared to those not showing any SQ cartilage change [29]. Both FNIH and IMI-APPROACH focus on progression in symptoms and/or structure, but they differ in regard to patient selection. While FNIH used an a priori definition of progression based on pain and increase in joint space narrowing (67% of knees showing either increase in pain, increase in joint space narrowing or both) and applied a retrospective analysis of a prospectively acquired dataset, the IMI-APPROACH project worked with prediction models based on machine learning and existing cohorts. Furthermore, IMI-APPROACH included a larger number of patients without radiographic OA (45%) while in FNIH only a small subset did not have radiographic OA (12.5%) [8].

Recently, a report from the MOST study focusing on KL grade 2 and 3 knees described cross-sectional frequencies of cartilage damage, with a focus on the spectrum of disease and the variability of cartilage damage ranging from no damage to severe widespread damage [30]. In that study, 665 knees were included from participants with comparable demographics to our study. Widespread full-thickness cartilage damage was seen in 79% of all knees (68% of KL2 and 94% of KL3 knees). In IMI-APPROACH, widespread full-thickness damage in at least one of the MOAKS subregions (i.e. MOAKS 3.2 and 3.3) was seen in 33% of all knees, in 6% of knees without ROA and in 55% of knees with ROA. The additional compartmental analyses in IMI-APPROACH were performed for the area extent and full-thickness dimensions separately. Any baseline full-thickness damage of grade 2 or grade 3 was seen in 41% and 18% of the entire cohort, respectively.

The machine-learning-based predicted structural progression probability score, which was used for enrollment of participants in the IMI-APPROACH project, was not part of our analysis as the current study focused descriptively on the baseline frequencies and change over time [15]. Prediction of progression will be a focus of additional work. A recent report found no associations between predicted s-score and actual observed quantitative cartilage thickness loss [31].

Reliability analysis was performed on 20 knees for a spectrum of structural disease severity in cross-sectional and longitudinal fashion. Longitudinal reliability has rarely been described for SQ scoring [32]. While reported values for cross-sectional assessment were in the range of expectation for very experienced readers as in the current study, the longitudinal values are highly influenced by the prevalence of observed change. For this reason, these values have to be interpreted with caution and we have presented actual change frequencies in addition for better interpretability.
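The dependence of kappa on the prevalence of change can be illustrated with a small, entirely hypothetical example. The Python sketch below uses the unweighted Cohen's kappa on a binary change/no-change reading of 20 knees; the ratings are made up and do not correspond to the reliability data of this study.

```python
from typing import Sequence


def cohen_kappa_binary(reader1: Sequence[int], reader2: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for binary ratings (1 = change, 0 = no change)."""
    n = len(reader1)
    if n == 0 or n != len(reader2):
        raise ValueError("need two equally long, non-empty rating lists")
    p_observed = sum(1 for a, b in zip(reader1, reader2) if a == b) / n
    p1 = sum(reader1) / n  # proportion rated "change" by reader 1
    p2 = sum(reader2) / n  # proportion rated "change" by reader 2
    p_expected = p1 * p2 + (1 - p1) * (1 - p2)
    if p_expected == 1:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (p_observed - p_expected) / (1 - p_expected)


# Scenario A (hypothetical): change is rare; the readers disagree on 2 of 20 knees.
rare_1 = [1] + [0] * 19
rare_2 = [0, 1] + [0] * 18

# Scenario B (hypothetical): change is common; the readers again disagree on 2 of 20 knees.
common_1 = [1] * 11 + [0] * 9
common_2 = [1] * 10 + [0, 1] + [0] * 8

for label, r1, r2 in [("rare change", rare_1, rare_2), ("common change", common_1, common_2)]:
    agreement = sum(1 for a, b in zip(r1, r2) if a == b) / len(r1)
    print(label, "agreement:", agreement, "kappa:", round(cohen_kappa_binary(r1, r2), 3))
```

Both scenarios have the same raw agreement (18 of 20 knees), yet kappa is close to zero when change is rare and high when change is common, which is why raw change frequencies are useful next to the kappa values.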

Defining change using SQ approaches is challenging, as there are multiple possible definitions, including subregional and maximum-grade approaches. Few studies, among them the FNIH cohort [28], have focused on longitudinal change of MRI parameters using SQ assessment. Runhaar and colleagues suggested definitions of change that are largely similar to our description, but did not incorporate the number-of-subregions approach [33]. When assessing change over time using SQ MRI approaches, scores are often presented as mean values or summed over a defined anatomical region (usually a compartment or the knee) [34, 35]. Such approaches are sub-optimal because sums are challenging to compare. For example, a sum of 5 acquired over 5 distinct subregions of a given compartment may reflect one lesion with a grade of 5 (considered severe) while the other 4 subregions do not exhibit any lesion (grade 0); alternatively, it may reflect grade 1 lesions across all 5 subregions. This is the reason why we focused on the number-of-subregions and maximum-grade approaches. More work is desirable on the prognostic implications of having widespread low-grade involvement vs. focal severe damage.
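The example above can be made concrete with a minimal sketch. The function name `summarize` and the two grade vectors are hypothetical and simply mirror the illustrative numbers in the text; they are not study data and do not represent a specific scoring instrument.

```python
from typing import Dict, Sequence


def summarize(grades: Sequence[int]) -> Dict[str, int]:
    """Summarize subregional SQ grades of one compartment in three ways."""
    return {
        "sum": sum(grades),                               # summative approach
        "n_affected": sum(1 for g in grades if g > 0),    # number of subregions
        "max_grade": max(grades, default=0),              # maximum grade
    }


# Two hypothetical compartments with five subregions each
focal_severe = [5, 0, 0, 0, 0]       # one severe lesion, four unaffected subregions
widespread_mild = [1, 1, 1, 1, 1]    # grade 1 lesions in all five subregions

for label, grades in (("focal severe", focal_severe),
                      ("widespread mild", widespread_mild)):
    print(label, summarize(grades))
```

Both compartments yield the same sum, whereas the number of affected subregions and the maximum grade separate the two patterns.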

Part of the study design was reading of MRIs in chronological order not blinded to time point, which is an established approach that increases sensitivity to change compared to blinded reading [36]. Only reading un-blinded to time point allows for the application of within-grade changes, which increases sensitivity for the detection of minor changes [23]. Analytic approaches using SQ MRI data should include the number of subregions or locations affected by tissue pathology, with further possible stratification using cut-offs related to severity of a certain feature. In addition, an approach considering maximum change over a pre-defined unit, such as a knee compartment or the entire joint, adds to the understanding of the amount of change observed, which may be lost using a summative approach.

Our study has several limitations that need mentioning. Firstly, we presented the SQ MRI data in purely descriptive fashion and did not analyze prediction regarding presence of baseline features or concurrent change and subsequent structural or clinical outcomes, which will be the focus of future work. Secondly, due to the wide range of structural disease severity, the findings of the IMI-APPROACH study are not easily translatable to other datasets. Thirdly, two of the centers used 1.5 T MRI systems while the others employed 3 T systems. There are no data available regarding a direct comparison of SQ scoring of knee OA using 1.5 T vs. 3 T systems. Most of the available literature focused on assessment in the context of knee trauma and did not find marked differences [37, 38, 39]. One study compared a 1.0 T extremity system with a 1.5 T standard system regarding SQ knee OA assessment and found very comparable results [40]. While we cannot rule out that the image quality on the 3 T systems may have been slightly superior, an omission of relevant joint pathology due to the lower field strength at 1.5 T seems highly unlikely. Finally, we focused on SQ MRI assessment only and did not analyze correlations or concurrent changes with other measures of progression such as radiography or quantitative MRI.

Conclusions

In summary, a wide range of MRI-detected structural pathologies was present in the IMI-APPROACH cohort. More severe changes, especially for BMLs, cartilage and meniscal damage, were detected primarily among the ROA group, suggesting that once disease is structurally established it is more likely to progress than pre-radiographic OA. The role of structural predictors of progression that are also potential therapeutic targets for cartilage-anabolic or anti-catabolic approaches, anti-inflammatory agents or compounds targeting subchondral bone changes should be the focus of further evaluation. In addition, the complexity of the different SQ scoring systems needs to be considered when engaging in analyses focusing on change over time.