Background

Polycystic ovary syndrome (PCOS) is the most common endocrinopathy in women of reproductive age with an estimated prevalence of 8–13% [1]. Its pathogenesis includes insulin resistance and hyperandrogenism which drive the reproductive (menstrual dysfunction, infertility), metabolic (metabolic syndrome, diabetes, cardiovascular risk factors), and psychological (anxiety, depression, low quality of life) complications [2]. Given the high prevalence and diverse features across the lifespan, as well as the high prevalence of obesity which further exacerbates its clinical features, PCOS contributes to the global burden of disease [3]. It is therefore imperative to recognize the condition early to facilitate interventions and prevent complications.

PCOS is a heterogeneous disorder, and its diagnosis is difficult and often delayed [4]. Diagnosis is based on oligo-anovulation (OA), biochemical or clinical hyperandrogenism (HA), and polycystic ovary morphology (PCOM) on ultrasound. Successive criteria combine these features differently: the original 1990 National Institutes of Health (NIH) criteria (OA and HA) [5], the 2003 Rotterdam criteria (any two of OA, HA, and PCOM) [6], and the Androgen Excess and Polycystic Ovary Syndrome (AE-PCOS) Society criteria (HA with OA, PCOM, or both) [7]. The Rotterdam criteria are now widely accepted and generate four possible diagnostic PCOS phenotypes in adult women: (A) OA + HA + PCOM, (B) OA + HA, (C) HA + PCOM, and (D) OA + PCOM [6]. The Rotterdam criteria are recommended and endorsed by the 2018 international PCOS evidence-based guideline, which was co-developed, using unprecedented evidence synthesis and best practice methods, by world-leading multidisciplinary clinicians and researchers across 37 societies from 71 countries, with consumer engagement [8].
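The logic of the three sets of criteria described above can be written out explicitly. The following is a minimal illustrative sketch, not part of the study's methods: the function names are hypothetical, each feature is assumed to have already been reduced to a yes/no value, and the exclusion of secondary causes that any clinical diagnosis requires is not modeled.

```python
from typing import Optional


def nih_1990(oa: bool, ha: bool, pcom: bool) -> bool:
    """NIH 1990: oligo-anovulation and hyperandrogenism (PCOM is not used)."""
    return oa and ha


def rotterdam_2003(oa: bool, ha: bool, pcom: bool) -> bool:
    """Rotterdam 2003: any two of the three features."""
    return (int(oa) + int(ha) + int(pcom)) >= 2


def ae_pcos(oa: bool, ha: bool, pcom: bool) -> bool:
    """AE-PCOS Society: hyperandrogenism with OA, PCOM, or both."""
    return ha and (oa or pcom)


def rotterdam_phenotype(oa: bool, ha: bool, pcom: bool) -> Optional[str]:
    """Return the Rotterdam phenotype letter (A to D), or None if criteria are not met."""
    if oa and ha and pcom:
        return "A"
    if oa and ha:
        return "B"
    if ha and pcom:
        return "C"
    if oa and pcom:
        return "D"
    return None


# Example: irregular cycles and polycystic ovaries without hyperandrogenism.
example = (True, False, True)
print(rotterdam_phenotype(*example), nih_1990(*example), ae_pcos(*example))
```

In this sketch, phenotype D meets the Rotterdam criteria only, whereas phenotypes A and B meet all three sets of criteria.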

PCOS is more challenging to diagnose in adolescents, as menstrual irregularity and multi-follicular ovaries are part of normal pubertal physiology and the application of adult criteria results in a high prevalence and may over-diagnose PCOS [9, 10]. Available recommendations on adolescent PCOS diagnostic criteria are inconsistent. The 2018 international PCOS guideline updated the Rotterdam criteria and now recommends applying OA and HA while avoiding PCOM for PCOS diagnosis in adolescents [8]. However, this evidence-informed recommendation was ultimately based on expert consensus with limited evidence on the most accurate diagnostic approach in adolescents and on the natural history of PCOS phenotypes over time. It remains unclear if the 2018 updated Rotterdam criteria capture adolescents with PCOS who are at the greatest risk of long-term complications and would benefit the most from lifestyle preventative interventions [8].

Long-term weight gain is a major health concern for women with PCOS and a key pathophysiological contributor to PCOS severity [4]. More than 60% of women with PCOS are above healthy body mass index (BMI), exacerbating metabolic, reproductive, and psychological features of PCOS [11, 12]. These effects can be ameliorated by 5–10% weight loss, and lifestyle intervention to prevent weight gain and promote weight loss is therefore the cornerstone of PCOS management [8, 13]. However, existing studies examining the natural history of weight gain in women with PCOS are limited to largely clinic-based adult populations and women with self-reported PCOS and do not differentiate across various PCOS diagnostic criteria or phenotypes [14,15,16,17,18,19,20,21,22].

To address research priorities and evidence gaps, the aims of the present study in an unselected adolescent population were threefold. Firstly, we aimed to examine the impact of the original 2003 versus the 2018 updated Rotterdam criteria on the prevalence of PCOS diagnosis. Secondly, we aimed to examine the natural history of BMI trajectories in women with and without PCOS from birth until young adulthood, applying both the original and updated adolescent Rotterdam criteria. Thirdly, we aimed to determine BMI trajectories across adolescent phenotypes.

Methods

Study design and setting

The Raine Study is a prospective cohort study aiming to investigate the influences of familial, intrauterine, perinatal, and environmental factors on health across the lifespan [23]. Pregnant women between 16 and 20 weeks of gestation who attended public and private antenatal clinics in Western Australia were recruited from 1989 to 1991. More than 2900 pregnant women enrolled in the study, resulting in 2868 live births [23]. To date, the cohort has been followed up for more than 20 years, with more than 70% of the participants still engaged in the study [23]. Data were collected from four generations (mothers and partners originally recruited into the study (Gen1), Raine Study participants (Gen2), offspring of the participants (Gen3), and grandparents of the participants (Gen0)) in the form of surveys, physical examination, and clinical laboratory testing. Between ages 14 and 16, 723 postmenarchal Gen2 adolescent females were invited to participate in the Menstruation in Teenagers Study, which involved the collection of a self-reported menstrual diary, urinary progesterone analysis, clinical assessment of hirsutism and acne, biochemical measurement of androgen profile, and ultrasound evaluation of ovarian follicles [24]. A total of 244 Gen2 females consented to participate, and their mean age at assessment was 15.2 years [9, 24]. All follow-up assessments were approved by the ethics committees of King Edward Memorial Hospital and/or Princess Margaret Hospital. Further details of the Raine Study are available at www.rainestudy.org.au.

Outcomes

The primary outcomes of this study were PCOS prevalence and BMI, calculated at each follow-up assessment as weight in kilograms divided by the square of height in meters. Participants’ anthropometric data were measured at birth and at ages 1, 2, 3, 5, 8, 10, 14, 16, 20, and 22 by trained research assistants using standardized protocols [25]. Anthropometric data collection was limited to 600 participants of the entire cohort at age 2 due to limitations in funding [25]. At birth and age 1, length was measured by two people in a supine position using the Harpenden Neonatometer, to the nearest 0.1 cm [25]. From age 2 onwards, height was measured using a Holtain stadiometer in an anatomical position with shoes off and heels, bottom, and head against a board [25]. Weight was measured with light clothing (running shorts and singlet top) to the nearest 100 g using calibrated hospital scales at birth and Wedderburn digital chair scales from age 1 onwards [25].
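For clarity, the BMI calculation can be sketched as follows. This is an illustrative example only; the function name and the choice of centimeters as the input unit for height are assumptions made here, not details taken from the study protocol.

```python
def bmi_kg_m2(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by the square of height (m)."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # stadiometer readings are in cm
    return weight_kg / height_m ** 2


# Hypothetical participant: 56.0 kg and 160.0 cm.
print(round(bmi_kg_m2(56.0, 160.0), 1))
```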

Exposure

The primary exposure was PCOS, with the diagnosis ascertained using the original Rotterdam criteria (two out of three clinical features) and the updated Rotterdam criteria (OA and HA) (see Table 1 for phenotypes). OA was assessed by a combination of menstrual diary and 12-weekly urinary progesterone metabolite PdG analyses [9, 24]. OA was defined as a menstrual cycle length of less than 21 or more than 35 days (as the time of assessment was approximately 3 years postmenarche), a cycle length that varied by more than 4 days, or a urinary PdG to creatinine ratio of less than three times baseline secretion in at least two of the months assessed [9, 24]. Secondary causes of menstrual irregularity such as thyroid disorders and hyperprolactinemia were excluded in all participants [9, 26]. Androgen profile was measured during the early follicular phase (days 2 to 6 of the menstrual cycle) between 15:30 and 16:30 to account for the diurnal variation of androgen production and to fit in with the participants’ school commitments [9, 24]. Total testosterone was measured using a double-antibody radioimmunoassay (DSL-4100, Beckman, Australia: lower limit of sensitivity 0.347 nmol/L; to convert to conventional units, divide by 0.0347 for nanograms per deciliter; intraassay and interassay coefficients of variation 6% and 15% at the 1 nmol/L concentration, respectively); sex hormone-binding globulin (SHBG) was measured using a non-competitive liquid-phase immunoradiometric assay (SHBG-IRMA kit; Orion Diagnostica, Espoo, Finland: lower limit of sensitivity 1.3 nmol/L, interassay and intraassay coefficients of variation 2.0 to 8.6% and 15.4%, respectively) [9, 24, 26].
Biochemical HA was defined as the top 25% of circulating free testosterone concentrations in this data set (calculated using the Vermeulen equation based on total testosterone and SHBG concentrations; to convert to conventional units, divide by 0.0347 for picograms per deciliter), which corresponded to at least 24.45 pmol/L [9, 24, 26]. A trained nurse evaluated clinical HA, with hirsutism defined as a modified Ferriman-Gallwey score of ≥ 8 [9, 24]. PCOM was determined using adult criteria (defined as ≥ 1 ovary ≥ 10 cm3 in volume or ≥ 12 follicles between 2 and 9 mm diameter) [27] and evaluated using transabdominal ultrasound with a full bladder during the early follicular phase [9, 24, 26, 28]. All ultrasounds were performed by one of two experienced gynecological ultrasonographers, and the images were evaluated by one expert radiologist. Either a 5–2-MHz transducer (U22; Philips Medical Systems, Bothell, WA) or a 4-MHz transducer (Voluson 730 Expert; General Electric, Milwaukee, WI) was used. The uterine and ovarian volumes were estimated using the formula 0.523 × length × width × height of the organ [29, 30]. Antral follicles were defined as follicles < 10 mm in diameter. Follicular number was assessed by scanning each ovary from the inner margin to the outer margin in a longitudinal cross-sectional scanning plane. If a follicle ≥ 10 mm was seen, the ultrasound was repeated in the early follicular phase of the next cycle [9, 24, 26, 28].
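The feature definitions above can be combined into the two diagnostic rules compared in this study. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: all function and variable names are hypothetical; "varied by more than 4 days" is interpreted as the range (longest minus shortest cycle) exceeding 4 days; the urinary PdG component of OA is omitted; and follicle counts are taken to be the number of 2 to 9 mm follicles per ovary, with either ovary sufficient for PCOM.

```python
from typing import Dict, Sequence

FREE_T_CUTOFF_PMOL_L = 24.45      # cohort-specific top 25% cut-off reported in the text
MFG_CUTOFF = 8                    # modified Ferriman-Gallwey score for hirsutism
OVARIAN_VOLUME_CUTOFF_CM3 = 10.0  # adult PCOM volume threshold
FOLLICLE_COUNT_CUTOFF = 12        # adult PCOM follicle threshold (2 to 9 mm)


def ovarian_volume_cm3(length_cm: float, width_cm: float, height_cm: float) -> float:
    """Ellipsoid approximation: 0.523 x length x width x height."""
    return 0.523 * length_cm * width_cm * height_cm


def has_pcom(volumes_cm3: Sequence[float], follicle_counts: Sequence[int]) -> bool:
    """PCOM if at least one ovary meets the volume or the follicle threshold."""
    large = any(v >= OVARIAN_VOLUME_CUTOFF_CM3 for v in volumes_cm3)
    many = any(n >= FOLLICLE_COUNT_CUTOFF for n in follicle_counts)
    return large or many


def has_ha(free_t_pmol_l: float, mfg_score: int) -> bool:
    """HA if either the biochemical or the clinical threshold is met."""
    return free_t_pmol_l >= FREE_T_CUTOFF_PMOL_L or mfg_score >= MFG_CUTOFF


def has_oa_from_cycles(cycle_lengths_days: Sequence[int]) -> bool:
    """Cycle-based part of the OA definition only (PdG criterion omitted)."""
    if not cycle_lengths_days:
        raise ValueError("at least one cycle length is required")
    out_of_range = any(d < 21 or d > 35 for d in cycle_lengths_days)
    # Assumed interpretation of "varied by more than 4 days".
    variable = max(cycle_lengths_days) - min(cycle_lengths_days) > 4
    return out_of_range or variable


def diagnose(oa: bool, ha: bool, pcom: bool) -> Dict[str, bool]:
    """Original Rotterdam (any two of three) versus updated criteria (OA and HA)."""
    return {
        "original_rotterdam": (int(oa) + int(ha) + int(pcom)) >= 2,
        "updated_rotterdam": oa and ha,
    }


# Hypothetical adolescent: regular cycles, raised free testosterone, one enlarged ovary.
left = ovarian_volume_cm3(4.0, 3.0, 2.0)
right = ovarian_volume_cm3(3.0, 2.0, 2.0)
pcom = has_pcom([left, right], [9, 8])
ha = has_ha(free_t_pmol_l=26.0, mfg_score=3)
oa = has_oa_from_cycles([28, 30, 29])
result = diagnose(oa, ha, pcom)
print(result)
```

This hypothetical participant has the HA + PCOM phenotype and would therefore be counted as PCOS under the original criteria but not under the updated criteria.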

Table 1 PCOS phenotypes of each comparison group

Covariates

Maternal antenatal information was collected by researchers from maternal medical records. Age at menarche, education level, employment status, smoking status, relationship status, family income, and personal income were collected from surveys at multiple follow-up points. Physical activity was assessed using the International Physical Activity Questionnaire (IPAQ) [31] at the age 20 and 22 follow-ups, and the corresponding metabolic equivalent (MET)-minutes per week were computed. Energy intake (kcal) was assessed at age 14 via 3-day food diaries, with clarification through a follow-up phone call by a dietitian [32].

Statistical analysis

Analyses were restricted to Gen2 participants in the Menstruation in Teenagers Study who were not on combined oral contraceptive pills and who had a complete assessment of PCOS features. Participants’ characteristics and anthropometry were summarized using means and standard deviations (SD) or medians and interquartile ranges (IQR) for continuous variables and frequencies for categorical variables. Differences between PCOS status subgroups were assessed using the independent t test, Fisher’s exact test, Pearson χ2 test, or Mann-Whitney test as appropriate. Cross-sectional analysis of group-level BMI by PCOS status at each follow-up time point was summarized using means and SD and compared using independent t tests. Longitudinal analysis of BMI was performed using generalized estimating equations (GEE) (Gaussian family, identity link, exchangeable correlation structure), which account for between-subject and within-subject relationships, as well as incomplete follow-up data. BMI change over time was assessed by including PCOS status by time or PCOS phenotypes (non-PCOS, OA + HA, HA + PCOM, OA + PCOM) by time as interaction terms. Covariates were included in the multivariable GEE model if they exhibited a p value of < 0.1 in the univariate model. The final included covariates were age of menarche, family income at age 14, and smoking and marital status at age 22. BMI trajectories were compared between participants with and without PCOS and across PCOS phenotypes. Stata software version 15 (StataCorp, College Station, TX) was used for statistical analysis.
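One way to write the longitudinal model described above is given below. This is a sketch under stated assumptions rather than the exact specification used: follow-up age is assumed to enter as a categorical variable with a reference age t_0 (the tables report change from a baseline of age 1), and the symbols are introduced here for illustration. BMI_{ij} is the BMI of participant i at visit j, a_{ij} is the age at that visit, PCOS_i is the diagnostic indicator, and x_i holds the covariates.

```latex
% Marginal mean model with a PCOS-by-time interaction (illustrative sketch).
% Assumption: age is categorical with reference level t_0.
\begin{align*}
\mathrm{E}\left[\mathrm{BMI}_{ij}\right]
  &= \beta_0
   + \sum_{t \neq t_0} \beta_t \, I(a_{ij} = t)
   + \gamma \, \mathrm{PCOS}_i
   + \sum_{t \neq t_0} \delta_t \, \mathrm{PCOS}_i \, I(a_{ij} = t)
   + \mathbf{x}_i^{\top} \boldsymbol{\theta}, \\
\operatorname{Corr}\left(\mathrm{BMI}_{ij}, \mathrm{BMI}_{ik}\right)
  &= \rho \quad \text{for } j \neq k .
\end{align*}
```

Under this parameterization, each \delta_t is the difference between participants with and without PCOS in BMI change from the reference age to age t, and a joint Wald test of all \delta_t = 0 would assess overall differences in trajectory. For the phenotype analysis, the single indicator PCOS_i would be replaced by one indicator per phenotype, with non-PCOS as the reference group.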

Results

Participants characteristics and prevalence of PCOS

Of the 244 participants, 17 were excluded due to oral contraceptive pill usage (n = 12) and incomplete PCOS assessment (n = 5). The remaining 227 were included in our analysis, and the dataset contained a total of 1909 anthropometric measurements allowing BMI calculation from birth until age 22. The median number of anthropometric measurements per participant was 9 (range 4–10).

PCOS was diagnosed in 66 participants (prevalence of 29.1%, 95% confidence interval (CI) 23.5–35.4%) using the original Rotterdam criteria and 37 participants (prevalence of 16.3%, 95% CI 12.0–21.7%) using the 2018 updated Rotterdam criteria (Table 2, Table 1). Participants with and without PCOS on both diagnostic criteria had similar antenatal history, gestational age at delivery, birth measurements, age of menarche, family income, personal income, relationship status, and smoking status. While daily energy intake at age 14 appeared similar for participants with and without PCOS on either diagnostic criteria, participants with PCOS based on the original Rotterdam criteria were less physically active than participants without PCOS (Table 2). The prevalence of participants in each BMI category at age 22 among participants with and without PCOS was similar using the original Rotterdam criteria but different among the subgroups in the 2018 updated adolescent Rotterdam criteria (Table 2).
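The prevalence estimates and their confidence intervals can be approximately reproduced from the counts given above. The text does not state which interval method was used, so the Wilson score interval below is an assumption chosen for illustration; the function name is hypothetical, and other methods (exact, Agresti-Coull, Jeffreys) give slightly different bounds.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default: 95% level)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts from the text: 66 and 37 diagnoses among 227 participants.
for label, cases in (("original Rotterdam", 66), ("updated Rotterdam", 37)):
    low, high = wilson_interval(cases, 227)
    print(f"{label}: {cases / 227:.1%} ({low:.1%} to {high:.1%})")
```

The resulting intervals are close to, though not necessarily identical with, those reported above, as expected when the interval method differs.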

Table 2 Participants’ characteristics

Cross-sectional BMI differences by diagnostic criteria

Table 3 details the participants’ mean BMI over time. Using the original Rotterdam criteria, participants with and without PCOS had similar BMI in childhood, but BMI in participants with PCOS was significantly greater than in those without PCOS from age 14 onwards (BMI at age 14, 22.8 ± 4.4 kg/m2 vs 21.0 ± 3.4 kg/m2, p = 0.003). Using the updated 2018 adolescent Rotterdam criteria, the divergence of BMI occurred earlier, with participants with PCOS having higher BMI than participants without PCOS from age 5 onwards (BMI at age 5, 16.3 ± 2.0 kg/m2 vs 15.6 ± 1.4 kg/m2, p = 0.013).

Table 3 Cross-sectional comparison of BMI at each time point

Longitudinal BMI change by diagnostic criteria and phenotypes

Table 4 shows the longitudinal models of BMI change over time stratified by PCOS status, adjusted for age of menarche, family income at age 14, smoking status at age 22, and marital status at age 22. Using the original Rotterdam criteria, participants with PCOS had significantly greater BMI increase than participants without PCOS from age 14 onwards (Wald test for overall differences p < 0.001). However, on the updated 2018 Rotterdam criteria, the BMI increase was greater in PCOS than in those without PCOS from age 10 onwards (Wald test for overall differences p < 0.001).

Table 4 Difference in longitudinal BMI change from baseline (age 1) stratified by PCOS diagnostic criteria

To examine whether the updated 2018 Rotterdam criteria identified participants at risk of higher long-term BMI gain, we analyzed three PCOS phenotypes (the updated Rotterdam criteria phenotype OA + HA and the two phenotypes excluded by the updated criteria, HA + PCOM and OA + PCOM) by time as an interaction term in the longitudinal analysis (Table 5). After adjusting for age of menarche, family income at age 14, smoking at age 22, and marital status at age 22, phenotype OA + HA (updated diagnostic criteria) had a greater BMI increase from age 10 than participants without PCOS, whilst phenotypes HA + PCOM and OA + PCOM had BMI changes over time comparable to those of participants without PCOS. The adjusted predicted mean change in BMI trajectories of each PCOS phenotype and participants without PCOS is shown in Fig. 1.

Table 5 Difference in longitudinal BMI change from baseline (age 1) stratified by PCOS phenotype

Fig. 1 Predicted longitudinal BMI change over time by PCOS phenotypes

Discussion

To the best of our knowledge, the present study is the first community-based prospective cohort study to follow women with and without PCOS from birth until young adulthood, with well-characterized PCOS diagnostic features assessed in adolescence. Our data demonstrate that in the adolescent population, the 2018 international guideline updated Rotterdam criteria detected a lower prevalence of PCOS (16.3%) than the original Rotterdam criteria (29.1%). The updated criteria also identified adolescents with PCOS who had a rapidly increasing BMI trajectory, and long-term weight gain is known to increase PCOS severity. This study also provides novel insights into PCOS diagnostic phenotypes and important patterns of long-term weight gain by phenotype.

PCOS diagnosis in adolescents is controversial as the diagnostic features of OA, HA, and PCOM overlap with normal pubertal physiology [8]. The 2018 international PCOS guideline process involved comprehensive evidence synthesis and reached an evidence-informed consensus recommendation, while also highlighting evidence gaps and research priorities. Importantly, the guideline recommended that all PCOS diagnostic phenotypes should be captured in research to clarify the long-term natural history [8]. Longitudinal studies examining the natural history of PCOS are scarce, mainly limited to adults in clinical settings and to those with self-reported PCOS or BMI status [14,15,16,17,18,19,20,21,22]. An important and large population-based Northern Finland Birth Cohort study has reported that women with PCOS have earlier adiposity rebound (the second rise in BMI following a nadir in early childhood) (age 5.2 ± 1.0 vs 5.6 ± 0.9, p < 0.001) and that BMI trajectories deviate around this age [17]. However, the Finnish cohort included self-reported BMI, and self-reported irregular menstrual cycles and hirsutism, or PCOS status in adulthood [17]. The Finnish study was unable to explore key research priorities on the implications of the 2018 updated Rotterdam criteria including accurate BMI trajectories or the differential BMI patterns across diagnostic phenotypes [17].

The current study examines measured BMI trajectories in community-based adolescents with well-characterized PCOS features from birth until young adulthood. It shows that in adolescents, the prevalence of PCOS under the original Rotterdam criteria, including all phenotypes, was 29.1%, whereas the prevalence under the updated Rotterdam criteria (HA and OA) was 16.3%, similar to the adult prevalence of 8–13% on systematic review and 12–18% in an Australian population using the Rotterdam criteria [1, 33]. Capturing 29% of adolescents under the original Rotterdam diagnostic criteria may contribute to over-diagnosis at this life stage, potentially causing unnecessary psychological distress and financial and treatment burden. The fear of over-diagnosis may also limit the willingness of clinicians to diagnose PCOS in adolescents, despite clear evidence that under-diagnosis and delayed diagnosis cause significant frustration in the PCOS community [4]. Our finding of an adolescent prevalence similar to that seen in adulthood when applying the updated criteria may be reassuring for both clinicians and adolescents affected by PCOS.

A key concern in excluding the HA + PCOM and OA + PCOM phenotypes in the updated 2018 diagnostic criteria is uncertainty about their natural history and potential for preventable long-term adverse outcomes. Our study provides novel insights into the natural history of BMI trajectories across PCOS diagnostic criteria and phenotypes, with those adolescents meeting the updated criteria (phenotype OA + HA) having the greatest BMI increase over time and the PCOM-inclusive phenotypes having a similar BMI trajectory to those not affected by PCOS. Weight gain is the key contributor to PCOS severity and long-term reproductive, cardiometabolic, and psychological complications. Our data are consistent with those from adult women suggesting more severe metabolic features in women with the HA and OA phenotypes, potentially linked to the metabolic impact of HA [34]. Our study supports the international PCOS guideline-recommended updated Rotterdam diagnostic criteria in adolescents and suggests that the updated Rotterdam criteria will identify the group that clinicians most need to target for prevention with early lifestyle intervention. The mechanisms underpinning the divergence in BMI prepubertally noted in both the Finnish cohort and our data are unclear [17]. The risk of other long-term complications, including infertility, diabetes, and psychological conditions, may still be increased in adolescents with the OA + PCOM and HA + PCOM phenotypes, and future studies in this cohort are needed to address these gaps.

Strengths of our study include the prospective design and unselected community-based population, increasing generalizability. All anthropometric measurements were collected in a standardized manner, which reduced measurement errors and recall bias. Multiple time points and low dropout rates increased statistical power. Most importantly, the PCOS phenotypes in our study population were well characterized. The present study also has limitations. The threshold for biochemical HA was set at the top 25% of free testosterone in this study population due to the lack of a standardized reference range in this age group and to accommodate the variation in the participants’ gynecological age. This cutoff was also previously used in other published Raine Study papers [9, 24, 26, 28]. Given the lack of a standardized adolescent definition of PCOM, our study used the latest consensus definition of PCOM for adults at the time (≥ 1 ovary ≥ 10 cm3 in volume or ≥ 12 follicles between 2 and 9 mm diameter), which was based on a study performed with a 7-MHz transvaginal ultrasound transducer [27, 35]. However, it is noteworthy that definitions of PCOM change over time with advances in ultrasound technology, and the latest 2018 PCOS guideline now recommends using an 8-MHz transducer, with the threshold updated to ≥ 20 follicles between 2 and 9 mm diameter and/or an ovarian volume of ≥ 10 cm3. The assessment of PCOM was conducted via transabdominal pelvic ultrasound in our study because most of these girls were not yet sexually active. We recognize that transvaginal ultrasound is more accurate in measuring ovarian volume and antral follicle count; however, this is often inappropriate in adolescents [8]. Overall, 91% of our population were Caucasian, limiting generalizability to other ethnicities.
The diagnostic PCOS features were evaluated in the adolescent population, and our findings do not aim to reflect diagnostic approaches in adulthood, where ultrasound and PCOM inclusion in the diagnostic criteria are recommended. Finally, we do not yet know whether the adolescent phenotype of PCOS persisted into adulthood, nor the prevalence of infertility, diabetes, and psychological conditions in this cohort.

Conclusions

In conclusion, this study addresses key evidence gaps in PCOS literature and international research priorities, contributing novel findings on the reduced prevalence of PCOS using the updated Rotterdam diagnostic criteria in the adolescent population. We provide insight into the natural history of weight gain across PCOS diagnostic criteria and phenotypes in adolescents. We show that updated 2018 Rotterdam criteria requiring both HA and OA identify adolescents most at risk of excess weight gain as a key driver of PCOS severity, a group who should be targeted for early lifestyle intervention and prevention. Our findings support the 2018 international PCOS guideline’s updated Rotterdam diagnostic criteria and the omission of sonographic PCOM evaluation for adolescent PCOS diagnosis [8]. The use of the updated Rotterdam criteria may limit over-diagnosis of PCOS in adolescents, increase clinician confidence in accurate diagnosis in adolescents, and limit reciprocal under-diagnosis, currently rife in PCOS. Whilst the long-term natural history of clinical outcomes is yet to be elucidated, the 2018 international guideline recommends that adolescents who do not fulfill the updated Rotterdam criteria and have persistent oligo-anovulation or hyperandrogenism can be considered at risk for PCOS and be reassessed in adulthood.