Background

Anxiety disorders are currently the most prevalent class of psychiatric disorders worldwide, impacting an estimated 4.1% of 10-19-year-olds [1]. Canadian youth have an estimated six-month prevalence of 11 to 15% [2] with the most frequent diagnoses being separation anxiety disorder, specific phobias, social anxiety disorder, generalized anxiety disorder (GAD), panic disorder, and agoraphobia. Over the last three decades, a steep increase in the prevalence of anxiety disorders has been observed [3]. Considering the impact of the COVID-19 pandemic [4], this trend is likely to continue [5]. These disorders typically have an early onset in childhood/adolescence resulting in substantial impairment across the lifespan [6,7,8,9].

Anxiety disorders commonly co-occur; multiple correlations have been identified among different anxiety disorders [10], particularly between agoraphobia and social anxiety disorder (r = 0.68), panic disorder (r = 0.64), and specific phobia (r = 0.57), and specific phobia and social anxiety disorder (r = 0.50). High current and lifetime comorbidities are also observed with other psychiatric disorders, especially depression as over 50% of individuals with depressive disorders report a history of an anxiety disorder [11]. Substantial overlap has also been observed with post-traumatic stress disorder (PTSD), obsessive-compulsive disorder (OCD), substance use disorders, and attention deficit/hyperactivity disorder (ADHD) [10].

Both genetic and environmental factors play an important role in the intricate pathogenesis of anxiety disorders; in particular, genetic factors account for the moderate stability of anxiety disorders across the lifespan [12]. Current heritability estimates converge to rates around 35% for GAD and around 50% for social anxiety disorder, panic disorder, and agoraphobia [13]. The mode of inheritance is complex, with many genetic variants of small effect interacting with, or adding to other (environmental) risk factors [14, 15]. Importantly, heritability estimates of child and adult anxiety measures differ [16]. Longitudinal twin studies suggest that heritability is high in childhood but decreases over adolescence and into adulthood [12, 17, 18]. The genetic structure of anxiety disorders also seems to change across development. Anxiety subtypes in adults seem to fit a 2-factor model characterized by distress (GAD and depression) and fear (panic disorder and specific phobias) [19], but different structures have been found in youth with different genetic influences on anxiety and depression in childhood, common genetic vulnerability for anxiety and depression emerging in adolescence, and broadening associations in young adulthood [20].

The most well-researched source of genetic variation known to influence the risk of psychiatric disorders is common single nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS), which enable the search for risk variants across the genome, are ideally suited to studying common genetic risk factors for polygenic conditions such as anxiety disorders. GWAS for specific anxiety disorders and traits were historically severely underpowered [13]. To overcome sample size limitations, researchers started analyzing disorder subtypes together. By meta-analyzing the results of 7 GWAS on 5 clinically ascertained anxiety disorder subtypes (n = 17,310), the ANGST Consortium study [21] identified 2 genome-wide significant loci. A GWAS in the UK Biobank on composite anxiety phenotypes using self-reported symptoms and diagnoses (n = 83,566) identified 5 genome-wide significant loci [22]. In addition, the largest anxiety GWAS to date was performed in 175,163 European American and 24,448 African American military veterans using a 2-item dimensional measure of GAD [23]. The study identified 6 significant loci for anxiety in European Americans and one in African Americans. However, GWAS of anxiety phenotypes in youth have thus far not identified any genome-wide significant loci, for reasons such as low power and heterogeneity [24,25,26].

Genetic correlations can guide our understanding of the nature and patterns underlying complex traits and disorders. Large GWAS of anxiety disorders show strong positive genetic correlations with major depressive disorder (MDD) (rG = 0.78) [22, 23]. Accounting for comorbid MDD results in diminished but still significant SNP-based heritability for anxiety symptoms [23], indicating shared but also specific genetic effects of MDD and anxiety. Genetic correlations have additionally been observed between anxiety and other psychiatric disorders (e.g., schizophrenia, ADHD), sleep, and cardiometabolic traits and risk factors [22, 23]. GWAS of internalizing disorders in youth showed strong genetic correlations (rG > 0.7) with adult anxiety. However, the observed correlations with adult anxiety disorders were partial rather than complete, indicating that from a developmental perspective, childhood/adolescent internalizing symptoms are not genetically identical to adult anxiety or depression [25]. Given these differences, further clarification of specific genetic contributions to youth anxiety is needed.

Behavioural and cognitive traits, particularly behavioural inhibition (BI), inhibitory control and avoidance, are known to confer risk for the later development of anxiety disorders. BI is a strong vulnerability marker of anxiety [27, 28] and is defined as an early childhood temperament characterized by shyness, fear, negative reactions to novelty, and avoidance of unfamiliar contexts or people [29,30,31]. Although BI is the best-known risk factor for anxiety disorders and is associated with a 4–6 fold increased risk [32], only an estimated 40% of behaviourally inhibited children will develop anxiety disorders [33]. Research suggests that inhibitory control (the ability to inhibit responses to goal-irrelevant stimuli) plays a moderating role in the trajectory from childhood BI to adulthood anxiety disorders. For example, youth who inhibit their impulses best typically develop anxiety disorders in adulthood [34, 35], thus highlighting the need to assess both BI and inhibitory control in youth. Avoidance of stimuli or situations perceived as dangerous or threatening is a cardinal feature of anxiety disorders. This avoidance is self-reinforcing, shaping further retreat over time [36]. Avoidance is a primary intervention target. At its core, avoidant behaviour is fueled by a desire to avoid danger, a feature that makes anxious youth vigilant for threat and prone to exaggerate their interpretations of it. Risk avoidance is well studied in anxious adults [37, 38] and, to a lesser extent, in youth with anxiety [39,40,41]. Among factors that drive avoidant behaviour, the aversion to risky behaviours might be of particular relevance in the etiology of anxiety disorders. Measuring inhibitory control and risk tolerance in the context of anxiety could help elucidate cognitive mechanisms underlying youth anxiety.

The first-line treatment option for anxiety disorders in youth is cognitive behavioural therapy (CBT) [42]. CBT involves psychoeducation about anxiety, teaches youth skills for managing fears (e.g., relaxation, cognitive restructuring, problem solving), and helps youth to gradually face their fears while minimizing avoidance (i.e., exposure) [43]. The effectiveness of CBT (face-to-face or online) for youth anxiety has been demonstrated in several randomized controlled trials indicating large pre- to post-treatment effects and demonstrating superiority over control conditions [44, 45]. Valid second-line treatment options are medication monotherapy, i.e., selective serotonin reuptake inhibitors [46], as well as the combination of CBT and medication [47, 48]. Nonetheless, one in three youth fail to respond to existing treatments [49], and few remain in remission [50]. Many youths also do not seek and/or receive treatment [2]. Given that genetic risk factors can affect the clinical response of patients [51], the emerging field of therapygenetics may be particularly important in predicting treatment outcomes. Unfortunately, to date, samples of youth undergoing CBT for anxiety disorders are difficult to recruit and retain, and these analyses have so far been underpowered [52,53,54].

Methods

Aims

In the current article we outline the design and methods of the GAYA study that aims to better understand the genetic underpinnings of anxiety disorders in Canadian youth. The GAYA study will help close the above-identified gaps in existing research in youth anxiety through a framework of integrated specific aims that will enhance our understanding of the specific genetic contributions to anxiety from childhood through adolescence, and implications for treatment. The specific aims and hypotheses are outlined in Table 1.

Table 1 Specific aims and hypotheses

Participants

The study will make use of a population-based design enriched for youth with anxiety disorders as this sampling scheme has been shown to result in the highest power per included individual for quantitative and categorical traits with limited risk of biases [55,56,57,58]. The goal is to recruit 13,000 youth aged 10–19 of which 50% are expected to endorse symptoms of anxiety that indicate the presence of an anxiety disorder. This will be achieved by sampling from clinical settings as well as the general population. Youth will be recruited across Canada to the GAYA study with local sites in Calgary, Halifax, Hamilton, and Vancouver. The Toronto site will recruit from existing participants of Spit for Science [110] and lower numbers indicating increased risk avoidance.

As youth with anxiety disorders are likely to be affected by the presence of an unknown examiner [111], and to increase accessibility, we developed the GAYA app for administering the neurocognitive tasks. Co-designed with youth, the GAYA app allows youth participants to complete the tasks remotely in an environment of their choosing. The app can be installed on smartphones and tablets with iOS or Android operating systems. Participating youth will be provided with a download link for the app and their personal login credentials by the study team.

Saliva samples

Saliva sample collection will follow established protocols. Saliva-derived DNA has been shown to perform nearly as well as blood DNA [112, 113] and is routinely used in large-scale genetic studies in youth [60]. Youth will be invited to provide saliva samples at the nearest recruitment site (Calgary, Halifax, Hamilton, or Vancouver) with an OG-600 Oragene saliva DNA sample kit or choose to have an OCR-100 ORAcollect DNA sample kit mailed to their home with a pre-paid return envelope. De-identified research IDs will link data and saliva samples. The linked IDs will be logged in REDCap to document and track the handling of biological samples. Participants will be instructed on how to provide a sample by a trained research staff member via live video or a pre-recorded video and will also receive written instructions. Toronto participants will have already provided their saliva samples at the time of participating in Spit for Science using OG-600 Oragene saliva DNA sample kits.

Optional intervention

Intervention procedures

Youth enrolled in GAYA who are aged 13–19, not currently in mental health treatment, and do not endorse psychosis screening questions will be offered the opportunity to participate in a self-managed Internet-based CBT (iCBT) program, Breathe (Being Real, Easing Anxiety: Tools Helping Electronically) [114], at all recruitment sites. These youth will be linked to the Strongest Families Institute (SFI, http://strongestfamilies.com/), where all referred youth will receive the Breathe program that is part of SFI’s validated eplatform IRIS (Intelligent Research and Intervention Software).

Weekly self-managed check-ins, during which youth assess and rate their social-emotional functioning over the past week, will enable detailed monitoring of anxiety symptoms over the course of the Breathe program. Youth will also be asked to rate their anxiety symptoms on the SCARED pre-treatment, post-treatment, and at a 3-month follow-up. Intervention response will be defined as the change in SCARED scores from pre- to post-treatment and to the 3-month follow-up.
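The response definition above can be illustrated with a minimal sketch. It assumes that "change" means a raw difference from the pre-treatment score; the protocol does not state whether raw, percentage, or residualized change will be used, and the class and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ScaredScores:
    """SCARED total scores for one participant (hypothetical container)."""
    pre: float
    post: float
    follow_up: Optional[float] = None  # 3-month follow-up, if completed


def change_scores(scores: ScaredScores) -> Dict[str, float]:
    """Raw change from baseline; negative values mean fewer symptoms.

    Assumption: a simple difference score. Other definitions
    (e.g. residualized change) are equally plausible.
    """
    changes = {"pre_to_post": scores.post - scores.pre}
    if scores.follow_up is not None:
        changes["pre_to_follow_up"] = scores.follow_up - scores.pre
    return changes


if __name__ == "__main__":
    print(change_scores(ScaredScores(pre=40, post=28, follow_up=25)))
```

Keeping the follow-up optional lets participants who miss the 3-month assessment still contribute a pre- to post-treatment change.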

Breathe intervention description

Breathe is a self-mediated 6-module standardized iCBT program that involves: (a) multimedia-based education about anxiety problems and approaches to overcoming anxiety (e.g., reviewing why exposure exercises are important); (b) self-assessment activities to determine level of intervention and safety needs; (c) activities that teach users about anxiety sensitivity and how to develop realistic thinking about anxiety-producing situations; (d) activities for practicing coping and relaxation skills; (e) development of a hierarchy of feared situations and steps for gradual and repeated exposure to feared situations (using imagery/in vivo activities); (f) contingency management (examining the function of anxiety from a reinforcement perspective) and modelling (viewing videos of others confronting feared situations); and (g) skills for maintenance and relapse prevention. Animations, embedded videos, timed prompts, and on-screen pop-ups are used in each module to provide an interactive and multimodal experience. In one of the largest effectiveness trials of iCBT in adolescents conducted to date, which used SFI’s IRIS eplatform, 563 adolescents aged 13–19 were randomly assigned to 6 weeks of Breathe (n = 280) or to visit a static website (no elements of interactivity or personalization) which provided resources for anxiety (n = 283) [115]. In the trial, adolescents who participated in Breathe had a greater improvement in symptoms 3 months after program use (p = 0.04) [115]. Although that Breathe trial was complemented by one telephone coach support session for those who wanted additional support, the current study will be self-mediated only.

Collection of biomaterials, DNA extraction and genotyping process

DNA extraction from saliva will be performed at each individual site following best practice and according to the manufacturers’ protocols. Subsequent genotyping will be performed in 3 batches. All individuals will be genotyped (using DNA from the saliva samples) with Illumina’s Global Screening Array v3.0 (GSA). The GSA is a cost-effective genotyping array that is routinely used for population-scale genetic studies around the globe. For genotype calling, Illumina’s GenomeStudio will be used. Spit for Science genotyping will be done locally using the same Illumina array, and the genetic data will be sent to the Halifax site once the participant has completed the questionnaires and/or app portion of the study.

Quality control (QC) and imputation of GWAS data

The Ricopili pipeline will be used to perform QC of the genetic data within and across ancestry-stratified subgroups (based on demographic information) [116]. Ricopili has been extensively used by international consortia for their large-scale GWAS and makes use of well-established analytic software during its processing steps (e.g., PLINK [116]). Analyses will look for any (hidden) relatedness or sample duplicates as part of our QC and flag individuals that are related (pi_hat > 0.2) for downstream analyses. Population substructure will be (re-)examined by principal component (PC) estimation, and support vector machines will be run in joint PC analyses with a reference sample of known ancestral background (TOPMed) to annotate the sample with population substructure information and compare this information with the demographic information collected during the online assessment. For imputation, a stepwise pre-phasing/imputation approach as implemented in Ricopili [117] will be used within and across population subgroups identified in our sample. This will include the evaluation of a potential increase in power through usage of imputation approaches that use local ancestry to enable inclusion of admixed individuals in the GWAS [118]. ChrX imputation will be conducted separately by sex for subjects passing an additional QC designed for these purposes [119]. For downstream analyses, SNPs that have an INFO > 0.8 and a MAF > 0.01 will be considered.
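The numeric thresholds in this section (pi_hat > 0.2, INFO > 0.8, MAF > 0.01) can be expressed as simple filters. The following is an illustrative sketch only: it is not Ricopili or PLINK code, and the record layout, identifiers, and function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Thresholds as stated in the text.
INFO_MIN = 0.8
MAF_MIN = 0.01
PI_HAT_MAX = 0.2


@dataclass(frozen=True)
class Variant:
    """Per-SNP imputation summary (hypothetical record layout)."""
    snp_id: str
    info: float  # imputation quality score
    maf: float   # minor allele frequency


def passes_variant_filter(v: Variant) -> bool:
    """Keep SNPs with INFO > 0.8 and MAF > 0.01 (strict inequalities)."""
    return v.info > INFO_MIN and v.maf > MAF_MIN


def flag_related_pairs(
    pairs: Iterable[Tuple[str, str, float]]
) -> List[Tuple[str, str]]:
    """Return sample pairs whose estimated pi_hat exceeds 0.2."""
    return [(a, b) for a, b, pi_hat in pairs if pi_hat > PI_HAT_MAX]


if __name__ == "__main__":
    variants = [
        Variant("snpA", info=0.95, maf=0.20),   # kept
        Variant("snpB", info=0.75, maf=0.20),   # dropped: low INFO
        Variant("snpC", info=0.99, maf=0.005),  # dropped: rare
    ]
    print([v.snp_id for v in variants if passes_variant_filter(v)])
    print(flag_related_pairs([("id1", "id2", 0.50), ("id1", "id3", 0.03)]))
```

Note that the text says related individuals are flagged rather than removed, so the second function returns the pairs and leaves the handling to downstream analyses.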

Data analysis strategy

Appropriate covariates (e.g., age, sex, gender, recruitment procedure/site) will be included in the analyses of each trait. Where necessary and appropriate, analyses will control for current/past treatment history through established protocols. For all GWAS, the impact of population substructure on the genome-wide test statistics will be evaluated using λGC [120] and Linkage Disequilibrium score regression (LDSC) analyses [121]. There are clear sex differences described in the epidemiology of anxiety. Anxiety disorders and symptoms occur more often in women, and the odds of developing an anxiety disorder are 1.7 times greater for women than men [122]. The analyses will therefore be stratified by sex and gender to explore shared and unique genetic contributions.
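As a rough illustration of the λGC check mentioned above, the sketch below computes the genomic inflation factor as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). It assumes two-sided p-values in the interval (0, 1] as input; the function names are hypothetical, and production pipelines typically compute this value directly from the test statistics.

```python
from statistics import NormalDist, median
from typing import Iterable

_STD_NORMAL = NormalDist()

# Expected median of a chi-square(1 df) variable: (Phi^-1(0.75))^2 ~ 0.4549
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def chi2_from_p(p: float) -> float:
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square value."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile, z <= 0
    return z * z


def lambda_gc(p_values: Iterable[float]) -> float:
    """Genomic inflation factor: median observed / median expected chi-square."""
    observed = [chi2_from_p(p) for p in p_values]
    return median(observed) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null GWAS, so lambda should be close to 1.
    null_p = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(lambda_gc(null_p), 3))
```

Values well above 1 can reflect population stratification but also true polygenic signal, which is why the text pairs λGC with LDSC.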

Data analysis will be aligned with each of the specific aims. For all results identified in aims 1–4, tissue and single-cell enrichment analyses will be conducted using MAGMA [123] and LDSC [124], compiling publicly available single-cell RNA-sequencing data from five studies of the human and mouse brain [125,126,127,128]. Similarly, transcriptome-wide association studies will be conducted using FUSION [129] and expression quantitative trait locus data from the PsychENCODE Consortium (1,321 brain samples) [130]. In addition, analyses will include EpiXcan [131], an elastic net-based method, which weighs SNPs based on epigenetic annotation information [132].

Specific aim 1: identify genetic risk factors associated with clinical symptoms and vulnerability markers of youth anxiety

Following best practices in the field, additive model GWAS analyses will be conducted for common SNPs and each quantitative trait. To increase power, multivariate-based approaches will be employed that enable us to address the complex relationship of the anxiety phenotypes (SCARED subscales, BIS/BAS subscales, inhibitory control, and risk avoidance) amongst each other but also in relationship to the genetic data. As such the GW-SEM software package [133, 134] will be used. Genome-wide results from GAYA study samples will be meta-analysed with other available samples [25, 61] using inverse-variance weighting with METAL [135] and accounting for population structure. Sensitivity analyses using structural equation modelling (SEM), via genomic SEM, will help to address potential heterogeneity. Established approaches (LDSC [120] and GCTA [136]) will be used to study liability-scale heritability of clinical symptoms and vulnerability markers of youth anxiety. Partitioned heritability across minor allele frequency bins and functional annotations (e.g., cell-types) using the same software packages and publicly available data (e.g., from the PsychENCODE consortium [137]) will provide further insights into the genetic relationship of different clinical symptoms and vulnerability.
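The inverse-variance weighting mentioned above reduces to a short calculation per SNP. The sketch below shows the standard fixed-effect arithmetic only; it is not METAL's implementation, does not handle allele alignment or genomic control, and the function name is an assumption for illustration.

```python
import math
from typing import Sequence, Tuple


def ivw_meta(
    effects: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance weighted meta-analysis for one SNP.

    effects: per-study effect estimates for the same effect allele.
    std_errors: matching standard errors (all > 0).
    Returns (pooled effect, pooled standard error, z-score).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("effects and std_errors must be non-empty and equal length")
    weights = [1.0 / se ** 2 for se in std_errors]  # w_i = 1 / SE_i^2
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, pooled / pooled_se


if __name__ == "__main__":
    # Two hypothetical cohorts with equal precision.
    beta, se, z = ivw_meta([0.10, 0.20], [0.10, 0.10])
    print(beta, se, z)
```

Because weights are the reciprocal of the squared standard error, larger and more precise cohorts dominate the pooled estimate.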

Specific Aim 2: identify genetic factors that are unique to anxiety in different age groups

Multi-trait conditional and joint analysis [138] will be used to adjust summary statistics from the GWAS in youth for the genetic effects in adult anxiety, in order to identify putative age group-specific SNP associations. It is noteworthy that previous analyses in closely related traits of similar sample size (e.g., ADHD) were able to identify new genome-wide significant hits specific to the traits under analysis [138]. Similarly, summary statistics from youth and adult anxiety GWASs will be analysed with ccGWAS [139], a tool designed to identify loci with different allele frequencies among different trait groups. Using ccGWAS, genetic loci will be identified that are specific in their association to the individual age groups.

Using established protocols (SBayesR [140]/LDpred2 [141]/PRSice2 [142]) genomic risk profile scores (GRPS) will be generated in the GAYA study sample trained on discovery datasets from different age groups (i.e., pairwise between the youth and adult GWAS) to assess the variance explained through genetic liability for anxiety disorders in one age group for the other. Finally, where appropriate, CLiP [143] will be used to study heterogeneity in GRPS for the GAYA study sample.
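At its core, a genomic risk profile score is a weighted sum of effect-allele dosages, and "variance explained" can be summarized as a squared correlation with the phenotype. The sketch below illustrates only that arithmetic. It does not reproduce the effect-size re-weighting that SBayesR, LDpred2, or PRSice2 perform, it ignores covariates, and all SNP identifiers and function names are hypothetical.

```python
import math
from typing import Mapping, Sequence


def risk_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) over SNPs present in both maps."""
    shared = dosages.keys() & weights.keys()
    return sum(dosages[snp] * weights[snp] for snp in shared)


def variance_explained(scores: Sequence[float], phenotype: Sequence[float]) -> float:
    """Squared Pearson correlation between scores and a quantitative phenotype."""
    n = len(scores)
    if n != len(phenotype) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_s = sum(scores) / n
    mean_p = sum(phenotype) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, phenotype))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_p = sum((p - mean_p) ** 2 for p in phenotype)
    return cov ** 2 / (var_s * var_p)


if __name__ == "__main__":
    # Hypothetical SNP weights from a discovery GWAS and one person's dosages.
    weights = {"snpA": 0.10, "snpB": -0.20, "snpX": 0.50}
    person = {"snpA": 2, "snpB": 1, "snpC": 0}
    print(risk_score(person, weights))
```

Restricting the sum to SNPs present in both the target genotypes and the discovery weights mirrors the usual need to intersect variant lists before scoring.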

Specific aim 3: identify genetic factors that are shared and unique between anxiety and its common comorbidities

Patterns of genetic correlation between youth anxiety and common comorbidities (e.g., MDD, ADHD) will be analyzed via LDSC [120]. Genomic SEM will be run including the youth anxiety GWAS along with the newest adult anxiety GWAS [21,22,23] to investigate the multivariate genetic architecture across youth anxiety and its comorbidities. In this multivariate GWAS, it will be possible to identify loci that confer risk to multiple disorders (i.e., that are shared across disorders). Two-sample Mendelian Randomization (MR) analyses will be conducted using the inverse-variance-weighted (IVW) MR method to investigate associations between the genetic liability for youth anxiety and adult-onset mental disorders, while further ensuring the robustness of our IVW estimates through MR-Egger and the MR robust adjusted profile score approach [144, 145].
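The IVW MR estimate referred to above can be written as a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, with weights equal to the inverse variance of the outcome effects. The sketch below shows the standard fixed-effect form under the usual assumptions (independent instruments, negligible error in the exposure effects). The function name and inputs are illustrative, and the MR-Egger and robust adjusted profile score sensitivity analyses are not covered.

```python
import math
from typing import Sequence, Tuple


def ivw_mr(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW Mendelian randomization estimate.

    estimate = sum(bx * by / se_y^2) / sum(bx^2 / se_y^2)
    se       = sqrt(1 / sum(bx^2 / se_y^2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("instruments carry no exposure signal")
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Three hypothetical instruments whose outcome effects are half
    # their exposure effects.
    estimate, se = ivw_mr([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
    print(estimate, se)
```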

Specific aim 4: identify a prediction model of treatment response in youth with anxiety disorders

GWAS data for general susceptibility for major psychiatric illnesses (such as adult and youth anxiety, MDD, ADHD, and others), and antidepressant treatment response [146] will be used to train GRPS in the GAYA study sample. For each of these GRPS the amount of variation explained in the clinical response of youth receiving Breathe will be assessed. It will also be evaluated how much of this variation can be explained by clinical symptoms/vulnerability markers and subsequently whether the combination of these measures (i.e., GRPS plus clinical symptoms/vulnerability markers) can increase our ability to predict clinical response in youth with anxiety disorders. Further, explorative GWAS will be conducted in two datasets: (a) 3,000 youth recruited to receive CBT and (b) around 10,000 individuals (all age groups, including the 3,000 youth) by combining the GAYA study sample with samples available via collaborations [52, 61].

Youth council

A national youth council, including members of established youth councils at study sites, will consult through all phases of the design and management of the study. Meetings will be conducted virtually to allow for geographic diversity and youth council recruitment will prioritize representation of diverse demographics. Youth will advise on several aspects of GAYA, including recruitment strategies (i.e., flyers and posters); contact management and retention tools; study measures and instruments; assessment instrument package; and the assessment package’s length and readability. Additionally, as part of the knowledge translation plan, youth will be included in the interpretation of findings and their presentation through various knowledge translation activities, such as presentations, publications, short videos, infographics, and webinars co-led by youth.

Discussion

While anxiety disorders have become more common in youth over the years and have been exacerbated by the pandemic, efforts to explore the genetic underpinnings of anxiety disorders remain limited. Twin studies strongly suggest that genetic susceptibility plays a role in the development of anxiety disorders and that this role is age-dependent [12, 17, 18]. Thus, the GAYA study has the potential to fill an important gap in our current knowledge. Study results will significantly contribute to a better understanding of the developmental trajectory of anxiety disorders and their common comorbidities, increasing knowledge in relation to the high rates of co-occurrence observed across psychiatric disorders [22, 23]. As analyses of youth undergoing CBT for anxiety disorders have been previously underpowered [52,53,54], the GAYA study’s sample size will further inform prediction models of treatment response. Finally, study results are expected to inform early intervention or preventative strategies and suggest novel targets for therapeutics and personalization of care.