Introduction

Attention-deficit/hyperactivity disorder (ADHD) represents one of the most common psychiatric disorders in children and adolescents with a worldwide prevalence rate of 5.2%.1 According to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV),2 ADHD is characterized by pervasive and impairing symptoms of inattention, hyperactivity and increased impulsivity. Family, twin and adoption studies indicate that ADHD is a highly heritable disorder; heritability estimates are consistently around 0.8.3, 4, 5, 6, 7 However, neither genome-wide association studies (GWAS) nor large-scale meta-analyses of GWAS have so far unequivocally identified specific genes conferring major risk (see Hinney et al.8).

Copy number variations (CNVs) are, by definition, chromosomal deletions or duplications of at least 1 kb up to several Mb that are variable in size among carriers. At a genome-wide level, thousands of CNVs have already been identified. Several CNVs were shown to contribute to neurodevelopmental disorders such as schizophrenia,9 Parkinson disease (PD)10 or autism.11 Five genome-wide analyses of CNVs have been published in ADHD.Supplementary Text. All association analyses were performed on the QC-filtered rare CNV calls, which were investigated using the PLINK software (version 1.07).30 As primary analyses, we tested the hypothesis that particular rare CNVs might be found at an increased frequency in ADHD cases compared with controls. Locus-specific tests of association were performed (one-sided χ2 tests) and significance was assessed via permutation (empirical P-values based on 100 000 permutations) at a pointwise as well as at a genome-wide level. In more detail, we calculated the frequency of rare CNVs in ADHD patients and compared it with the frequency in the controls. The frequency was calculated at each unique start and stop site for rare CNVs that met all of the defined QC measures (defined in the Supplementary Text). Each site (5047 sites in total, located in 1083 non-overlapping genomic CNV-containing regions) was assessed for a difference in CNV frequency between groups with the use of a permutation-based Fisher’s exact test in PLINK. We define a locus, in the sense of a susceptibility locus, as a genomic region made up exclusively of adjacent tested sites at which significantly more rare CNVs were observed in ADHD cases than in controls. These analyses were undertaken for all rare CNVs as well as stratified according to CNV type, that is, deletion or duplication.
In the GWAS sample, empirical, genome-wide corrected P-values were generated by permuting affection status and simultaneously preserving the correlation structure of CNVs (100 000 permutations) to simulate the null hypothesis of no association. In other words, we used the permutation resampling method to correct for the multiple testing problem, which occurs when testing any identified locus of rare CNVs.
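
The permutation scheme described above can be illustrated with a minimal Python sketch. It is not PLINK's implementation: the function names, the toy statistic (case minus control carrier frequency per site) and the (count + 1)/(permutations + 1) estimator are assumptions chosen for illustration. Case/control labels are shuffled while the carrier pattern across sites is left untouched, so the correlation between sites is preserved; the genome-wide corrected P-value for a site compares its observed statistic with the maximum statistic over all sites in each permutation.

```python
import random

def site_stats(carrier_rows, is_case):
    """Case-minus-control carrier frequency at each tested site."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    stats = []
    for row in carrier_rows:
        case_hits = sum(c for c, k in zip(row, is_case) if k)
        ctrl_hits = sum(row) - case_hits
        stats.append(case_hits / n_case - ctrl_hits / n_ctrl)
    return stats

def permutation_pvalues(carrier_rows, is_case, n_perm=1000, seed=0):
    """Return (pointwise, genome-wide corrected) one-sided empirical P-values.

    carrier_rows[i][j] is 1 if individual j carries a rare CNV at site i.
    Only the labels are permuted, so the between-site structure is kept.
    """
    rng = random.Random(seed)
    observed = site_stats(carrier_rows, is_case)
    pointwise = [0] * len(observed)
    corrected = [0] * len(observed)
    labels = list(is_case)
    for _ in range(n_perm):
        rng.shuffle(labels)
        perm = site_stats(carrier_rows, labels)
        max_perm = max(perm)  # strongest signal anywhere under the null
        for i, obs in enumerate(observed):
            pointwise[i] += perm[i] >= obs
            corrected[i] += max_perm >= obs
    scale = n_perm + 1
    return ([(c + 1) / scale for c in pointwise],
            [(c + 1) / scale for c in corrected])
```

Because the permuted maximum is never smaller than the permuted statistic at any single site, the corrected P-value of a site is always at least as large as its pointwise P-value.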

By application of PLINK’s case–control ‘cnv-enrichment-test’ function, we additionally tested whether CNVs in the PARK2 gene are enriched in ADHD GWAS cases compared with GWAS controls. In contrast to Fisher’s exact test, the ‘cnv-enrichment-test’ is robust to case–control differences in CNV size or CNV rate.31 Enrichment in cases is reported as one-sided empirical P-value using 100 000 permutations.

Analysis of the replication sample was performed to confirm the finding of the GWAS sample at the PARK2 locus (Results section), so we restricted the testing to this single rare CNV locus for an association with ADHD. Consequently, pointwise empirical P-values (100 000 permutations) for the replication sample were not corrected for multiple testing. The statistical analyses in the GWAS sample were repeated for quantitative (q)PCR-validated CNVs at the PARK2 locus as part of the sensitivity analysis. We applied a significance level α of 0.05 (globally for the genome-wide testing and locally for the replication sample).

As secondary sensitivity analyses (see Supplementary Text), we also assessed the genome-wide frequency of CNVs in ADHD cases compared with controls according to the average number of CNVs per sample. We expected more CNVs in the ADHD cases based on the literature.32 Thus, one-sided tests were applied to all rare CNVs, as well as to rare deletions and duplications only; and genome-wide multiple testing was dealt with using 100 000 permutations. Finally, we likewise tested whether CNVs in the ADHD cases were larger in size than those in the control group based on the average size of CNVs per individual.
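
As a hedged illustration of the burden comparison described above, the following Python sketch runs a one-sided permutation test on the difference in the mean number of rare CNVs per individual. The function names and toy interface are hypothetical and do not reproduce the PLINK procedure; the same pattern applies to average CNV size if sizes are passed in place of counts.

```python
import random

def mean_difference(values, labels):
    """Mean of values in cases minus mean of values in controls."""
    cases = [v for v, k in zip(values, labels) if k]
    ctrls = [v for v, k in zip(values, labels) if not k]
    return sum(cases) / len(cases) - sum(ctrls) / len(ctrls)

def burden_pvalue(cnv_counts, is_case, n_perm=10000, seed=0):
    """One-sided empirical P-value for a higher CNV burden in cases."""
    rng = random.Random(seed)
    observed = mean_difference(cnv_counts, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between burden and status
        hits += mean_difference(cnv_counts, labels) >= observed
    return (hits + 1) / (n_perm + 1)
```

With a one-sided test of this kind, a sample in which cases carry fewer CNVs than controls yields a P-value close to 1, which is the pattern seen for the overall burden result reported in the Results section.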

CNV validation and replication analysis at the PARK2 locus

We performed real-time qPCR experiments to validate the CNVs by a Duplex TaqMan CNV assay (Applied Biosystems, Darmstadt, Germany, assay Hs03615859_cn at chr6: 162 696 987±50 bp, NCBI36/hg18) as described previously.33 Individual copy number status was determined for each ADHD patient of the GWAS sample. Briefly, every PCR was performed in triplicate for each individual of the GWAS ADHD cases and the results from the qPCR were analysed using the software CopyCaller 1.0 (Applied Biosystems). In cases, 11 of the 12 CNVs identified with PennCNV covering the PARK2 locus (Results section) were technically validated by qPCR. qPCR experiments did not reveal any further CNV carrier that had gone undetected in the previous SNP array-based CNV detection analyses. Thus, CNVs at the PARK2 locus of GWAS ADHD patients could be validated with both a low false-positive and a low false-negative rate. For a subset of controls (HNR controls), CNV calls that were estimated to cover the PARK2 gene were validated by qPCR. Apart from one potential CNV carrier, who was incorporated into statistical analysis at the PARK2 locus, we additionally considered the five HNR controls for which CNVs were estimated to flank the PARK2 locus (n=3) or for which CNVs were called but excluded due to their small size (spanned <15 probes) in the course of our CNV QC procedure (n=2). Moreover, we included six randomly chosen control subjects of the HNR control sample. CNV analyses were performed blinded to the likely CNV status of the controls. For the PARK2 locus, all analysed CNV states could be validated. For the KORA and PopGen controls, no DNA was available for qPCR validation. However, given the high technical validation rate in the available DNA samples, validity of CNV calls was presumed to be comparably high for KORA and PopGen controls.
Although false-negative and false-positive rates are unknown for the GWAS controls group, there is no obvious reason to expect that these rates would significantly differ between cases and controls. Notably, the low frequency of PARK2 CNVs in control subjects was consistently reported in the ‘Database of Genomic Variants’ (http://projects.tcag.ca/variation) and in two previous publications, where CNVs at the PARK2 gene were absent in 2026 healthy, population-based controlsSupplementary Text.
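
To make the qPCR copy number call more concrete, here is a minimal Python sketch of the standard comparative Ct (ΔΔCt) calculation for a duplex assay run in triplicate. It is an assumption that this mirrors the analysis in spirit; the text does not describe CopyCaller's internal algorithm, and the function name, parameters and example Ct values are hypothetical.

```python
from statistics import mean

def copy_number(target_ct, reference_ct, calibrator_delta_ct,
                calibrator_copies=2):
    """Estimate copy number from replicate Ct values (comparative Ct).

    target_ct / reference_ct: replicate Ct values for the CNV assay and
    for the reference assay measured in the same wells.
    calibrator_delta_ct: target-minus-reference Ct of a sample assumed
    to carry calibrator_copies copies (hypothetical input).
    """
    delta_ct = mean(target_ct) - mean(reference_ct)
    delta_delta_ct = delta_ct - calibrator_delta_ct
    estimate = calibrator_copies * 2 ** (-delta_delta_ct)
    return estimate, round(estimate)

# Illustrative triplicate: the target amplifies one cycle later than in
# the two-copy calibrator, which corresponds to a heterozygous deletion.
estimate, call = copy_number([26.0, 26.1, 25.9], [25.0, 25.0, 25.0], 0.0)
```

A one-cycle delay halves the estimated copy number, whereas a duplication (three copies) shifts the target Ct earlier by roughly 0.58 cycles, which is why replicate measurements are needed to separate the states reliably.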

Results

The GWAS sample included 489 ADHD cases and 1285 controls (Table 1) with high-quality SNP array data for full CNV analysis. Comparison of the CNV sets identified in the ADHD patients and in the controls showed no increased overall frequency of CNVs in ADHD cases (Supplementary Text). After exclusion of common (frequency >1%) CNVs, 2432 rare CNVs (592 in ADHD cases; 1840 in controls) with an increased length in ADHD cases (average CNV size: 226.3 kb (range: 9.3–2830.8 kb) in ADHD cases; 186.4 kb (range: 5.6–4479.6 kb) in controls) were included in the association analysis. Although there was a difference in the sex distribution between ADHD patients (81.0% males) and control subjects (50.7% males), there was no evidence for a significant difference in the rate at which rare CNVs were called in males compared with females in either cases or comparison subjects (data not presented). All rare CNVs >500 kb are listed in Supplementary Table S1.

With regard to previous observations,13, 16, 17 we first looked at our data with respect to a potential overall enrichment of rare CNVs in ADHD cases compared with controls (Supplementary Table S7) and in terms of an enrichment for loci implicated in neurodevelopmental disorders, such as autism or schizophrenia (data not shown). There was no evidence for an increased burden of rare CNVs in ADHD patients (P=0.997). We additionally performed comparative analyses on rare CNVs stratified by their size. Interestingly, with increasing size thresholds, we observed a stronger trend of association between large, rare CNVs and ADHD, which is in accordance with the reports of previous studies.13, 16 Although none of the comparisons reached nominal significance (that is, P<0.05), there was a 1.126-fold enriched rate (P=0.271) and a 1.133-fold higher proportion (P=0.253) of subjects carrying at least one rare CNV >500 kb in the ADHD cohort. The rate of rare CNVs >500 kb observed in ADHD cases was 10.4%, which is similar to the rates of 12.2 and 12.5% reported in previous studies.13, 16 Limiting our analysis to rare CNVs >2 Mb, we observed a 3.065-fold enrichment (P=0.074) in ADHD cases relative to control subjects. However, due to the potential bias in individual CNV rate and average size, which differentiates cases and controls in our GWAS sample (see Supplementary Text), we did not follow up these data. Differences between the distributions in cases and controls may result from different technical genotyping procedures rather than indicate association effects.

Locus-specific association tests for an overrepresentation of CNVs, including both deletions and duplications, in ADHD cases in comparison with controls revealed only one genome-wide significant genomic region within the PARK2 gene with a P-value of 2.8 × 10−4 empirically corrected for genome-wide testing (Figure 1, Supplementary Table S8). This locus, for which we observed more rare CNVs in ADHD cases than in controls, is located at chr6: 162 659 756–162 767 019 (NCBI36/hg18). We refer to this region as the PARK2 locus. In total, this locus is covered by 12 CNVs among the ADHD patients (2.45%, three deletions (0.61%) and nine duplications (1.84%)) and four CNVs among the controls (0.31%, two deletions (0.16%) and two duplications (0.16%)); all of these CNVs extend into the coding region (either exon 2 or exon 3) of PARK2. Locus-specific association tests, stratified according to CNV type (deletion or duplication), did not reveal further genomic regions with genome-wide significant results. The PARK2 locus alone showed a genome-wide significant enrichment of duplications in ADHD cases compared with controls (P=1.9 × 10−3 after empirical correction for genome-wide testing, Figure 1). In contrast, we did not observe a genome-wide significant excess of deletions for any of the tested regions harbouring rare CNVs.

Figure 1

Results for the PARK2 locus in the GWAS and in the replication sample. Each panel consists of four parts (called CNVs, PARK2 gene, probes analysed and association tests): CNVs: red (pink) bars represent duplications in an ADHD case (control), blue (lightblue) bars indicate a deletion of an ADHD case (control). PARK2 gene: the marks indicate the coding regions (NCBI36/hg18). Association tests: permutation-based one-sided −log10-transformed P-values for association tests; the black (pink; lightblue) line represents association tests for an increased frequency of segmental CNV data independent of type (deletions; duplications) in cases compared with controls. The significance level P=0.05 is highlighted as a dashed red line. The chromosomal region offering genome-wide significantly more CNVs in ADHD patients than in controls is highlighted by grey vertical shading. (a) Results for the GWAS sample. The presented P-values are genome-wide empirically corrected. The chromosomal region covered by the qPCR assay used for validation of PennCNV’s CNV calls is shown as a darkgrey vertical dashed line within this region of genome-wide significance. The duplication that could not be validated by qPCR analysis is marked by ‘x’. Results of association tests after exclusion of the non-validated case duplication are shown in Supplementary Figure S7. (b) Results for the replication sample. The presented P-values are pointwise corrected.

Enrichment of CNVs at the PARK2 gene for GWAS ADHD cases compared with GWAS controls was also supported by the robust ‘cnv-enrichment-test’31 (one-sided empirical P=8.8 × 10−4).

Copy number status at the PARK2 locus of all ADHD patients in the GWAS sample was reevaluated by qPCR analyses (Figure 1). Apart from one duplication, each CNV status was technically validated. Even after reanalysis of all rare CNVs with exclusion of the non-validated duplication, the finding for the PARK2 locus remained genome-wide significant (empirically corrected P=1.2 × 10−3, Supplementary Figure S7). We observed no differences in ADHD subtypes or basic characteristics like age, sex and intelligence quotient (IQ) values between carriers of CNVs at the PARK2 locus and the 489 ADHD patients of the GWAS sample (Supplementary Table S2).

Next, we assessed an independent sample of 386 ADHD patients and 781 healthy controls to replicate our finding of an excess of CNVs in ADHD patients at the PARK2 locus. We replicated the excess (P=4.3 × 10−2, Figure 1) of CNVs in the ADHD replication sample (n=4 (1.04%), two duplications (0.52%) and two deletions (0.52%)) compared with replication controls (one duplication (0.13%)). Similar to the initial finding, four of the five CNVs in the PARK2 locus extend into the coding exon 2 of the gene. Owing to the small number of CNVs we did not stratify by CNV status.
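
The carrier frequencies quoted for the PARK2 locus can be re-derived from the reported counts, and a plain one-sided Fisher's exact test gives a rough feel for the strength of the enrichment. The Python sketch below assumes that each CNV corresponds to one distinct carrier, which the text does not state explicitly, and it is not the permutation-based procedure used in the study. In particular, it applies no genome-wide correction, so its output for the GWAS sample is not comparable with the corrected P-values reported above.

```python
from math import comb

def carrier_frequency(carriers, sample_size):
    """Carrier frequency in percent."""
    return 100 * carriers / sample_size

def fisher_one_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """P(at least case_carriers carriers fall among cases) under the
    hypergeometric null with all margins fixed."""
    total = n_cases + n_ctrls
    carriers = case_carriers + ctrl_carriers
    upper = min(carriers, n_cases)
    tail = sum(comb(n_cases, k) * comb(n_ctrls, carriers - k)
               for k in range(case_carriers, upper + 1))
    return tail / comb(total, carriers)

# Counts as reported in the text; one CNV per carrier is an assumption.
gwas = dict(case_carriers=12, n_cases=489, ctrl_carriers=4, n_ctrls=1285)
replication = dict(case_carriers=4, n_cases=386, ctrl_carriers=1, n_ctrls=781)

for name, counts in (("GWAS", gwas), ("replication", replication)):
    case_pct = carrier_frequency(counts["case_carriers"], counts["n_cases"])
    ctrl_pct = carrier_frequency(counts["ctrl_carriers"], counts["n_ctrls"])
    p_value = fisher_one_sided(**counts)
    print(f"{name}: cases {case_pct:.2f}%, controls {ctrl_pct:.2f}%, "
          f"one-sided Fisher P = {p_value:.2g}")
```

For the replication sample, where only a single locus is tested and no multiple-testing correction is applied, the exact test should fall in the same range as the pointwise empirical P-value given in the text.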

Finally, in addition to our GWAS discovery analysis, we assessed ADHD CNV candidate loci from previous reportsSupplementary Text) may in part be explained by the inability to determine CNV breakpoints precisely on the basis of SNP chip data. Our main finding, however, the association of PARK2 CNVs with ADHD, was validated by qPCR and replicated in an independent sample.

In summary, our results support a role for structural variants at the PARK2 locus in ADHD genetics. Moreover, our data support the further investigation of CNVs involving neurodevelopmental genes, such as CHL1, PTPRD, GCNT2, PTPRN2 and NDE1, as well as deletions and duplications at the 15q13 and 16p11.2 regions, in ADHD genetics.