INTRODUCTION

Bipolar disorder (BD) is a debilitating and highly heritable psychiatric disorder whose genetic etiology is largely unknown. Candidate gene, genome-wide association (GWAS), and gene expression studies have all implicated a variety of genes, but a coherent theory of pathogenesis has not yet emerged. Multiple variants in many genes often converge on a smaller number of gene networks and still fewer biological pathways. This points to a powerful strategy, known as ‘integrative genomics’ (Schadt, 2006), for addressing the extreme genetic heterogeneity seen in many common illnesses. This strategy has proven valuable in interpreting the multigenic signals observed in GWAS, copy number variation, and de novo mutation studies of many common neuropsychiatric diseases (Walsh et al, 2008; Choi et al, 2011; Fromer et al, 2014), but has so far been little studied in BD.

Many of the common genetic variants identified by GWAS lie in the regulatory regions, where they can affect the expression of nearby genes (Maurano et al, 2012). Much of the impact of genetic variation on gene expression is tissue-dependent (Andersson et al, 2014). Non-protein-coding genes that produce regulatory RNAs also appear to have an important role in fine-tuning of gene expression over development and in response to environmental stressors (Barry, 2014). It has also become clear that most genes, especially those expressed in brain, produce a number of distinct messenger RNA molecules, or transcripts, owing to alternative splicing, differential exon usage, and other posttranscriptional modifications (Barry, 2014).

Thus, the integration of GWAS signals with gene expression data requires a sensitive, tissue-specific approach that can assess differences in transcript abundance, noncoding RNAs, and posttranscriptional modification. Next-generation sequencing of RNA, known as RNA sequencing (RNA-seq), addresses many of these needs better than traditional gene expression microarrays, providing a more complete picture of the ‘transcriptome’ (McGettigan, 2013).

As a first step toward an integrative genomics strategy in BD, we sequenced RNA derived from postmortem brain obtained from individuals with BD and matched controls. Our initial analysis of these data detected many differentially expressed (DE) genes in BD. These genes have important roles in neuroplasticity, circadian rhythms, and GTPase binding (Akula et al, 2014). When we integrated these data with the results from previous GWAS of BD, we found that DE genes in the GTPase pathway were also enriched for single nucleotide polymorphisms (SNPs) that were associated with BD. This suggested that differential expression of these genes was not just a consequence of BD or its treatment, but also reflected inherited genetic variation associated with disease risk. However, that study was limited by analysis methods that focused on individual genes and transcripts, without regard to their correlated patterns of expression.

Here, we performed a complete re-analysis of the same RNA-seq data using methods that exploit the correlated patterns of expression among groups of genes. We used weighted gene correlation network analysis (WGCNA) (Langfelder and Horvath, 2008), a widely used method that finds modules of highly correlated genes, relates these modules to one another, and tests the influence of sample phenotypes on gene expression correlations. WGCNA has been widely used to identify co-expressed gene networks in various human brain regions (Oldham et al, 2008), animals (Fuller et al, 2007; Langfelder et al, 2012), and in human phenotypes, including schizophrenia (Torkamani et al, 2010), autism (Voineagu et al, 2011), cancer (Clarke et al, 2013), aggressive behavior (Malki et al, 2014), BD (Chen et al, 2013a), and psoriasis (Li et al, 2014). However, aside from one study of a few gene networks (

Cell-Type Enrichment of Co-Expression Networks

Oldham et al (2008) also report gene expression signatures typical of particular cell types in human brain. Their results indicate modules enriched for oligodendrocytes, astrocytes, neurons, and synapses. In order to test whether any of our 21 modules were enriched for gene expression signatures reflecting these cell types, we compared our modules to those from Oldham et al (2008) using a hypergeometric test.
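The overlap test described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the toy counts are hypothetical, and the choice of gene universe (here simply a single integer) is an assumption that the text does not specify.

```python
from math import comb

def hypergeom_overlap_pvalue(universe_size, set_a_size, set_b_size, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    X is the number of 'marked' genes (set A, e.g. a published cell-type
    module) found when set_b_size genes (e.g. one of our modules) are
    drawn without replacement from a universe of universe_size genes.
    """
    max_overlap = min(set_a_size, set_b_size)
    total = comb(universe_size, set_b_size)
    # math.comb(n, k) returns 0 when k > n, so impossible terms vanish.
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Toy numbers (hypothetical, not taken from the study).
p = hypergeom_overlap_pvalue(universe_size=10000, set_a_size=400,
                             set_b_size=300, overlap=40)
print(f"overlap p-value: {p:.3g}")
```

The p-value depends strongly on the universe size, so in practice the universe would be restricted to genes that could have been assigned to a module in both studies.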

RESULTS

Gene Co-Expression Networks

A total of 33 consensus co-expression modules were detected (Supplementary Figure S3). All 33 modules were highly preserved in both the NISC1 and NISC2 data sets (Zsummary>10; Supplementary Figure S4). The number of genes in each module varied from 74 to 2766, with an average of 446 (Supplementary Table S1). Comparison with previously published data (Oldham et al, 2008) showed good agreement between these modules and those detected in microarray data from human cortex in individuals without psychiatric illness (Supplementary Table S2). Twenty-nine of 33 modules significantly overlapped with modules identified in the Oldham et al (2008) study. This demonstrates that WGCNA can detect robust modules of co-expressed genes across a range of data types and individuals.

Association with BD

Of the 33 preserved modules, 21 were selected for downstream analysis because their eigengene values were correlated with BD in the same direction in both NISC1 and NISC2 (Table 1; Supplementary Table S3 contains the module membership (kME) values for all genes in these 21 modules). The observed association with BD was not explained by differences in age or sex, because samples were matched on these variables. The observed associations were also not explained by differences in known biological (smoking and cause of death) or technical (RNA-integrity number and sequencing depth) covariates (Supplementary Table S4).
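The selection rule above (same direction of correlation in both data sets) can be sketched as follows. This is only an illustration: it assumes Pearson correlation between each module eigengene and a 0/1 case indicator, which the text does not state, and the function names and toy values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def consistent_modules(eigengenes_1, status_1, eigengenes_2, status_2):
    """Keep modules whose eigengene-BD correlation has the same sign
    in both data sets. Returns {module: (r_dataset1, r_dataset2)}."""
    kept = {}
    for module, values_1 in eigengenes_1.items():
        if module not in eigengenes_2:
            continue
        r1 = pearson(values_1, status_1)
        r2 = pearson(eigengenes_2[module], status_2)
        if r1 * r2 > 0:  # same direction in both data sets
            kept[module] = (r1, r2)
    return kept

# Toy eigengene values (hypothetical); status 1 = BD case, 0 = control.
status = [1, 1, 0, 0]
dataset_1 = {"green": [-0.5, -0.4, 0.4, 0.5], "blue": [0.5, 0.4, -0.4, -0.5]}
dataset_2 = {"green": [-0.3, -0.6, 0.5, 0.4], "blue": [-0.5, -0.4, 0.4, 0.5]}
print(sorted(consistent_modules(dataset_1, status, dataset_2, status)))
```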

Table 1 Twenty-one Gene Co-Expression Modules that Showed Consistent Correlation with Bipolar Disorder (BD)

Eleven of the 21 modules were associated with BD at false discovery rate <0.05 (Table 1). The eigengene values for each of these 11 modules are depicted in Figure 1 as a heatmap. This shows that most of the genes within each of five modules (dark turquoise, green, turquoise, dark orange, and red) were downregulated in most of the BD cases we studied, compared with controls. Most of the genes in the remaining six modules (royal blue, sky blue, light yellow, dark grey, purple, and yellow) were upregulated in BD. The module assignments for all genes in the 11 BD-associated modules are depicted in Supplementary Figures S5 and S6.
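For readers unfamiliar with false discovery rate control, the thresholding step can be sketched as below. The text does not name the procedure used, so the Benjamini-Hochberg method here is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values are monotone non-decreasing in the raw p-value.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Toy per-module p-values (hypothetical, not from Table 1).
module_pvalues = {"green": 0.001, "red": 0.004, "blue": 0.03, "pink": 0.6}
names = list(module_pvalues)
adjusted = benjamini_hochberg([module_pvalues[name] for name in names])
significant = [name for name, q in zip(names, adjusted) if q < 0.05]
print(significant)
```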

Figure 1

Eigengene heatmap. The x-axis shows the modules and the y-axis shows the samples. Red indicates negative, and green indicates positive eigengene values.

Overlap with DE Genes in RNA-seq Data

To assess whether the results of the WGCNA analysis agreed with those of our previous study, we tested the overlap between the genes in each of the 21 differentially co-expressed modules identified by WGCNA and the genes found to be DE in our previous RNA-seq study (Akula et al, 2014). There was a significant overlap in 10 modules (hypergeometric p-value <0.05), and only 1 module contained no genes previously identified as DE (Table 1). This shows that the WGCNA analysis largely agrees with single gene expression analysis but also identifies additional genes.

Modules Enriched with GWAS Genes (GWAS Enrichment)

In order to distinguish modules containing genes that may have a causal role in BD from those whose differential co-expression may be a consequence of BD or its treatment, we tested genes within each of the 21 modules for evidence of association with BD in previous GWAS. Eight modules were significantly enriched for GWAS-implicated genes (permutation p-value<0.05; Table 1), consistent with a causal role in BD. The remaining modules showed no evidence of GWAS enrichment.
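A permutation p-value of this general kind can be sketched as follows. This is a generic illustration, not the procedure used in the study, whose details are not given here: it assumes that random gene sets of the same size as the module are drawn uniformly from a background list, and it makes no correction for gene size. All names and inputs are hypothetical.

```python
import random

def permutation_enrichment_pvalue(module_genes, gwas_genes, background_genes,
                                  n_perm=1000, seed=0):
    """Empirical p-value for enrichment of GWAS-implicated genes in a module.

    Compares the observed number of GWAS genes in the module with the
    number found in n_perm random gene sets of the same size drawn from
    the background. The module is assumed to be a subset of the background.
    """
    rng = random.Random(seed)
    module = set(module_genes)
    gwas = set(gwas_genes)
    background = sorted(set(background_genes))
    observed = len(module & gwas)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(module))
        if sum(gene in gwas for gene in sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy gene identifiers (hypothetical).
background = [f"gene{i}" for i in range(200)]
gwas = background[:20]
module = background[:10] + background[100:120]
print(permutation_enrichment_pvalue(module, gwas, background))
```

Because long genes contain more SNPs and are more likely to carry a low p-value by chance, a uniform draw like this one can overstate enrichment for modules rich in long genes.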

As GWAS enrichment analyses can be biased by gene size, we repeated the analyses with MAGENTA (Segre et al, 2010), which takes gene-length bias into account. Similar results were obtained (Supplementary Table S5). The red and green modules were significantly enriched, and the purple module showed a trend toward enrichment (p<0.1), while the dark turquoise module was not significant in the MAGENTA analysis.

Functional Gene Set Enrichment Analysis

A critical question in this study concerns the potential functional relationships among the implicated genes. In order to explore this question, we performed gene set enrichment analysis in each of the eight gene modules that were consistent with causal involvement in BD. Three of the eight ‘causal’ modules (green, red, and salmon) yielded several significantly enriched GO terms, including: cell-cell signaling, PSD, ion transport, synapse, regulation of transcription, and passive transmembrane transporter activity (Table 2). A few specific GO terms, such as PSD, synapse, cation channel activity, and ribosomal subunit were strikingly (>2.5-fold) enriched, whereas most of the other GO terms showed ~twofold enrichment. (Gene significance and module membership values for all the genes in the green module are shown in Supplementary Figures S7 and S8.) Of the remaining 13 modules that were not enriched for GWAS signals, 7 modules (blue, dark olive green, dark red, pale turquoise, royal blue, turquoise, and yellow) yielded significant enrichment for particular GO terms. These included zinc ion binding, defense response, immune system development, response to wounding, proteolysis, and carboxylic acid binding, among others. Most of the immune-related GO terms showed >fivefold enrichment. The enriched GO terms, along with the genes and their respective p-values, are given in Supplementary Table S6. The top 25 genes in each of the modules that yielded significant functional enrichment results are shown in Figure 2.
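The fold-enrichment values quoted above can be read as a ratio of observed to expected proportions. The sketch below uses the common definition (fraction of module genes carrying a GO term divided by the fraction in the background); the exact definition used by the enrichment tool in this study is not stated, so this is an assumption, and the numbers are hypothetical.

```python
def fold_enrichment(module_hits, module_size, background_hits, background_size):
    """Ratio of the annotated fraction in a module to that in the background."""
    observed_fraction = module_hits / module_size
    expected_fraction = background_hits / background_size
    return observed_fraction / expected_fraction

# Toy counts (hypothetical): 10 of 100 module genes carry the term,
# compared with 200 of 10000 background genes.
print(round(fold_enrichment(10, 100, 200, 10000), 2))
```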

Table 2 Functional Enrichment of Genes in Co-Expressed Modules

Figure 2

Gene co-expression networks in bipolar disorder. The top 25 genes in each of the 10 modules correlated with bipolar disorder are illustrated. The thickness of the grey lines is proportional to the (absolute) magnitude of the observed gene-gene correlations. Colors correspond to those in Table 2.

RNA-seq Co-Expression Networks Agree with Those Implicated in a Published Microarray-Based WGCNA Analysis

Chen et al (2013a) found 23 co-expression modules that were associated with BD in multiple microarray datasets. We calculated the extent of overlap between these modules and those we found in the RNA-seq data. There was a highly significant overlap between the co-expressed modules in both studies (Supplementary Table S7). Several modules had corrected hypergeometric p-values<0.05, indicating the high reproducibility of the co-expressed gene network structure, even though fewer genes can be detected by microarray.

BD-Associated Co-Expression Networks do not Show Expression Signatures for Specific Cell Types

Comparison with the Oldham et al (2008) data found evidence of significant gene overlap between several modules and genes characteristic of oligodendrocytes, astrocytes, microglia, neurons, glutamatergic neurons, and synaptic proteins (Supplementary Table S2). The strongest gene overlaps (>50%) were observed for modules characteristic of oligodendrocytes.

DISCUSSION

To our knowledge, this is the first study to perform WGCNA analysis on RNA-seq data of the complete brain transcriptome in BD cases. These results provide a high-resolution account of the interacting gene networks in brain that are involved in BD. By incorporating the GWAS signals, we have attempted to distinguish gene modules that have a causal role from those that appear to be a consequence of BD or its treatment. The preserved, differentially co-expressed, GWAS-enriched modules point toward a number of biological pathways as important factors in the pathogenesis of BD. Three modules were associated with BD and enriched for DE genes and BD GWAS signals. Of these, the green module showed a striking (4.6-fold) enrichment for genes involved in the PSD (Figure 3). This finding is noteworthy in light of the several prior studies that have implicated the PSD in BD and other neuropsychiatric disorders (el-Mallakh and Wyatt, 1995; Kristiansen and Meador-Woodruff, 2005; Beneyto and Meador-Woodruff, 2008; Pennington et al, 2008; Network and Pathway Analysis Subgroup of Psychiatric Genomics Consortium, 2015). Our results provide independent support for those findings and suggest that genes involved in the PSD are a key factor in the pathogenesis of BD. If replicated in larger samples, our results could point toward new therapeutic targets for BD among the numerous proteins active in the PSD (Feng and Zhang, 2009).

Figure 3

Post-synaptic and ion channel genes in the green module. The postsynaptic density figure has been adapted from Feng and Zhang (2009). Proteins enclosed in red circles are encoded by one or more genes assigned to the ‘green’ module in the present study. CaCh: CACNA1E, CACNA1G, CACNB1, CACNG3; CAMK: CAMK2A, CAMK4; N-cadherin: CDH12, CDH8; PSD95: DLGAP3, DLGAP4; Ephrin: EFNB3, EPHA4, EPHA6; AMPAR: GRIA1, GRIA2, GRIA3; NMDAR: GRIN2B; Homer: HOMER1; Kalirin: KALRN; KCh: KCNA4, KCNB2, KCNC3, KCNG1, KCNH5, KCNIP3, KCNQ2, KCNQ4, KCTD1; Densin-180: LRRC7; PDZ: PDZD2; Ras signaling: RAP2B; SH3: SH3KBP1, SH3PXD2A; Shank: SHANK1, SHANK2.

This study has several limitations. The sample size was relatively small compared with the samples used in previous microarray studies. Nevertheless, we were able to successfully replicate many of the published findings and extend those findings to non-coding genes and previously undiscovered functional gene networks. This reflects the precision and wide dynamic range of high-depth, RNA-seq-based transcriptome data (Iancu et al, 2012, 2014; Zhang et al, 2014). Given the small sample size, subtle biological or technical biases cannot be ruled out. As is inherent in ‘omics’ studies where the number of variables far exceeds the number of subjects, the results should be cautiously interpreted until replicated in a larger dataset (Bild et al, 2014). Some of the gene modules that were associated with BD in this study did not reveal recognized functional pathways. This may reflect the limitations inherent in analyses that depend on known relationships between genes in the published literature. This limitation will diminish as more empirical gene–gene relationships are revealed. Other limitations of this study include the focus on only one brain region and lack of cellular resolution. However, a comparison with published cell-specific gene expression signatures (Oldham et al, 2008) suggests that several cell types contribute to these results.

The major strength of this study is the ability to integrate expression data in groups of genes with risk allele data from GWAS. We identified 11 modules that were associated with BD, 4 of which were significantly enriched with GWAS variants. The results for GWAS enrichment were further validated using MAGENTA (Segre et al, 2010), which supported significant GWAS enrichment among genes in the green and red modules. A significant enrichment of miR-137 targets among genes in the red module was observed. A potential causal role for the remaining modules cannot be ruled out, however, because genes in those modules might harbor rare single nucleotide or copy number variants that would not be detectable by GWAS.

As this study relies on the observed co-expression of both protein-coding and non-protein-coding genes in the brain, rather than on genetic relationships apparent in the published literature, it offers a more unbiased account of genetic relationships in the brain. For example, the analysis was able to detect several non-protein-coding genes, such as lncRNAs, that so far are not well understood in the context of brain function but seem to have a key role in tying together otherwise disparate sets of genes involved in BD. Recent research has shown that lncRNAs are highly conserved and have an essential role in synapse formation (Bernard et al, 2010; Voineagu et al, 2011). However, the results resonate broadly with a recent report of abnormal activation of peripheral blood monocytes and lymphocytes in BD (Gumieiro et al, 2010).

This study has produced results that are overall consistent with prior findings, but that also implicate novel genes and biological pathways that may contribute to the risk for BD. The integration of RNA-seq-based gene expression data with GWAS data highlights potentially important differences between gene co-expression networks that contain genes harboring risk alleles and those that do not. The current findings may also be valuable for helping to interpret the results of future studies of rare variation in BD. Integration of GWAS with gene co-expression data is a promising approach to better understand the mechanisms of highly genetically heterogeneous neuropsychiatric disorders.

FUNDING AND DISCLOSURE

NA, KHC, and FJM declare no conflict of interest. Over the past three years, JRW has received compensation from F. Hoffmann-La Roche AG, Pfizer Inc., and Nestlé Health Science.