Background

There is a growing body of evidence that natural antisense transcripts (NATs) play important regulatory roles in various biological processes. NATs are usually transcribed from the opposite strand of a particular gene locus, and they are thought to regulate sense gene expression [1, 2]. One of the proposed models of NAT-mediated regulation is for the antisense transcript to act as a cis-repressor of gene expression from the sense strand. For example, in early embryogenesis, transcription of the antisense genes Tsix and Air determines the fate of expression of their sense partners ** regions between Tsix and **. We also observed this characteristic for the AFAS probes, because the AFAS probe signals clearly showed positional preference relative to the sense mRNA (Figure 1C). This result indicates that AFAS probes indeed detect the positional bias of antisense transcription. Similarly, we also observed higher signals within 5' regions (Figure 1C), thus suggesting that NATs may also arise near the transcriptional start site, as previously shown for head-to-head overlapping NATs such as WT1, Sphk1, and Tsix [285). Both microarray and Northern analyses confirmed that the Acaa1b sense transcript is expressed within liver and kidney (see Additional file 5). Northern analyses were not able to detect the antisense transcript of Acaa1b from either poly(A)+ or total RNA (data not shown), but quantitative RT-PCR, ISH and microarray analyses were able to detect this transcript within the testis and kidney (see Additional file 5). This result implies that NATs detected by microarray analysis using AFAS probes are transcribed in vivo.

We also analyzed the expression of Aard (alanine- and arginine-rich domain-containing protein), which is a functionally uncharacterized gene but is known to be expressed within the adult testis and XY fetal gonad [35]. In humans, exons of AARD (also known as C8orf85) overlap with those of an unnamed uterus EST (GenBank: AK093981), whereas mouse Aard has no EST arising from the antisense strand (Figure 3B). Northern analysis confirmed that expression of the sense transcript of Aard was testis-specific (Figure 3C); however, Northern analysis of the antisense transcript showed laddered hybridization patterns for total RNA, but not for poly(A)+ RNA, isolated from all samples (Figure 3D). By comparison, both the sense and antisense transcripts (Aard-AS) were detected by ISH within a particular region of the seminiferous tubules (Figure 4A,B), thus confirming that Aard-AS is also expressed in the testis. In addition, Aard-AS was most likely located within the nucleus, whereas Aard was located within the cytoplasm (Figure 4C,D). Because ISH shows that Aard-AS is expressed in a particular region of the seminiferous tubules, we checked our microarray data on fractionated testis samples that reflected the three steps of spermatogenesis (i.e., pachytene spermatocytes, round spermatids, and elongated spermatids). We found that Aard-AS was expressed within the early period of spermatogenesis, whereas the sense transcript appeared at a later phase (Figure 4E). This finding shows that sense and antisense transcripts of Aard are transcribed exclusively and in a mutually antagonistic fashion during spermatogenesis. In addition, Aard-AS expression was detected only in the random-primed target sample, not in the oligo-dT primed target (Figure 4E), indicating that Aard-AS tends to be poly(A)-negative and nuclear-localized.

Figure 3
Northern analyses of the sense and antisense transcripts of Aard. (A) Schematic illustration of AARD and AK093989 (unannotated gene product) in humans. (B) Mouse orthologous partner, Aard. (C) Northern analysis for sense expression of Aard in NIH3T3 cells, brain (Br), liver (Li), kidney (Ki), and testis (Te). (D) Northern analysis for expression of the corresponding antisense transcript for both poly(A)+ and total RNA. Triangles indicate 28S (4710 nt) and 18S (1870 nt) ribosomal RNA.

Figure 4
Expression dynamics of Aard and Aard-AS. (A) In situ hybridization results (seminiferous tubules) for sense (Aard) and antisense (Aard-AS) transcripts. Arrow indicates the direction of spermatogenesis, as illustrated in (B). Scale bars: 100 μm. (C and D) Enlargement of the corresponding boxes in (A). Arrowheads denote the cytoplasmic signal of Aard and the nuclear signal of Aard-AS. (E) Microarray results of Aard and Aard-AS expression during spermatogenesis. Black and white bars indicate normalized signal intensity levels of sense and AFAS probes, respectively. Arrow indicates the direction of spermatogenesis (Psp, pachytene spermatocytes; Rsp, round spermatids; Esp, elongated spermatids). Fractionation of germ cells on the basis of the three stages of spermatogenesis in the mouse testis was performed as previously described [53, 54].

These data confirm that AFAS probes can detect the expression of antisense transcripts in normal tissues, and that they can also identify transcripts expressed in a tissue- and cell-type-specific manner. Detection of such expression dynamics for antisense transcripts is possible only by using an analytical platform that targets the complementary strand of annotated genes. Thus, AFAS probes, when used with appropriate biological samples and combined with other analytical modalities, can be used to discover genuine functional NATs; this is an advantage over conventional approaches that depend on publicly available cDNA data.

Detection of novel NATs differentially expressed under pathological conditions

We next checked whether AFAS probes can detect antisense transcripts in cancerous tissues. Examples of functional antisense transcripts identified in abnormal cells are those of CDKN2B, WT1, and HBA2 [8, 9, 29]. These antisense transcripts control the epigenetic status of surrounding genes by DNA methylation or histone modification and thus are thought to affect the expression of their sense partners. To confirm this notion, we applied the AFAS probe technique to 404 well-characterized genes, including oncogenes and tumor suppressors (1752 AFAS probes were successfully designed, giving 4.4 probes per gene on average). We used these probes in microarray experiments based on the GRS/A mouse strain, which frequently suffers from mouse mammary tumor virus (MMTV)-induced mammary tumors [36].

For the probes designed to detect the sense transcripts, we identified 57 genes showing differential expression. Among these, 48 were up-regulated and 9 were down-regulated within tumor regions compared with normal regions, according to a set statistical threshold (P ≤ 0.05 by Student's t-test) (Figure 5 and Additional file 6). Among the up-regulated genes in tumors, 12 genes (Pdcd6 is shown as an example in Additional file 7) showed loss of antisense expression (Figure 5A, right lower), whereas among the down-regulated genes Nr2c2 showed up-regulation of its antisense expression in an anti-correlated manner with the sense transcript counterpart (Figure 5A, left upper). These genes are reminiscent of the model in which antisense transcription may lead to the silencing of sense gene expression, as for the cyclin-dependent kinase inhibitor gene CDKN2B and its antisense counterpart [9]. These genes may be regulated through an antisense-mediated pathway.

Figure 5
Probes showing differential expression between normal and tumor regions. Differential expression between normal and tumor regions is plotted as the log-scaled P value according to Student's t-test for sense (x-axis) and antisense (y-axis) expression. In cases where the mean value of signals from normal regions was higher than that of tumors, the value was multiplied by -1. Accordingly, values higher than zero indicate up-regulation, whereas values lower than zero indicate down-regulation, in tumors. (A) Colored points denote significant changes in the expression of both the sense and antisense transcripts in tumors, in a correlated manner (blue) and in an anti-correlated manner (red). The names of the genes indicated by the colored points are listed in Additional file 6. (B) Orange and green dots indicate genes with up-regulated and down-regulated antisense expression, respectively, but no apparent change in sense transcript expression.
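The signed score plotted in Figure 5 can be illustrated with a minimal sketch. It assumes that "log-scaled P value" means -log10(P), which is consistent with positive values indicating up-regulation, and that the P ≤ 0.05 threshold is applied to both axes. The function names and category labels are hypothetical and are not taken from the original analysis.

```python
import math

def signed_log_p(p_value, mean_normal, mean_tumor):
    """Return -log10(P), negated when the normal mean exceeds the tumor mean.

    Assumption: the caption's "log-scaled P value" is -log10(P).
    """
    score = -math.log10(p_value)
    return -score if mean_normal > mean_tumor else score

def classify(sense_score, antisense_score, alpha=0.05):
    """Assign a probe pair to a Figure 5-style category (labels are illustrative)."""
    cutoff = -math.log10(alpha)
    sense_sig = abs(sense_score) >= cutoff
    antisense_sig = abs(antisense_score) >= cutoff
    if sense_sig and antisense_sig:
        same_direction = (sense_score > 0) == (antisense_score > 0)
        return "correlated" if same_direction else "anti-correlated"
    if antisense_sig:
        return "antisense up only" if antisense_score > 0 else "antisense down only"
    if sense_sig:
        return "sense only"
    return "unchanged"
```

Under these assumptions, a gene whose sense transcript is up-regulated while its antisense transcript is down-regulated in tumors falls into the "anti-correlated" category (red points in panel A), and a gene with only a significant antisense change falls into the categories shown in panel B.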

Interestingly, the expression of antisense transcripts representing 37 genes (Thbd is shown as an example in Figure 6A) was found to increase, despite the absence of changes in expression of their sense transcript counterparts (Figure 5B). We also identified down-regulated antisense transcripts corresponding to 45 genes (Drd4 is shown as an example in Additional file 7) for which there were no changes in expression of their corresponding sense transcripts. Because ISH using cancerous tissues, like microarray analysis, can detect antisense expression arising from Thbd (thrombomodulin) (Figure 6B–D), there might be more examples of genes for which antisense expression is altered in cancerous tissue but cannot be detected by microarray analysis that targets expression from the sense strand of genes.

Figure 6
In situ hybridization of the antisense transcript of Thbd. (A) Microarray results for Thbd, for which expression of the antisense transcript (red bars) has markedly changed in tumor cells but that of the sense transcript (blue bars) has not. (B-D) Results of in situ hybridization of mammary tumor tissue of GRS/A mice for detecting antisense transcription of Thbd. Scale bars: 200 μm (C); 100 μm (D).

Discussion

This paper shows that microarray probes targeting transcription from the complementary strand of known genes can identify novel NATs, which has not been possible solely on the basis of publicly available cDNA data. Recently described high-density oligonucleotide tiling-array platforms are designed to overview the transcriptional landscape of specific genomic regions at high resolution. By comparison, our platform uses multiple probes to specifically screen for transcription from the antisense strand of known genes. Many previous studies have attempted to identify NATs by DNA microarray analysis using cDNA-oriented custom microarrays or commercially available microarray platforms [37–41]. Since our microarray platform is custom-made and not commercial, it can be applied to any genes or gene loci of interest. Furthermore, our method does not introduce bias from cDNA synthesis between sense and antisense profiling because it does not require specific protocols for target cDNA synthesis for NAT detection. In addition, our microarray platform can simultaneously profile sense and antisense expression in one microarray hybridization experiment.

Many NATs detected by AFAS probes appeared only in the random-primed targets. This is in concordance with previous cDNA-based microarray profiling of NAT expression [24]. Whereas the poly(A)-plus RNA population is roughly represented by oligo-dT primed cDNAs, the whole transcriptome (including the poly(A)-minus RNA population) is represented by cDNAs synthesized with random primers. Therefore, NATs detected by our analysis tend to be poly(A)-negative. Although oligo-dT primers can anneal to internal poly(A) stretches, this is not an issue at the level of microarray-based NAT screening, because the vast majority of poly(A) stretches (approximately 90%) are located within the 3' end of the transcripts (data not shown).

By designing AFAS probes for human-mouse orthologous genes, we identified many probes showing positive signals. Two of these probes identified transcripts for which in vivo expression was confirmed. Thus, our approach may reveal more, as yet unidentified, conserved NATs; this has not been possible by conventional approaches, as previously reported using cDNA data [26, 31, 32]. Of the individually validated examples (Acaa1b and Aard), expression of Aard-AS was localized to the nucleus and was detected only in random-primed target samples. In addition, a pattern of multiple hybridized bands of different sizes was observed for the total RNA membrane, but not for the poly(A)+ RNA membrane. This observation is similar to that for previously identified antisense transcripts [24], and it is probably due to heterogeneously sized molecules of Aard-AS transcripts. Because ISH and the microarray data on other antisense transcript examples also show nuclear localization and poly(A)-avoidance (data not shown), it is possible that these features are general characteristics of the antisense transcriptome.

We also designed AFAS probes for well-characterized genes and identified several examples of correlated and anti-correlated expression between the NATs and the corresponding sense transcripts within MMTV-induced mammary tumors. We also observed differentially expressed genes for which expression of the antisense transcript had changed, whereas that of the sense transcript had not. Given that differential antisense expression might induce changes in epigenetic status, as for CDKN2B and CDKN2BAS [9], antisense transcription may cause changes in the methylation status of neighboring genes. This notion can be tested by using methylated DNA immunoprecipitation (MeDIP) and chromatin immunoprecipitation (ChIP)-on-chip analyses to further characterize the antisense transcriptome and to determine whether specific NATs function as epigenetic regulators. Whereas this study revealed NATs specific to mouse tumors, human clinical samples have also been analyzed to screen for novel NATs by the same methodology; this new study has identified many antisense transcripts showing increased or decreased expression in human colon cancer tissues compared with controls (Saito R., Kohno K., Okada Y., Osada Y., Numata K., Watanabe K., Nakaoka H., Yamamoto N., Kanai A., Yasue H. et al., manuscript in preparation).

Although next-generation high-throughput transcriptome sequencing (RNA-seq) might replace microarray-based expression analyses, antisense transcriptome analysis by sequencing is still under development because of the laborious nature of strand-specific library construction [42]. DNA microarray-based profiling makes it possible to gain a detailed view of specific genes or gene loci and can also provide expression profiles of both poly(A)-plus and poly(A)-minus RNAs.

Conclusion

We showed here that probes targeting the complementary strand of annotated genes successfully identify novel NAT expression, including NATs whose expression is altered in a tissue- or tumor-specific manner. The results suggest that there are more examples of NATs that cannot be collected from public cDNA sources. Further functional investigation is required for such dynamically expressed NATs, and the use of microarray platforms targeting both strands of the gene locus will help to narrow down the proper candidates for further functional analyses.

Methods

Custom microarray construction

The AFAS probes for detecting NATs were designed to detect antisense transcription originating from genes categorized into four groups: (1) 48 genes in which antisense transcription has been previously reported and 87 imprinted genes in mice, (2) 404 selected well-annotated genes, (3) orthologous genes in NAT loci (detailed definition given below), and (4) randomly selected genes for which there were no cDNA, EST, or CAGE tags in the antisense orientation. For categories (1) and (2), the AFAS probes were designed to correspond to every 500 bases of the antisense strand of the exonic regions of each gene. For category (3), the AFAS probes were designed to correspond to a single specific sequence in each transcript. For category (4), two AFAS probes were designed per transcript. Target region selection for the probe design is summarized in Additional file 8. All probes were computationally designed by using the OligoWiz program [43] and were used in the Agilent 44K custom oligoarray platform for single-color microarray analysis.
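The tiling scheme for categories (1) and (2) can be sketched as follows. This is only an illustration of splitting the antisense strand of exonic regions into 500-base target windows. Treating each exon separately is an assumption made here, and the actual probe selection within each window was done with OligoWiz, which is not reproduced. The function names are hypothetical.

```python
# Translation table for complementing DNA bases (N is kept as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement, i.e. the antisense-strand sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def antisense_windows(exon_seqs, window=500):
    """List candidate target windows on the antisense strand of each exon.

    exon_seqs: exon sequences given on the sense strand.
    Returns (exon_index, offset, window_sequence) tuples; the last window
    of an exon may be shorter than `window`.
    Assumption: exons are tiled one by one rather than as a spliced sequence.
    """
    windows = []
    for index, exon in enumerate(exon_seqs):
        antisense = reverse_complement(exon)
        for offset in range(0, len(antisense), window):
            windows.append((index, offset, antisense[offset:offset + window]))
    return windows
```

A probe designed inside one of these windows has the same sequence as the sense strand of the gene, so it hybridizes only to labeled targets derived from antisense transcripts.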

Target sample preparation for the microarray analysis

Total RNA for the mouse (C57BL/6J) microarray experiments was isolated from NIH3T3 cells (fibroblast cell line), SL10 cells (fibroblast cell line), brain, heart, intestine, kidney, liver, lung, placenta (d.p.c. 10.5 and 13.5), spleen, stomach, testis, and thymus. Testis was from C57BL/6J males (8 to 10 weeks), placenta was from pregnant mice, and the other tissues were from both male and female mice. Nuclear and cytoplasmic fractionation of NIH3T3 cells was carried out according to the Protein and RNA Isolation System (PARIS) instructions (Ambion Inc.). For the microarray analysis of murine mammary tumors, RNA samples were collected from normal and cancerous mammary glands of dissected GRS/A mice [36].

Data processing and accessibility

Processed signal values (gProcessedSignal) from the Agilent Feature Extraction file were used as representative expression levels for each probe within the array. If a spot had an intensity value lower than five, or if there was no prominent difference between foreground and background signals, then the intensity value was adjusted to five and the corresponding probe was treated as an "absent probe". To normalize the signal intensity distribution between multiple arrays, the overall mean signal of every hybridization experiment was adjusted to that of the data from SL10 cells obtained by oligo-dT priming. Probes with intensity values lower than five, as well as those flagged as "saturated", were excluded from the inter-array normalization step. Tissue specificity of the expression signals was evaluated according to the τ measurement [44]. The raw data from the microarray analyses were deposited in the NCBI Gene Expression Omnibus (GEO) under accession number GSE14568 [45]. Expression data as well as a simplified genomic structure can be accessed via an originally constructed viewer [46].
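The processing steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the original pipeline. The foreground/background check is represented by a precomputed flag, the scaling factor is applied to all probes of an array, and τ is assumed to be the commonly used tissue-specificity index τ = Σ(1 − x_i/x_max)/(N − 1) computed on the signal profile across N samples. All function names are hypothetical.

```python
FLOOR = 5.0  # intensity floor stated in the text

def floor_signals(signals, low_contrast=None):
    """Set weak spots to FLOOR and flag them as absent.

    low_contrast: optional flags for spots without a prominent
    foreground/background difference (how this is judged is not modeled).
    """
    if low_contrast is None:
        low_contrast = [False] * len(signals)
    absent = [s < FLOOR or lc for s, lc in zip(signals, low_contrast)]
    adjusted = [FLOOR if a else s for s, a in zip(signals, absent)]
    return adjusted, absent

def scale_to_reference(signals, absent, reference_mean, saturated=None):
    """Scale one array so the mean of its usable probes equals reference_mean.

    Absent and saturated probes are left out when computing the mean;
    applying the factor to every probe afterwards is an assumption.
    """
    if saturated is None:
        saturated = [False] * len(signals)
    usable = [s for s, a, sat in zip(signals, absent, saturated) if not (a or sat)]
    if not usable:
        raise ValueError("no usable probes for normalization")
    factor = reference_mean / (sum(usable) / len(usable))
    return [s * factor for s in signals]

def tau(profile):
    """Tissue-specificity index: 0 for uniform expression, 1 for one tissue only."""
    x_max = max(profile)
    if x_max <= 0 or len(profile) < 2:
        return 0.0
    return sum(1.0 - x / x_max for x in profile) / (len(profile) - 1)
```

With this definition, a probe detected in a single tissue gives τ = 1, whereas a probe with identical signals in all tissues gives τ = 0. Because absent probes are set to the floor value rather than zero, τ for real profiles stays slightly below 1.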

In silico identification of orthologous genes in NAT loci

To identify orthologous genes in NAT loci (Figure 2), we initially performed in silico identification of sense-antisense pairs by the same procedures as previously published [47], by using the latest full-length cDNA collections [33, 48], NCBI RefSeq mRNA [49] and the UniGene collection [50]. This identified 3524 and 5351 exon-overlapping sense-antisense pairs in humans and mice, respectively. Genomic synteny data between human and mouse (defined by BLASTZ derived from UCSC [51]) were then used to determine whether each identified pair was located within a syntenic region between the two species. Those pairs located within the syntenic regions were retained for validation of the orthologous relationship. The orthologous relationship between the genes located within the syntenic regions was defined according to the orthologous gene table from the BioMart Project [52]. Finally, 648 genes were identified as orthologous genes for which a NAT was identified in human cDNAs but not in mouse cDNAs. AFAS probes were successfully designed for 635 of these 648 genes.
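The selection logic of this section can be sketched as follows. The sketch is illustrative only. The published pair-identification procedure [47], the UCSC synteny data, and the BioMart orthology table are not reproduced, and the data structures and function names are hypothetical. It assumes that transcripts are given as exon intervals with half-open coordinates and that orthology is supplied as a simple human-to-mouse mapping restricted to syntenic regions.

```python
from typing import NamedTuple, Tuple

class Transcript(NamedTuple):
    name: str
    chrom: str
    strand: str                          # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]   # half-open (start, end) intervals

def exons_overlap(a, b):
    """True if any exon of a shares at least one base with any exon of b."""
    return any(s1 < e2 and s2 < e1
               for s1, e1 in a.exons
               for s2, e2 in b.exons)

def is_sense_antisense_pair(a, b):
    """Exon-overlapping pair on opposite strands of the same chromosome."""
    return a.chrom == b.chrom and a.strand != b.strand and exons_overlap(a, b)

def human_only_nat_genes(human_nat_genes, mouse_nat_genes, orthologs):
    """Human genes with a NAT whose mouse ortholog has no known NAT.

    orthologs: dict mapping human gene -> mouse gene for syntenic pairs.
    """
    return sorted(h for h in human_nat_genes
                  if h in orthologs and orthologs[h] not in mouse_nat_genes)
```

In this toy representation, the genes returned by `human_only_nat_genes` correspond to the candidates for which AFAS probes would be designed against the mouse ortholog.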

Northern hybridization analyses

RNA from mouse tissues (C57BL/6J, 8 to 10 weeks, male and female mixed) and from NIH3T3 cells was isolated by using Trizol reagent (Invitrogen Corporation). Northern analyses were performed as previously described [24]. Loading of equal amounts of RNA samples was confirmed by visualization of ethidium bromide-stained RNA in the gel. Probes specific for sense and antisense transcripts of Acaa1b (NM_146230), Aard (NM_175503), and Thbd (NM_009378) were amplified by PCR (see Additional file 9). All the probe sequences contained their corresponding microarray probe sequences. cDNA fragments were cloned into the pGEM-T Easy Vector (Promega Corporation), and strand-specific cRNA was prepared for hybridization.

In situ hybridization

Probes specific for sense and antisense transcripts of Acaa1b (NM_146230), Aard (NM_175503), and Thbd (NM_009378) were amplified by PCR (see Additional file 9). All the probe sequences contained their corresponding microarray probe sequences. The amplified fragments were sub-cloned into the pGEM-T Easy Vector (Promega) and used to generate sense or antisense RNA probes. Paraffin-embedded testis sections (6 μm) of normal adult mouse (C57BL/6 mouse, male, 8 weeks) were obtained from Genostaff Co., Ltd. For in situ hybridization, the sections were hybridized with digoxigenin-labeled RNA probes at 60°C for 16 h. The bound label was detected using NBT-BCIP, an alkaline phosphatase color substrate. The sections were counterstained with Kernechtrot (Muto Pure Chemicals Co., Ltd.). The probe sequence for the negative control experiment was selected from an Oryza sativa putative leaf protein (NM_197207) (see Additional files 5 and 10).

Real-time quantitative RT-PCR

cDNA was initially synthesized with gene-specific reverse primers (Acaa1b-AS and Gapdh) from selected tissue RNA (brain, testis, kidney, and liver) and then subjected to quantitative RT-PCR. Gene expression levels were normalized to Gapdh. Primers are listed in Additional file 11.
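The normalization formula is not stated in the text. One common way to normalize a target to a reference gene is the ΔCt calculation, sketched here as an assumption. It presumes threshold-cycle (Ct) values as input and perfect doubling per cycle, and the function name is hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of the target relative to the reference gene (e.g. Gapdh).

    Assumption: delta-Ct method with 100% amplification efficiency,
    so each cycle of difference corresponds to a two-fold change.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

For example, a target that crosses the threshold five cycles later than Gapdh would be reported at 2^-5 of the Gapdh level under these assumptions.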