Introduction

Ubiquitin-specific protease 4 (USP4) is a deubiquitinating enzyme that can edit or remove ubiquitin chains of various topologies. USP4 can remove both degradative K48- and regulatory K63-linked ubiquitin chains from a growing list of protein targets that include key players in a number of signal transduction pathways. Its substrates include the transforming growth factor-β (TGF-β) I receptor1,2 and the TGF-β-activated kinase 1 (TAK1)3, Wnt/β-catenin pathway transcription factor Nemo-like kinase (NLK)4, the p53 antagonist ARF-BP15, anti-viral response mediator RIG-I6, TNFα/NF-κB inflammatory pathway mediators TRAF-2 & -6

Figure 1

Schema of Usp4 exonic properties and potential mechanisms of exon skipping.

(A) Proposed alternative mechanisms for skipping of exon 7 in USP4: (i) differential splice site strengths (ii) long flanking introns (B) Usp4 exon structure relative to encoded protein domains. D1 and D2 form the cysteine protease catalytic domain that effectuates ubiquitin cleavage. The regulatory DUSP-UBL1 domain physically interacts with the unstructured Insert region and with DUSP-UBL1 domains of other USP4 monomers leading to dimerization. (C) USP4 protein sequence entropy. USP4 exon-retained isoform amino acid sequences for 62 species were aligned using MUSCLE51. Shannon entropy was calculated using DAMBE18,19,20.

A number of exon skipping mechanisms have been proposed, including:

  1. differential strength of splice signals leading to alternative splice site selection (reviewed in Keren et al.21).

  2. excessively long flanking introns22,23,24,25,26,27.

  3. intermixture of major and minor class introns28.

As USP4 does not have minor class introns, we chose to examine the first two mechanisms of E7 skipping, as illustrated in Fig. 1A(i,ii). First, if an exon is functionally important, then flanking sequences recognized by splicing factors will be under strong purifying selection to maintain proper spliced end-products. If the inclusion of E7 is unimportant, then its proximal splice sites (3′SS6, 5′SS7 and BPS6) may experience greater sequence drift, leading to weakening of these sites relative to 5′SS6, 3′SS7 and BPS7 and exon 7 skipping (Fig. 1A(i)). Selection can however also act to preserve differential splice signal strengths in order to enable the production of two isoforms. In short, strong 5′SS6, 3′SS7 and BPS7 relative to 5′SS7, 3′SS6 and BPS6, resulting from passive drift or active selection, could permit E7 skipping. Second, exons flanked by long introns are known to be intermittently skipped. As such, the alternative inclusion of E7 in USP4 mRNA could be explained by appropriately long flanking introns (Fig. 1A(ii)). We evaluated these alternative hypotheses by constructing a computational evolutionary framework to characterize USP4 splicing patterns in multiple vertebrate lineages and tested our in silico predictions in the laboratory to gauge whether the two isoforms of USP4 observed in humans have discrete functional and/or regulatory roles or whether they are the by-product of reduced selection for E7 retention.

Results

Bioinformatic analysis

To study exon-intron architecture and splice signal conservation, we downloaded GenBank sequences for well-annotated USP4 sequences for 62 vertebrate species covering major vertebrate taxa (see Supplementary Table S1).

Sequence and length conservation of USP4 exons

Both long and short isoforms of USP4 comprise a bi-partite catalytic domain and a regulatory DUSP-UBL1 domain, where the seventh exon forms part of the unstructured flexible linker between these two (Fig. 1B). Sequence identity (Fig. 1C) and length (Fig. 1D) of USP4 exons are highly conserved for exons that encode structured functional domains and less well for exons that correspond to unstructured regions. E7 exhibits greater length variation than its neighbors. Relative to mammals, E7 of birds and fish generally encodes one and two additional amino acids respectively, while the length of E7 within each of these clades is variable; this suggests multiple indel events in USP4 during the evolutionary diversification of vertebrates. Sequence conservation among aligned USP4 exons, quantified using Shannon entropy in Fig. 1C, reveals that E7 is located within a highly variable region. The entropic nature of E7 conforms to the general pattern that alternatively spliced exons exhibit less sequence conservation30. In Fig. 2A, we have derived a more sensitive Dto3′.opt of 20–40 nt based on the locations of consensus BPS sequences (YURAY) within all 15770 introns of human chromosome 22 protein-coding genes. This slightly larger window for Dto3′.opt will reduce the false negative rate in detecting true BPSs in the neighbouring introns of E7. If a strong BPS is absent from the upstream intron (I6), a strong downstream BPS within I7 could lead to exon skipping (Fig. 1A). We therefore evaluated the relative strengths of USP4 BPS6 and BPS7 for 14 well-studied species representing the major vertebrate taxa.
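The per-site Shannon entropy used above to quantify conservation across aligned USP4 sequences can be illustrated with a minimal sketch. This is not the DAMBE implementation; the function names, the choice of base-2 logarithms, the decision to ignore gap characters and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column, ignore_gaps=True):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: gaps ('-') are dropped before counting; other tools
    may treat gaps as an extra state instead.
    """
    residues = [c for c in column if not (ignore_gaps and c == "-")]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # H = -sum(p * log2 p); a fully conserved column gives 0 bits
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Hypothetical four-sequence alignment, not USP4 data
toy_alignment = ["MSTA", "MSSA", "MTGA", "MS-A"]
print(alignment_entropy(toy_alignment))
```

Columns with higher values are more variable, which is the sense in which E7 is described as lying in a high-entropy region.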

Figure 2

Characterization of relative BPS strengths.

(A) Distribution of location of strong BPS sequences (“YURAY” motif with single downstream AG dinucleotide corresponding to the 3′SS) in all introns of human chromosome 22. The X-axis depicts the last 200 nt of the introns. Introns shorter than 200 nt were excluded to avoid statistical bias. Dto3′.opt is 20–40 nt. (B) Locations of YURAY sequences within 50nt of 3′ ends of introns 6 and 7 of USP4 in 14 vertebrate species.

The location of YURAY motifs in I7 (Fig. 2B) suggests a strong BPS7 among all studied mammalian species. A strong BPS7 is also present in I7 of zebrafish (Danio rerio), but is missing in the frog (Xenopus tropicalis), the chicken (Gallus gallus) and the Chinese turtle (Pelodiscus sinensis). The latter two species retain candidate YURAY motifs which can be eliminated from consideration due to their greater distance from the 3′SS (73nt for the chicken, 85 nt for the Chinese turtle) and to the presence of a downstream non-3′SS AG (non-3′SS AG dinucleotides between a BPS and a 3′SS AG are avoided as the first AG following the branch-point is generally used as the 3′SS AG).

YURAY motifs are absent from the 40 last nucleotides of USP4-I6 in all mammalian species. Taken together, mammalian species in general have a weak BPS6 and a relatively strong BPS7. This lends plausibility to the first hypothesis from Fig. 1A, wherein E7 splicing results from alternative pairing of 5′SS6 and BPS7 (E7 skipping) or of 5′SS7 and BPS7 (E7 inclusion). In contrast, non-mammalian species tend to have a strong BPS6 (except for the Chinese turtle). YURAY motifs are located near Dto3′.opt in I6 of the chicken (Dto3′ = 47 nt) and zebrafish (Dto3′ = 16 nt) and frog (three YURAY motifs with Dto3′ = 36, 42 and 53 nt, respectively). This suggests that E7 skipping should be less likely in non-mammalian than in mammalian species, though the Chinese turtle may be an exception.
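The branch-point search described above (YURAY motifs near the 3′ end of an intron, with no intervening AG before the 3′SS AG) can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code. In particular, measuring Dto3′ from the branch-point adenosine (the fourth position of the motif) to the last intron nucleotide is an assumption, and the function name and toy intron are hypothetical.

```python
import re

# Y = C or U/T, R = A or G; lookahead allows overlapping motifs to be found
YURAY = re.compile(r"(?=([CT]T[AG]A[CT]))")


def find_bps_candidates(intron, window=(20, 40)):
    """List YURAY motifs in an intron with their distance to the 3' end.

    Assumptions: the intron is given 5'->3' and ends with the 3'SS AG;
    Dto3' counts the nucleotides downstream of the branch-point A.
    """
    seq = intron.upper().replace("U", "T")
    hits = []
    for m in YURAY.finditer(seq):
        start = m.start()
        branch_a = start + 3                  # index of the A in YURAY
        d_to_3 = len(seq) - branch_a - 1      # nt after the branch-point A
        downstream = seq[start + 5:]          # everything after the motif
        # A usable BPS should see only one AG downstream: the 3'SS itself
        single_ag = downstream.endswith("AG") and downstream.count("AG") == 1
        hits.append({
            "motif": m.group(1),
            "d_to_3": d_to_3,
            "in_window": window[0] <= d_to_3 <= window[1],
            "single_downstream_ag": single_ag,
        })
    return hits


# Hypothetical intron: 5'SS, filler, one CUAAC motif, pyrimidine tract, AG
toy_intron = "GTAAGT" + "G" * 20 + "CTAAC" + "T" * 10 + "C" * 5 + "T" * 8 + "AG"
for hit in find_bps_candidates(toy_intron):
    print(hit)
```

A motif that falls outside the 20–40 nt window, or that is followed by an extra AG before the 3′SS, would be discounted in the same way the chicken and Chinese turtle I7 candidates are discounted in the text.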

Relative strengths of proximal and distal splice signals

In addition to a weak BPS6 relative to BPS7, a stronger 5′SS6 than 5′SS7 and/or stronger 3′SS7 than 3′SS6 would favor E7 skipping (Fig. 1A(i)). A position weight matrix (PWM) measures site-specific nucleotide usage bias in a motif alignment, where the consensus motif typically has the highest PWM score (PWMS). PWMSs are routinely used to characterize the signal strength of splice sites31,32.

A PWM for the 5′SS sequences of human chromosome 22 introns (Table 1) shows a consensus 5′SS consistent with what has been documented in the literature, i.e., a core motif of AG|GUAAGU, where “|” indicates the exon-intron junction. PWMs derived from zebrafish and chicken chromosomes are almost identical to the human 5′SS matrix in Table 1 (upper panel). Thus, this PWM can be used to generate PWMSs as comparable measures of signal strength at 5′SS6 and 5′SS7 from the 14 representative vertebrate USP4 sequences, where significantly larger scores indicate stronger splice signal strength. Given Table 1 (upper panel), the maximal PWMS is 16.9. PWMSs are consistently larger for 5′SS6 than 5′SS7 for the ten mammalian USP4 sequences (Table 1 (lower panel)), with the mean PWMS being 9.2314 for 5′SS6 and 5.455 for 5′SS7 (t = 19.759, df = 11, p < 0.0001, paired-sample t-test). This lends support for the scenario depicted in Fig. 1A(i), where a stronger 5′SS6 favors E7 skipping in mammals. In contrast, for chicken and zebrafish, PWMS6 is smaller than PWMS7, while Pelodiscus sinensis again conforms to the mammalian pattern (PWMS6 > PWMS7).

Table 1

There were no significant differences in flanking 3′SS PWMSs (as might be expected since 5′SS and BPS are most important for determining exon-intron boundaries33).
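To make the PWMS comparison concrete, the sketch below builds a log-odds PWM from a set of aligned 5′SS sequences and scores a candidate site. It is only an illustration of the general technique: the Table 1 matrix is not reproduced here, the training sites are hypothetical, and the uniform background frequencies and pseudocount of 0.5 are assumptions that need not match the settings used in DAMBE.

```python
import math

BASES = "ACGU"


def build_pwm(sites, background=None, pseudocount=0.5):
    """Build a log2-odds position weight matrix from aligned sites.

    Assumptions: uniform background (0.25 per base) unless given, and a
    fixed pseudocount added to every cell to avoid log(0).
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def pwm_score(pwm, site):
    """Sum the per-position log-odds weights for one candidate site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(col[base] for col, base in zip(pwm, site))


# Hypothetical 5'SS training set around the AG|GUAAGU core (not real data)
toy_sites = ["AGGUAAGU", "AGGUAAGU", "AGGUGAGU",
             "CGGUAAGA", "AGGUAAGU", "UGGUAUGU"]
pwm = build_pwm(toy_sites)
print(pwm_score(pwm, "AGGUAAGU"))  # consensus site
print(pwm_score(pwm, "GGGUGAGA"))  # weaker site with suboptimal -2 and +6
```

Scoring 5′SS6 and 5′SS7 of each species against the same matrix is what allows the two sites to be compared on one scale, as in the lower panel of Table 1.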

Length of the introns flanking exon 7

Exons flanked by long introns tend to be skipped during the splicing process22,25, which suggests that the potential contribution of the long I6 to E7 skipping in mammals should not be ignored.

Figure 3

Length of USP4 introns for 62 bird, fish and mammal species.

log10 scale used for y-axis. Intron 6 is highly variable in length.

Experimental tests of alternative hypotheses

Determination of splicing mechanism

Although our results are consistent with the hypothesis that E7 skipping results from BPS7 strengthening relative to BPS6 and 5′SS6 stronger than 5′SS7, they do not exclude the possibility that the longer intron I6 or other potentially complicating factors in USP4 pre-mRNA may contribute to E7 skipping. To test the hypothesis that E7 skipping is due to differential splice signals at 5′SS and BPS, we created a minigene construct by inserting the human USP4 genomic sequence encompassing E7 (together with 75 nt at the 3′ tail of I6 and 51 nt at the 5′ end of I7) into the well characterized splicing reporter pXJ41 (the generous gift of Dr. Sushma Grellsheid, Durham University). In the resulting minigene the human E7 genomic fragment resides in the second intron of the rabbit beta hemoglobin gene (Fig. 4A). If E7 skipping is due to the long upstream I6, then we should observe no E7 skipping in this construct with short upstream intron. The minigene mimics the scenario in Fig. 1A(i) with the differential signal strength of splice sites: 5′SSa (Fig. 4A) is stronger (PWMS = 7.5358) than 5′SS7 (PWMS = 5.0898) and BPSb (Fig. 4A) is stronger than BPS6, with the former having a CUAAC (YURAY) sequence located 35 nt from the 3′ end of the intron and the latter having no YURAY at Dto3′.opt. If E7 skipping in its natural USP4 mRNA is due to such differential strength of splice signals, then we should observe E7 skipping in the minigene mRNA.

Figure 4

Mismatched flanking 5′ splice site strengths lead to alternative skipping of exon 7.

(A) Schematic of the RT-PCR assay construct. Assays were performed using oligonucleotide primers positioned upstream and downstream of the rabbit exons. (B) Exon skipping as detected in minigene RT-PCR assay. Lane labels specify transfected construct and reagents in U2OS cells. First line: −: untransfected control, P: empty plasmid (pXJ41), M: USP4-E7 minigene construct (pDG467). Second line: +: oligo dT-primed cDNA reaction, *: random-primed cDNA reaction, −: random primers, no reverse transcriptase (RT). A 200 base pair (bp) product is visible after primed reactions of pXJ41-transfected cells. Exon-retained and -skipped products are visible as 350 and 200 bp bands, respectively, in pDG467-transfected cells. Products of approximately 800 and 1500 bp were generated in pXJ41- and pDG467-transfected conditions, respectively, in the absence of RT and likely arose from amplification of DNA rather than cDNA. (C) Schematic of site-directed mutations. The wild type (WT) sequences and the boundaries of E7 (yellow box) are shown. In the branch point (BP) mutant a YURAY sequence (indicated in pink) was inserted 32 residues upstream of E7. In the splice site (SS) mutants the downstream exon-intron boundary sequence was optimized (substituted residues indicated in green). The combination mutant (BP + SS) incorporated both mutations. (D) RT-PCR analysis of site-directed mutants. Lane labels correspond to transfected mutant constructs as per Figure 4C. Exon-skipped products are absent from SS and BP + SS mutants.

When the minigene was expressed in human U2OS osteosarcoma cells, RT-PCR analysis using primers upstream and downstream of the rabbit exons revealed two isoforms of the size predicted for retention and exclusion of E7 (Fig. 4B). Similar results were obtained in the unrelated HeLa cell line (not shown). This finding excludes exon skipping as a consequence of intron length but is consistent with E7 skipping due to the differential strength of splice signals. We therefore explored the contributions of the BPS6 and 5′SS7 elements by site-directed mutagenesis of the E7 minigene construct, inserting a consensus YURAY sequence at Dto3′.opt in Ia (upstream of E7) and/or engineering an optimized 5′SS7 element in Ib (depicted as BP and SS respectively in Fig. 4C). The mutated versions of the minigene were transfected into U2OS cells and RT-PCR analysis was performed as before. Whereas the introduction of the consensus YURAY sequence had no effect on the ratio of exon retained and exon excluded products, the latter was undetectable in RNA isolated from cells transfected with minigenes in which the 5′SS7 element had been optimized (Fig. 4D).

Phylogenetic distribution of USP4 alternative splicing

In contrast to mammalian species, 5′SS6s of chicken and zebrafish USP4 are weaker than 5′SS7s, as indicated by their PWMS values (Table 1). To verify whether (as would be predicted) E7 skipping does not occur in such species, RT-PCR analysis was performed on RNA isolated from primary chick fibroblast cultures (the generous gift of Dr. J. S. Diallo, Ottawa Hospital Research Institute). Primers corresponding to sequences in exons 6 and 8 were used to detect the presence or absence of the seventh exon as depicted in Fig. 5A. We detected only the exon-retained version of the transcript (Fig. 5B). However, when the human minigene was introduced into chick embryo fibroblasts by transfection, both isoforms were detected (Fig. 5C). The absence of exon skipping in the chicken cells could thus be directly attributed to the primary sequence of the chicken USP4 pre-mRNA. Our data exclude the possibility that E7 retention occurs in the chicken as a consequence of an altered repertoire of splicing factors in avian versus mammalian cells (see Discussion). By similar logic we predict that exon skipping would not occur in the zebrafish gene; RT-PCR analysis of RNA from larval stage zebrafish (the generous gift of Dr. Marc Ekker, University of Ottawa) confirmed the presence of a single exon-retained isoform (Fig. 5D). In support of this, performing a BLASTn of USP4 exons 5–13 against the 600,432 chicken EST sequences recovered four sequences with E7 but no sequence without E7. Among the 1,488,339 zebrafish ESTs, seven have E7 but none are without E7. In contrast, searching the 8,704,868 human ESTs recovered five sequences with and seven without E7. The corresponding numbers from the 4,853,570 mouse EST sequences are 20 and 8, respectively. Our conceptual framework based on the relative strengths of 5′ splice signals thus correctly predicted splicing propensity in these model organisms, confirmed by both database and experimental analyses.

Figure 5

Exon skipping does not occur in the USP4 gene of the chicken or the zebrafish.

(A) Schematic of the RT-PCR strategy using primers specific for exons 6 and 8. (B) RT-PCR analysis of RNA from the endogenous USP4 gene in chicken. Lane labels specify addition (+) or omission (−) of RT (control) to chicken embryo fibroblast (CEF) RNA extracts. When RT was included a single PCR product of predicted size (329 bp) for the exon-retained transcript was detected. (C) RT-PCR analysis of CEF transfected with the human USP4 minigene construct. Lane labels specify transfection of pDG467 (+) or an irrelevant control plasmid (−) in CEF cells. Both the exon-retained and exon-skipped products were detected (as in human cells). (D) RT-PCR analysis of RNA from the endogenous USP4 gene in the zebrafish. Lane labels specify addition (+) or omission (−) of template (lane 2) or RT (lane 3) to larval zebrafish RNA extracts. A single amplification product was detected of the size predicted for the exon-retained cDNA (313 bp).

As is shown in Fig. 4D, the optimization of three nucleotides in the 5′ splice site downstream of USP4-E7 according to the consensus sequence, namely −3G → C, −2G → A and +6A → T, proved sufficient to eliminate exon skipping in the human USP4 minigene. Among species observed in Table 1, the nine therian mammal 5′SS6s feature optimal nucleotides −2A and +6T while suboptimal −3G, −2G and +6A penalize the 5′SS7 PWMSs of all members of this lineage. These residues are identical in the Chinese turtle and are likely thus responsible for the observed alternative E7 skipping in this distant relative. In contrast, both flanking splice sites of E7 in zebrafish feature optimal nucleotides (5′SS6: −3A, −2A, +6T; 5′SS7: −3C, −2A, +6C), which preclude E7 exclusion. Curiously, in chicken, these determinant nucleotides are identical to those of mammals which produce E7 skipping with the exception of the 5′SS6 +6N site, which is weak (+6A). The upstream and downstream 5′SSs in chicken, though weak, are equivalent and prevent exon skipping as in zebrafish. The +6N site may thus be the discriminant factor in E7 skipping propensity. To verify this, we expanded the scope of our analysis to include all sequenced genomes bearing USP4 to see whether the 5′SS mismatching (in particular +6 site mismatching) predicts splicing proclivity. While direct expansion of our analytical framework is limited by insufficient EST data and biological sample unavailability, we can infer splicing patterns from RNA-seq datasets. Similar to the methodology used for EST mining, we performed a BLASTn of available RNA-seq data from the Sequence Read Archive (SRA) using the USP4 coding sequence with E7 removed as a query. In the absence of hits crossing the exon 6–8 boundary for multiple, sufficiently large expression datasets, species were deemed to forgo short isoform production.

Figure 6A summarizes USP4 splicing patterns in a phylogenetic context with corresponding flanking 5′SS sequence logos indicated. According to the PWM in Table 1A, +6A and +6G weaken the 5′SS while +6T is optimal and +6C is neutral (weighted consensus illustrated in Fig. 6C). For all tetrapods, when the upstream +6 site is stronger than the downstream +6 site, there is alternative splicing of E7. This correlation is particularly apparent in the avian clade: chicken and turkey have no E7 skipping (+6A; +6A), while all other birds either exhibit skipping (+6C/T; +6A) or loss of E7. What is more, some members of sister taxa have lost the ability to produce the long isoform: E7 is deleted in Corvus brachyrhynchos but present in Corvus cornix cornix; absent from Adelie penguin but present in Emperor penguin, for example. In contrast to this substantial variability, all mammals retain an optimal E7 skipping configuration, +6T; +6A (with the exception of the clade root: platypus USP4 has +6G; +6A and, consistent with our model, does not undergo skipping). In theory, many nucleotide substitutions could disrupt the 5′SS if alternative splicing were the result of drift. Since the same splice site configuration is maintained throughout 220 million years of mammalian evolution there may be selection for this particular +6 configuration. In Fig. 6C and Supplemental Figure 1, we show the effects of downstream +6 site point mutation from native +6A to +6T, +6C and +6G in human and mouse cell lines. In each case, the splicing propensity changed in direct relation with the estimated fitness in our PWM: alternative splicing was nearly eliminated in +6T, slightly reduced in +6C and increased in +6G. Therefore, we propose that the highly conserved +6A site within 5′SS7 is under natural selection to maintain both long and short USP4 isoforms in therian mammals.

Figure 6

Long and short isoform production was selected for in therian mammals.

(A) USP4 exon 7 skipping propensity throughout the vertebrate phylogeny (see inset Legend). (B) 5′SS6 and 5′SS7 identities for species in (A). In (B), splice site configurations that lead to constitutive retention of E7 are in the top row and those that lead to alternative skipping are in the bottom row. (C) Sequence logo for an optimal 5′SS (from Table 1) and changes in long-to-short isoform ratios (IRL/S) after experimental replacement of the sixth intronic nucleotide (+6 site) of the downstream 5′SS7 of human USP4 in H1299 cells. IRL/S quantifications are as follows: C = 1.00, G = 0.81, T = 1.52, A (WT) = 1.02. (D) Subcellular localization of exon 7 skipped and exon 7 retained USP4 isoforms. Long and short isoforms with appended green (GFP) and red (mKATE) fluorescent tags, respectively, were transfected into 293T and HeLa cells.

Differential localizations and roles of spliced isoforms

The evidence supporting alternative splicing selection in mammalian USP4 is strong; we would consequently expect the two isoforms to have distinct cellular roles. Indeed, we observed distinct subcellular localizations of long and short USP4 isoforms in both single- and double-transfections of HeLa, U2OS, 293T and 3T3 cells (see Fig. 6D). While the short isoform was distributed throughout the cell, the localization of the long isoform was largely cytoplasmic in most if not all cells in the four cell lines examined. Potential implications of this observation are discussed below. Altogether, our results suggest that the two major USP4 isoforms generated by alternative skip** of its seventh exon may not be functionally redundant as previously suggested.

Discussion

It has been proposed that mutations that weaken the 5′ splice site are responsible for the evolutionary shift from constitutive to alternative splicing in many vertebrate genes, as reviewed in Keren et al.21 and compelling evidence has been presented in support of this hypothesis34. While most minor splice variants are attributable to noisy splicing35,36, USP4 constitutes a rare case wherein selective pressure acts to conserve differential 5′SS strengths leading to exon skipping in therian mammals. The approach we presented here focuses on these cis-acting splice sites, which offers a more basic but more direct framework towards understanding the splicing code. 5′ splice sites can recruit trans-acting alternative splicing factors for intrinsic splice regulation. For example, deleterious exon skipping in survival of motor neuron (SMN) pre-mRNA can be attributed to recruitment of splice repressor U2AF65 by a weak downstream 5′SS37. While trans-acting factors interacting with the 5′SSs of USP4 may similarly regulate E7 skipping, our model explains USP4-E7 splicing propensity independent of other cis-regulatory sequences such as exonic splice enhancers (ESEs), which may or may not be selectively co-optimized in USP4 alternative splicing. Our study also highlights the importance of experimental verification of alternative hypotheses. Although the bioinformatics framework alone cannot distinguish between the two mechanisms proposed in Fig. 1A(i,ii), the experimental results demonstrate that relative 5′SS strengths are far better predictors of alternative splicing than BPSs or upstream intron lengths. Further, our combinatorial in silico and experimental approach identified the +6 site within the 5′SS as the splicing discriminant. Intronic +6 site mutations have been reported as splicing instigators in other genes such as SMN138,39. 
E7 skipping in SMN1 leads to spinal muscular atrophy (SMA) and SNPs that cause this aberrant skipping have been identified in patients at the downstream 5′SS7 at the +6 site (+6T → G). SMN1 has a very close paralog, SMN2, that is incapable of rescuing SMN1 deficiency in SMA because its E7 is also skipped due to a WT nucleotide variant, 5′SS7 +6G. Thus, +6T at the downstream 5′SS7 of SMN1/2 promotes upstream exon inclusion while +6G promotes near-total upstream exon skipping. In mammalian WT USP4, 5′SS6 has +6T while 5′SS7 has +6A. As reflected in the PWM in Table 1 and observed in Fig. 6C, the strengths of +6 site nucleotides are predicted to be as follows: T > C ≥ A > G, where a stronger 5′SS6 +6 relative to 5′SS7 +6 correlates with splicing proclivity. It is curious that +6A ≠ +6G and that the former was selected as the weak downstream nucleotide of USP4. A plausible mechanism for +6A-dependent alternative skipping may involve U1C, a component of the U1 snRNP that preferentially recognizes a 5′SS motif with +6A, GTATAA40 and can interact with splicing regulator TIA-141,42 to promote exon retention, for example in SMN243,44. Several genes undergo U1C-dependent alternative splicing45,46. Based on linear changes in the relative abundances of the short and long isoforms observed during differentiation of P19 embryonic carcinoma cells (Gray, unpublished) we postulate that E7 of USP4 may be subject to regulated alternative splicing in therian mammals.
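The +6 site rule discussed above (E7 skipping is expected when the upstream 5′SS6 +6 nucleotide is stronger than the downstream 5′SS7 +6 nucleotide) can be written as a small sketch. The integer ranks below only encode the ordering T > C ≥ A > G; treating C as strictly stronger than A, and the function name itself, are assumptions made for illustration rather than values taken from Table 1.

```python
# Ordinal strengths for the +6 nucleotide. Only the ordering comes from the
# text (T > C >= A > G); the specific integers are an assumption.
PLUS6_RANK = {"T": 3, "C": 2, "A": 1, "G": 0}


def predicts_e7_skipping(upstream_plus6, downstream_plus6):
    """Return True when the 5'SS6 +6 base outranks the 5'SS7 +6 base."""
    up = upstream_plus6.upper().replace("U", "T")
    down = downstream_plus6.upper().replace("U", "T")
    return PLUS6_RANK[up] > PLUS6_RANK[down]


# (5'SS6 +6, 5'SS7 +6) configurations quoted in the Results
examples = {
    "therian mammals": ("T", "A"),
    "chicken / turkey": ("A", "A"),
    "other birds (+6C)": ("C", "A"),
    "platypus": ("G", "A"),
}
for label, (up, down) in examples.items():
    print(label, predicts_e7_skipping(up, down))
```

Under these assumed ranks the sketch reproduces the qualitative pattern reported for the listed configurations; it says nothing about other splice site positions or trans-acting factors.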

Retention or exclusion of the amino acids encoded by exon 7 does not affect the protease activity of the USP4 enzyme (using a synthetic substrate14) and the ubiquitin-exchange regulatory mechanism proceeds equally in both USP4 isoforms16. We have nonetheless shown that there is selection for alternative splicing maintenance in mammalian USP4. Establishing the molecular selection driver should be highly informative. We show that the two isoforms display distinct subcellular localizations, which suggests that (1) propensity and/or (2) capacity for substrate interactions may differ. First, vital cytoplasmic (e.g. TGF-β pathway1) and nuclear (e.g. spliceosomal11,12) substrates have been reported for USP4 (isoform specificities not declared). Long and short USP4 isoform production may be advantageous for simultaneous, collective targeting of key substrates in various cellular compartments. On the other hand, the two isoforms almost certainly have some distinct interactors. Cytoplasmic retention of USP4-long is mediated by phosphorylation of a ubiquitously conserved serine (S445); the apparent absence of this regulation in USP4-short may reflect a lack of phosphorylation by Akt2. Exon 7 of mammalian USP4 is serine-rich (16 out of 47 residues; human sequence: RSSTAPSRNFTTSPKSSASPYSSVSASLIANGDSTSTCGMHSSGVSRG) and also contains six constitutively charged residues (five positively charged and one negatively charged). While the +4 net charge difference between isoforms likely affects substrate interactions, the serine-rich exon also retains multiple phosphorylation sites (underlined) that are conserved among all mammals. USP4-E7 may be phosphorylated in conjunction with Ser445 for nuclear exclusion or may be required for interaction with Akt and other substrates. 
For instance, SART3, a spliceosomal factor and deubiquitination target of both USP412 and its paralog USP1547, has been reported to interact with serine-rich (not to be confused with serine/arginine-rich) domains of proteins48. Interestingly, USP15 contains an analogous, serine-enriched alternatively spliced seventh exon (SPGASNFSTLPKISPSSLSNNYNNMNNR; reported phosphorylation sites underlined). Splice boundaries and amino acid sequence differ between E7 of USP4 and USP15, suggesting that alternative splicing arose independently in these, though they both maintain significant proportions of serines. This may be a case of stabilizing selection acting on clusters of phosphorylation sites49. There may be an important feedback loop involving the splicing and subsequent localization of USP4 and USP15, two DUBs that critically interact with the spliceosome. Long and short USP4 production and thus DUB modification of isoform-specific substrates may differ across tissue types. It remains to be seen whether such isoform-specific substrates drove evolutionary conservation of the dual isoforms within placental mammals.

To summarize, we have shown that the long and short isoforms of USP4 have distinct properties and their contributions to cellular networks should be considered separately. Most proteins have more than one reported isoform and though most may be considered non-essential noise, distinct functional variants, such as in USP4, must not be grouped as one protein. The roles of all significantly expressed minor splice variants should be studied more carefully.

Methods

Bioinformatic analysis

Well-annotated USP4 sequences for 62 vertebrate species were downloaded from GenBank (see Supplementary Table S1) covering major vertebrate taxa. Coding sequences, exons, introns and exon-intron junctions (5 nt on the exon side and 12 nt on the intron side) were extracted and analyzed by using DAMBE

References

  • Aggarwal, K. & Massagué, J. Ubiquitin removal in the TGF-β pathway. Nat Cell Biol. 14, 656–657 (2012).

    Article  CAS  PubMed  Google Scholar 

  • Zhang, L. et al. USP4 is regulated by AKT phosphorylation and directly deubiquitylates TGF-b type I receptor. Nat Cell Biol. 14, 717–726 (2012).

    Article  CAS  PubMed  Google Scholar 

  • Fan, Y.-H. et al. USP4 targets TAK1 to downregulate TNFa-induced NF-kB activation. Cell Death Differ. 18, 1547–1560 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Zhao, B., Schlesiger, C., Masucci, M. G. & Lindsten, K. The ubiquitin specific protease 4 (usp4) is a new player in the wnt signalling pathway. J Cell Mol Med. 13, 1886–1895 (2009).

    Article  PubMed  Google Scholar 

  • Zhang, X., Berger, F. G., Yang, J. & Lu, X. USP4 inhibits p53 through deubiquitinating and stabilizing ARF-BP1. EMBO J. 30, 2177–2189 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Wang, L. et al. USP4 positively regulates RIG-I-mediated antiviral response through deubiquitination and stabilization of RIG-I. J Virol. 87, 4507–4515 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • **ao, N. et al. Ubiquitin-specific protease 4 (USP4) targets TRAF2 and TRAF6 for deubiquitination and inhibits TNFa-induced cancer cell migration. Biochem J. 441, 979–986 (2012).

    Article  CAS  PubMed  Google Scholar 

  • Zhou, F. et al. Ubiquitin-specific protease 4 mitigates Toll-like/interleukin-1 receptor signaling and regulates innate immune activation. J Biol Chem. 287, 11002–11010 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Uras, I. Z., List, T. & Nijman, S. M. B. Ubiquitin-specific protease 4 inhibits mono-ubiquitination of the master growth factor signaling kinase PDK1. PLoS One 7, e31003 (2012).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  • Soboleva, T. A., Jans, D. A., Johnson-Saliba, M. & Baker, R. T. Nuclear-cytoplasmic shuttling of the oncogenic mouse UNP/USP4 deubiquitylating enzyme. J Biol Chem. 280, 745–752 (2005).

    Article  CAS  PubMed  Google Scholar 

  • Sowa, M. E., Bennett, E. J., Gygi, S. P. & Harper, J. W. Defining the human deubiquitinating enzyme interaction landscape. Cell. 138, 389–403 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Song, E. J. et al. The Prp19 complex and the Usp4sart3 deubiquitinating enzyme control reversible ubiquitination at the spliceosome. Genes Dev. 24, 1434–1447 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Gupta, K., Chevrette, M. & Gray, D. A. The Unp proto-oncogene encodes a nuclear protein. Oncogene. 9, 1729–1731 (1994).

    CAS  PubMed  Google Scholar 

  • Frederick, A., Rolfe, M. & Chiu, M. I. The human UNP locus at 3p21.31 encodes two tissue-selective, cytoplasmic isoforms with deubiquitinating activity that have reduced expression in small cell lung carcinoma cell lines. Oncogene. 16, 153–165 (1998).

  • Gupta, K., Copeland, N. G., Gilbert, D. J., Jenkins, N. A. & Gray, D. A. Unp, a mouse gene related to the tre oncogene. Oncogene. 8, 2307–2310 (1993).

  • Clerici, M., Luna-Vargas, M. P. A., Faesen, A. C. & Sixma, T. K. The DUSP-Ubl domain of USP4 enhances its catalytic efficiency by promoting ubiquitin exchange. Nat Commun. 5, 5399 (2014).

  • Vlasschaert, C., Xia, X., Coulombe, J. & Gray, D. A. Evolution of the highly networked deubiquitinating enzymes USP4, USP15 and USP11. BMC Evol Biol. 15, 230 (2015).

  • Alekseyenko, A. V., Kim, N. & Lee, C. J. Global analysis of exon creation versus loss and the role of alternative splicing in 17 vertebrate genomes. RNA 13, 661–670 (2007).

  • Sugnet, C. W. et al. Unusual intron conservation near tissue-regulated exons found by splicing microarrays. PLoS Comput Biol. 2, e4 (2006).

  • Sammeth, M., Foissac, S. & Guigó, R. A general definition and nomenclature for alternative splicing events. PLoS Comput Biol. 4, e1000147 (2008).

  • Keren, H., Lev-Maor, G. & Ast, G. Alternative splicing and evolution: diversification, exon definition and function. Nat Rev Genet. 11, 345–355 (2010).

  • Bell, M. V., Cowper, A. E., Lefranc, M. P., Bell, J. I. & Screaton, G. R. Influence of intron length on alternative splicing of CD44. Mol Cell Biol. 18, 5930–5941 (1998).

  • Roy, M., Kim, N., Xing, Y. & Lee, C. The effect of intron length on exon creation ratios during the evolution of mammalian genomes. RNA 14, 2261–2273 (2008).

  • Kandul, N. P. & Noor, M. A. F. Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3. BMC Genet. 10, 67 (2009).

  • Fox-Walsh, K. L. et al. The architecture of pre-mRNAs affects mechanisms of splice-site pairing. PNAS 102, 16176–16181 (2005).

  • Sterner, D. A., Carlo, T. & Berget, S. M. Architectural limits on split genes. PNAS 93, 15081–15085 (1996).

  • Kim, E., Magen, A. & Ast, G. Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 35, 125–131 (2007).

  • Patel, A. A. & Steitz, J. A. Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol. 4, 960–970 (2003).

  • Xing, Y. & Lee, C. Assessing the application of Ka/Ks ratio test to alternatively spliced exons. Bioinformatics. 21, 3701–3703 (2005).

  • Gao, K., Masuda, A., Matsuura, T. & Ohno, K. Human branch point consensus sequence is yUnAy. Nucleic Acids Res. 36, 2257–2267 (2008).

  • Zheng, C. L., Fu, X.-D. & Gribskov, M. Characteristics and regulatory elements defining constitutive splicing and different modes of alternative splicing in human and mouse. RNA 11, 1777–1787 (2005).

  • Dewey, C. N., Rogozin, I. B. & Koonin, E. V. Compensatory relationship between splice sites and exonic splicing signals depending on the length of vertebrate introns. BMC Genomics. 7, 311 (2006).

  • Matlin, A. J., Clark, F. & Smith, C. W. J. Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol. 6, 386–398 (2005).

  • Lev-Maor, G. et al. The “alternative” choice of constitutive exons throughout evolution. PLoS Genet. 3, e203 (2007).

  • Melamud, E. & Moult, J. Stochastic noise in splicing machinery. Nucleic Acids Res. 37, 4873–4886 (2009).

  • Pickrell, J. K., Pai, A. A., Gilad, Y. & Pritchard, J. K. Noisy Splicing Drives mRNA Isoform Diversity in Human Cells. PLoS Genet. 6, e1001236 (2010).

  • Cho, S. et al. Splicing inhibition of U2af65 leads to alternative exon skipping. PNAS 112, 9926–9931 (2015).

  • Wirth, B. et al. Quantitative Analysis of Survival Motor Neuron Copies: Identification of Subtle SMN1 Mutations in Patients with Spinal Muscular Atrophy, Genotype-Phenotype Correlation and Implications for Genetic Counseling. Am J Hum Genet. 64, 1340–1356 (1999).

  • Lorson, C. L., Hahnen, E., Androphy, E. J. & Wirth, B. A single nucleotide in the SMN gene regulates splicing and is responsible for spinal muscular atrophy. PNAS 96, 6307–6311 (1999).

  • Du, H. & Rosbash, M. The U1 snRNP protein U1c recognizes the 5′ splice site in the absence of base pairing. Nature 419, 86–90 (2002).

  • Förch, P., Puig, O., Martínez, C., Séraphin, B. & Valcárcel, J. The splicing regulator TIA-1 interacts with U1-C to promote U1 snRNP recruitment to 5′ splice sites. EMBO J. 21, 6882–6892 (2002).

  • Bauer, W. J., Heath, J., Jenkins, J. L. & Kielkopf, C. L. Three RNA recognition motifs participate in RNA recognition and structural organization by the pro-apoptotic factor TIA-1. J Mol Biol. 415, 727–740 (2012).

  • Singh, N. N. et al. TIA1 Prevents Skipping of a Critical Exon Associated with Spinal Muscular Atrophy. Mol Cell Biol. 31, 935–954 (2011).

  • Klar, J. et al. Welander distal myopathy caused by an ancient founder mutation in TIA1 associated with perturbed splicing. Hum Mutat. 34, 572–577 (2013).

  • Rösel, T. D. et al. RNA-Seq analysis in mutant zebrafish reveals role of U1c protein in alternative splicing regulation. EMBO J. 30, 1965–1976 (2011).

  • Rösel-Hillgärtner, T. D. et al. A Novel Intra-U1 snRNP Cross-Regulation Mechanism: Alternative Splicing Switch Links U1c and U1-70k Expression. PLoS Genet. 9, e1003856 (2013).

  • Long, L. et al. The U4/U6 recycling factor SART3 has histone chaperone activity and associates with USP15 to regulate H2b deubiquitination. J Biol Chem. 289, 8916–8930 (2014).

  • Harada, K., Yamada, A., Yang, D., Itoh, K. & Shichijo, S. Binding of a SART3 tumor-rejection antigen to a pre-mRNA splicing factor RNPS1: A possible regulation of splicing by a complex formation. Int J Cancer. 93, 623–628 (2001).

  • Landry, C. R., Freschi, L., Zarin, T. & Moses, A. M. Turnover of protein phosphorylation evolving under stabilizing selection. Front Genet. 5 (2014).

  • Xia, X. DAMBE5: A Comprehensive Software Package for Data Analysis in Molecular Biology and Evolution. Mol Biol Evol. 30, 1720–1728 (2013).

  • Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

  • Wickham, H. ggplot2: elegant graphics for data analysis (Springer: New York, 2009).

Acknowledgements

This study was supported by Discovery Grants of the Natural Science and Engineering Research Council of Canada (NSERC) to X.X. and D.A.G. and by a grant from the Cancer Research Society, Inc. to D.A.G.

Author information

Contributions

C.V., X.X. and D.A.G. contributed equally to the design of the study and the preparation of the manuscript. C.V. and X.X. were responsible for computational analyses, while D.A.G. was responsible for the experimental aspects of the study.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Vlasschaert, C., Xia, X. & Gray, D. Selection preserves Ubiquitin Specific Protease 4 alternative exon skipping in therian mammals. Sci Rep 6, 20039 (2016). https://doi.org/10.1038/srep20039

  • DOI: https://doi.org/10.1038/srep20039

  • Springer Nature Limited
