Introduction

Next-generation RNA sequencing technology has revealed thousands of novel transcripts that possess no potential protein-coding elements. These RNAs are typically annotated as non-coding RNAs (ncRNAs) in the Human Genome Project and ENCODE Project [31, 59, 64, 130]. Regulatory regions, including regions upstream of promoters (promoter upstream transcripts, PROMPTs) [106], enhancers (eRNAs) [76], intergenic regions (lincRNAs) [114] and telomeres [81], can be further sources of long non-coding RNAs (lncRNAs). Many hallmarks of lncRNA processing are similar to those of mRNAs at the post-transcriptional level, such as nascent lncRNAs being 5'-capped, 3'-polyadenylated or alternatively spliced [19]. However, lncRNA production is less efficient than that of mRNAs, and lncRNA half-lives appear to be shorter [98]. Unlike mRNAs, which are directly transported to the cytoplasm for translation, many lncRNAs tend to be located in the nucleus rather than in the cytosol, as revealed by experimental approaches such as fluorescence in situ hybridization [20, 67]. However, upon export to the cytoplasm, some lncRNAs bind to ribosomes, where they can be translated into functional peptides under specific cell contexts [20, 58]. For instance, myoregulin is encoded by a putative lncRNA and binds to sarco/endoplasmic reticulum Ca2+-ATPase (SERCA) to regulate Ca2+ import into the sarcoplasmic reticulum [6]. Nevertheless, it remains to be established if other ribosome-associated lncRNAs generate functional peptides.

General function of lncRNAs

A broad spectrum of evidence demonstrates the multifaceted roles of lncRNAs in regulating cellular processes. In the nucleus, lncRNAs participate in nearly all levels of gene regulation, from maintaining nuclear architecture to transcription per se. To establish nuclear architecture, Functional intergenic repeating RNA element (Firre) escapes from X chromosome inactivation (XCI) and bridges multiple chromosomes, partly via association with heterogeneous nuclear ribonucleoprotein U (hnRNPU) (Figure 1a) [54]. CCCTC-binding factor (CTCF)-mediated chromosome looping can also be accomplished by lncRNAs. For example, colorectal cancer associated transcript 1 long isoform (CCAT1-L) facilitates promoter-enhancer looping at the MYC locus by interacting with CTCF, leading to stabilized MYC expression and tumorigenesis (Figure 1b) [153]. In addition, CTCF binds to many X chromosome-derived lncRNAs such as X-inactivation intergenic transcription element (Xite), X-inactive specific transcript (Xist) and the reverse transcript of Xist (Tsix) to establish three-dimensional organization of the X chromosome during XCI [69]. In addition to maintaining nuclear architecture, lncRNAs may also serve as building blocks of nuclear compartments. For example, nuclear enriched abundant transcript 1 (NEAT1) is the core element of paraspeckles, which participate in various biological processes such as the viral infection response and nuclear retention of adenosine-to-inosine-edited mRNAs to restrict their cytoplasmic localization. However, the exact function of paraspeckles has yet to be fully deciphered (Figure 1c) [26, 30, 57]. LncRNAs can also function as scaffolding components, bridging epigenetic modifiers to coordinate gene expression (e.g. activation or repression).
For instance, Xist interacts with polycomb repressive complex 2 (PRC2) and the silencing mediator for retinoid and thyroid hormone receptor (SMRT)/histone deacetylase 1 (HDAC1)-associated repressor protein (SHARP) to deposit a methyl group on lysine residue 27 of histone H3 (H3K27) and to deacetylate histones, respectively, leading to transcriptional repression of the X chromosome (Figure 1d) [87]. Similarly, Hox antisense intergenic RNA (Hotair) bridges the PRC2 complex and lysine-specific histone demethylase 1A (LSD1, an H3K4me2 demethylase) to synergistically suppress gene expression [118, 140]. In contrast, HOXA transcript at the distal tip (HOTTIP) interacts with the tryptophan-aspartic acid repeat domain 5 - mixed-lineage leukemia 1 (WDR5-MLL1) complex to maintain the active state of the 5' HOXA locus via deposition of histone 3 lysine 4 tri-methylation (H3K4me3) [149]. LncRNAs also regulate the splicing process by associating with splicing complexes. A neural-specific lncRNA, Pnky, associates with the splicing regulator polypyrimidine tract-binding protein 1 (PTBP1) to regulate splicing of a subset of neural genes [112]. Moreover, interaction between Metastasis-associated lung adenocarcinoma transcript 1 (Malat1) and splicing factors such as serine/arginine rich splicing factor 1 (SRSF1) is required for alternative splicing of certain mRNAs (Figure 1e) [139].

Fig. 1

Summary (with examples) of the multifaceted roles of lncRNAs in the cell. a The X chromosome-derived lncRNA Firre associates with hnRNPU to establish inter-chromosome architecture. b CCAT1-L generated from upstream of the MYC locus promotes MYC expression via CTCF-mediated looping. c Paraspeckle formation is regulated by interactions between NEAT1_2 and RBPs. d X chromosome inactivation is accomplished by coordination between [...]

[...] neural tube to generate a dorsal to ventral gradient [4, 88]. In contrast, sonic hedgehog (Shh) proteins emanating from the floor plate as well as the notochord generate an opposing ventral to dorsal gradient [16]. Together with paraxial mesoderm-expressed retinoic acid (RA), these factors precisely pattern the neural tube into spinal cord progenitor domains pd1~6, p0, p1, p2, motor neuron progenitor (pMN), and p3 along the dorso-ventral axis (Figure 2a). This patterning is mediated by distinct expression of cross-repressive transcription factors, specifically Shh-induced class II transcription factors (Nkx2.2, Nkx2.9, Nkx6.1, Nkx6.2, Olig2) or Shh-inhibited class I transcription factors (Pax3, Pax6, Pax7, Irx3, Dbx1, Dbx2), that further define the formation of each progenitor domain [104, 143]. All spinal motor neurons (MNs) are generated from pMNs, and pMNs are established upon co-expression of Olig2, Nkx6.1 and Nkx6.2 under conditions of high Shh levels [2, 105, 132, 162]. Although a series of miRNAs have been shown to facilitate patterning of the neuronal progenitors in the spinal cord and to control MN differentiation [24, 25, 27, 74, 141, 142], the roles of lncRNAs during MN development are just beginning to emerge. In Table 1, we summarize the importance of lncRNAs for the regulation of transcription factors in MN contexts. For instance, the lncRNA lncrps25 is located near the S25 gene (which encodes a ribosomal protein) and shares high sequence similarity with the 3' UTR of neuronal regeneration-related protein (NREP) in zebrafish.
Loss of lncrps25 reduces locomotion behavior by regulating pMN development and Olig2 expression [48]. Additionally, depletion of an MN-enriched lncRNA, Maternally expressed gene 3 (Meg3), results in upregulation of progenitor genes (i.e., Pax6 and Dbx1) in embryonic stem cell (ESC)-derived post-mitotic MNs, as well as in post-mitotic neurons in embryos. Mechanistically, Meg3 associates with the PRC2 complex to facilitate the maintenance of H3K27me3 levels at many progenitor loci, including Pax6 and Dbx1 (Figure 2b) [156]. Apart from lncRNA-mediated regulation of Pax6 in the spinal cord, corticogenesis in primates also seems to rely on the Pax6/lncRNA axis [113, 145]. In this scenario, the primate-specific lncRNA neuro-development (Lnc-ND), located in the 2p25.3 locus [131], exhibits an enriched expression pattern in neuronal progenitor cells but reduced expression in differentiated neurons. Microdeletion of the 2p25.3 locus is associated with intellectual disability. Manipulation of Lnc-ND levels reveals that Lnc-ND is required for Pax6 expression and that overexpression of Lnc-ND by means of in utero electroporation in mouse brain promotes expansion of the Pax6-positive radial glia population [113]. Moreover, expression of the Neurogenin 1 (Ngn1) upstream enhancer-derived eRNA, utNgn1, is necessary for expression of Ngn1 itself in neocortical neural precursor cells, and it is suppressed by PcG protein at the ESC stage [108]. Thus, lncRNAs seem to regulate a battery of transcription factors that are important for early neural progenitor patterning, and this role might be conserved across vertebrates.

Fig. 2

Schematic illustration of spinal motor neuron development. a Notochord- and floor plate-derived sonic hedgehog protein (Shh), roof plate-generated wingless/integrated (WNT) protein and bone morphogenetic protein (BMP), and retinoic acid (RA) diffusing from the paraxial mesoderm pattern the identities of spinal neurons by inducing cross-repressive transcription factors along the dorso-ventral axis (pd1~6, p0, p1, p2, pMN, and p3). Motor neuron progenitors (pMNs) are generated by co-expression of Olig2, Nkx6.1 and Nkx6.2. After cell cycle exit, pMNs give rise to generic MNs by concomitantly expressing Isl1, Lhx3 and Mnx1. Along the rostro-caudal axis, Hox6/Hoxc9/Hox10 respond to RA and fibroblast growth factor (FGF) to pattern the brachial, thoracic and lumbar segments, respectively. b In the Hox6on segment, the interaction between the PRC2-Jarid2 complex and an Isl1/Lhx3-induced lncRNA, Meg3, perpetuates the brachial Hoxa5on MN by repressing caudal Hoxc8 and the alternative progenitor genes Irx3 and Pax6 via maintenance of the H3K27me3 epigenetic landscape at these genes. However, the detailed mechanism by which Meg3 targets these selected genes remains to be elucidated.

Table 1 Proposed functions of lncRNAs during spinal motor neuron development

LncRNAs in the regulation of postmitotic neurons

In addition to their prominent functions in neural progenitors, lncRNAs also play important roles in differentiated neurons. Taking spinal MNs as an example, postmitotic MNs are generated from pMNs, and after cell cycle exit they begin to express a cohort of MN-specific markers such as Insulin gene enhancer protein 1 (Isl1), LIM/homeobox protein 3 (Lhx3), and Motor neuron and pancreas homeobox 1 (Mnx1, Hb9) (Figure 2a). Isl1/Lhx3/NLI forms an MN-hexamer complex to induce a series of MN-specific regulators and to maintain the terminal MN state by repressing alternative interneuron genes [43, 72]. Although the gene regulatory network for MN differentiation is very well characterized, the role of lncRNAs in this process is surprisingly unclear, and only a few examples have been uncovered. For instance, CAT7 is a polyadenylated lncRNA that lies upstream (~400 kb) of MNX1 and was identified from the RNA-Polycomb repressive complex 1 (PRC1) interactome. Loss of CAT7 results in de-repression of MNX1 before commitment to the neuronal lineage through reduced PRC1 and PRC2 occupancy at the MNX1 locus in hESC~MNs [115]. Furthermore, an antisense lncRNA (MNX1-AS1) shares the same promoter as MNX1, as revealed by clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein 9 (CRISPR-Cas9) screening [53]. These results suggest that, in addition to their roles in neural progenitors, lncRNAs could have another regulatory role in fine-tuning neurogenesis upon differentiation. However, whether the expression and functions of these lncRNAs are important for MN development in vivo still needs to be validated. Future experiments to systematically identify lncRNAs involved in this process will greatly enhance our knowledge about lncRNAs and their roles in early neurogenesis.

After generic postmitotic MNs have been produced, they are further programmed into versatile subtype identities along the rostro-caudal spinal cord according to discrete expression of signaling molecules, including retinoic acid (RA), WNT, fibroblast growth factor (FGF), and growth differentiation factor 11 (GDF11), all distributed asymmetrically along the rostro-caudal axis (Figure 2a). Antagonistic signaling of rostral RA and caudal FGF/GDF11 further elicits a set of Homeobox (Hox) proteins that abut each other, namely Hox6, Hox9 and Hox10 at the brachial, thoracic and lumbar segments, respectively [12, 77, 129]. These Hox proteins further activate downstream transcription factors that are required to establish MN subtype identity. For instance, formation of lateral motor column (LMC) MNs in the brachial and lumbar regions is regulated by Hox-activated Forkhead box protein P1 (Foxp1) [35, 119]. It is conceivable that lncRNAs might also participate in this MN subtype diversification process. For example, the lncRNA FOXP1-IT1, which is transcribed from an intron of the human FOXP1 gene, counteracts integrin Mac-1-mediated downregulation of FOXP1, partly by decoying HDAC4 away from the FOXP1 promoter during macrophage differentiation [128]. However, it remains to be verified if this Foxp1/lncRNA axis is also functionally important in a spinal cord context. An array of studies in various cell models has demonstrated regulation of Hox genes by lncRNAs such as Hotair, Hottip and Haglr [118, 149, 160]. However, to date, only one study has established a link between the roles of lncRNAs in MN development and Hox regulation. Using an embryonic stem cell differentiation system, a battery of MN-hallmark lncRNAs has been identified [14, 156]. Among these MN-hallmark lncRNAs, knockdown of Meg3 leads to dysregulation of Hox genes, whereby caudal Hox gene expression (Hox9~Hox13) is increased but rostral Hox gene expression (Hox1~Hox8) declines in cervical MNs.
Analysis of maternally-inherited intergenic differentially methylated region deletion (IG-DMRmatΔ) mice, in which Meg3 and its downstream transcripts are further depleted, has revealed ectopic expression of caudal Hoxc8 in the rostral Hoxa5 region of the brachial segment, together with a concomitant erosion of Hox-mediated downstream genes and axon arborization (Figure 2b) [156]. Given that dozens of lncRNAs have been identified as hallmarks of postmitotic MNs, it remains to be determined if these other lncRNAs are functionally important in vivo. Furthermore, lncRNA knockout has been shown to result in a very mild phenotype or no phenotype in vivo [52]. Based on several lncRNA-knockout mouse models, it seems that the physiological functions of lncRNAs might not be as prominent as those of transcription factors during development [8, 123], yet their functions become more critical under stress conditions such as cancer progression or neurodegeneration [102, 124]. Therefore, we next discuss how lncRNAs have been implicated in MN-related diseases.

Motor neuron-related diseases

Since lncRNAs regulate MN development and function, it is conceivable that their dysregulation or mutation would cause neurological disorders. Indeed, genome-wide association studies (GWAS) and comparative transcriptomic studies have associated lncRNAs with a series of neurodegenerative diseases, including the age-onset MN-associated disease amyotrophic lateral sclerosis (ALS) [86, 164]. Similarly, lncRNAs have also been linked to spinal muscular atrophy (SMA) [33, 152]. However, most of these studies have described associations but do not present unequivocal evidence of causation. Below and in Table 2, we summarize some of these studies linking lncRNAs to MN-related diseases.

Table 2 Proposed functions of lncRNAs in spinal motor neuron diseases

Amyotrophic lateral sclerosis (ALS)

ALS is a neurodegenerative disease resulting in progressive loss of upper and lower MNs, leading to a median survival of only 5-10 years after diagnosis. More than 90% of ALS patients are characterized as sporadic (sALS), with less than 10% being diagnosed as familial (fALS) [17]. Some ALS-causing genes, such as superoxide dismutase 1 (SOD1) and fused in sarcoma/translocated in sarcoma (FUS/TLS), have been identified in both sALS and fALS patients, whereas other culprit genes are either predominantly sALS-associated (e.g. unc-13 homolog A, UNC13A) or fALS-associated (e.g. D-amino acid oxidase, DAO). These findings indicate that complex underlying mechanisms contribute to the selective susceptibility to MN degeneration in ALS. Since many characterized ALS-causing genes encode RNA-binding proteins (RBPs), such as angiogenin (ANG), TAR DNA-binding protein 43 (TDP-43), FUS, Ataxin-2 (ATXN2), chromosome 9 open reading frame 72 (C9ORF72), TATA-box binding protein associated factor 15 (TAF15) and heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1), it is not surprising that global and/or selective RBP-RNA interactions, including those involving lncRNAs, might participate in ALS onset or disease progression. Below, we discuss some representative examples.

Nuclear Enriched Abundant Transcript 1 (NEAT1)

NEAT1 is an lncRNA that appears to play an important structural role in nuclear paraspeckles [30]. Specifically, there are two NEAT1 transcripts: NEAT1_1 (3.7 kb) is dispensable, whereas NEAT1_2 (23 kb) is essential for paraspeckle formation [30, 100]. However, expression of NEAT1_2 is low in the CNS of mouse ALS models relative to ALS patients, indicating a difference between rodent and human systems [101, 103]. Although crosslinking and immunoprecipitation (CLIP) assays have revealed that NEAT1 associates with TDP-43 [103, 137, [...]

[...] 36, 85, 111, 117]. Healthy individuals exhibit up to 20 copies of the G4C2 repeat, but this number is dramatically increased to hundreds or thousands of copies in ALS patients [36]. Loss of normal C9ORF72 protein function and gain of toxicity through abnormal repeat expansion have both been implicated in C9ORF72-associated FTD/ALS. Several C9ORF72 transcripts have been characterized and, surprisingly, antisense transcripts were found to be transcribed from intron 1 of the C9ORF72 gene [97]. Both C9ORF72 sense (C9ORF72-S) and antisense (C9ORF72-AS) transcripts harboring hexanucleotide expansions could be translated into poly-dipeptides, which were found in the MNs of C9ORF72-associated ALS patients [47, 50, 95, 121, 151, 163]. Although C9ORF72-S RNA and its consequent proteins have been investigated extensively, the functional relevance of C9ORF72-AS is still poorly understood. C9ORF72-AS contains the reverse-repeated hexanucleotide (GGCCCC, G2C4) located in intron 1. Similar to C9ORF72-S, C9ORF72-AS also forms RNA foci in brain regions such as the frontal cortex and cerebellum, as well as in the spinal cord (in MNs and occasionally in interneurons), of ALS [49, 163] and FTD patients [36, 49, 92]. Intriguingly, a higher frequency of C9ORF72-AS RNA foci and dipeptides relative to those of C9ORF72-S has been observed in the MNs of a C9ORF72-associated ALS patient, with a concomitant loss of nuclear TDP-43 [32].
In contrast, another study suggested that, compared to C9ORF72-S-generated dipeptides (poly-Gly-Ala and poly-Gly-Arg), fewer dipeptides derived from C9ORF72-AS (poly-Pro-Arg and poly-Pro-Ala) were found in the CNS of C9ORF72-associated FTD patients [83]. These apparently contradictory results are perhaps due to differing sensitivities of the antibodies used in those studies. It has further been suggested that a fraction of the C9ORF72-AS RNA foci is found in the perinucleolar region, indicating that nucleolar stress may contribute to C9ORF72-associated ALS/FTD disease progression [...]

[...] [33, 152]. In both of these studies, SMN-AS1 recruits the PRC2 complex to suppress expression of SMN protein, which could be rescued either by inhibiting PRC2 activity or by targeted degradation of SMN-AS1 using antisense oligonucleotides (ASOs). Moreover, a cocktail treatment combining SMN2 splice-switching oligonucleotides (SSOs), which enhance inclusion of exon 7 to generate functional SMN2, with SMN-AS1 ASOs increased mean survival of SMA mice from 18 days to 37 days, with ~25% of the mice surviving more than 120 days [33]. These findings suggest that, in addition to SSO treatment, targeting SMN-AS1 could be another potential therapeutic strategy for SMA. Moreover, transcriptome analysis has revealed certain lncRNA defects in SMA mice at early- or late-symptomatic stages [13]. By comparing the translatomes (RNA-ribosome complexes) of control and SMA mice, some of the lncRNAs were shown to bind to polyribosomes and to alter translation efficiency [13]. Although lncRNAs can associate with ribosomes and some of them generate functional small peptides, it needs to be established if this information is relevant in SMA contexts.

LncRNAs in liquid-liquid phase separation (LLPS) and motor neuron diseases

An emerging theme of many of the genetic mutations leading to the neurodegenerative MN diseases discussed above is their link to RBPs. Interestingly, many of these RBPs participate in granule formation and are associated with proteins/RNAs that undergo liquid-liquid phase separation (LLPS) (reviewed in [120]). LLPS is a phenomenon in which mixtures of two or more components self-segregate into distinct liquid phases (e.g. separation of oil and water phases), and it appears to underlie formation of many transient membraneless organelles, such as stress granules that contain many ribonucleoproteins (RNPs). Although it remains unclear why ubiquitously expressed RNP granule proteins aggregate in neurodegenerative disease, one study found that aggregated forms of mutant SOD1, a protein associated with fALS, accumulate in stress granules [41]. These aggregated forms induce mis-localization of several proteins associated with the miRNA biogenesis machinery, including Dicer and Drosha, to stress granules. Consequently, miRNA production is compromised, with several miRNAs (i.e. miR-17~92 and miR-218) perhaps directly participating in ALS disease onset and progression [56, 142]. Mislocalization of ALS-related proteins such as FUS and TDP-43 to the cytosol rather than the nucleus of MNs has been observed in ALS patients, but the mechanism remains unclear [125, 146].

A recent study highlighted differences in RNA concentration between the nucleus and cytosol. In the nucleus, where the concentration of RNA is high, ALS-related proteins such as TDP-43 and FUS are soluble, but protein aggregates form in the cytosol, where the concentration of RNA is low, suggesting that RNA could serve as a buffer to prevent LLPS [84]. Collectively, these findings indicate that RNAs are not only binding partners for RBPs but may also serve as a solvent to buffer RBPs and prevent LLPS. Accordingly, persistent phase separation under stress conditions could enhance formation of irreversible toxic aggregates of insoluble solidified oligomers to induce neuronal degeneration [148]. Although many neurodegenerative diseases have been associated with RNP granules, primarily stress granules, it remains to be verified if stress granules/LLPS are causative disease factors in vivo. Many other questions remain to be answered. For instance, are the lncRNAs/RNPs mentioned above actively involved in RNP granule formation? Given that purified cellular RNA can self-assemble in vitro to form assemblies that closely recapitulate the transcriptome of stress granules, and that the stress granule transcriptome is dominated by lncRNAs [63, 144], it is likely that RNA-RNA interactions mediated by abundantly expressed lncRNAs participate in stress granule formation in ALS contexts. Similarly, do prevalent RNA modification and editing events in lncRNAs [159] change their hydrophobic or charged residues to affect LLPS and the formation of RNP granules, giving rise to disease pathologies? It will be tantalizing to investigate these topics in the coming years.

Conclusion and perspective

Over the past decade, increasing evidence has challenged the central dogma of molecular biology that RNA serves solely as a temporary template between interpreting genetic information and generating functional proteins [23]. Although our understanding of lncRNAs under physiological conditions is increasing, it remains to be established if all expressed lncRNAs play particular functional roles during embryonic development and in disease contexts. Versatile genetic strategies, including CRISPR-Cas9 technology, have allowed us to clarify the roles of lncRNAs, of individual lncRNA transcripts per se, and of their specific sequence elements and motifs [42]. Taking spinal MN development and degeneration as a paradigm, we have utilized ESC-derived MNs and patient iPSC-derived MNs to dissect the important roles of lncRNAs during MN development and the progression of MN-related diseases such as ALS and SMA. A systematic effort to generate MN-hallmark lncRNA knockout mice is underway, and we believe that this approach will help us understand the mechanisms underlying lncRNA activity, paving the way to develop new therapeutic strategies for treating MN-related diseases.