Introduction

Transposable elements (TEs), also called mobile elements, are DNA sequences that can move within a host genome and typically make new copies of themselves as they do so. They are present across all forms of life, including bacteria, plants, and mammals, and account for about 50% of the mammalian genome [1,2,3]. TEs are divided into two major classes, Class I retrotransposons and Class II DNA transposons [4], which differ in the way they transpose. Class II TEs are less common (3.5%) in the human genome and are regarded as DNA fossils because no family of DNA transposons is still active today [5]. The development of genomics and large-scale functional assays has revealed new knowledge about the many functions of TEs [6].

While DNA transposons commonly move by a cut-and-paste mechanism, retrotransposons do so by a copy-and-paste mechanism [7]. Transcription of Class I retrotransposons produces an RNA intermediate that is reverse-transcribed into DNA by reverse transcriptase, creating a new copy of the retrotransposon in the genome. Class II DNA transposons, on the other hand, encode an enzyme called transposase that excises the parental sequence from the genome before mediating its reintegration into another region of the genome [4, 8].
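The difference between the two transposition modes can be illustrated with a toy model. The sketch below is only a schematic, assuming a genome represented as a list of labelled segments; the function names and the example genome are hypothetical and do not model any real enzyme or sequence.

```python
# Toy model: a genome is a list of labelled segments; "TE" marks the element.
# All names are illustrative and not taken from any library.

def cut_and_paste(genome, src, dst):
    """Class II style: excise the element at index src, then reinsert it
    at index dst (dst is counted in the genome after excision)."""
    moved = genome[:src] + genome[src + 1:]   # excision removes the donor copy
    moved.insert(dst, genome[src])            # reintegration at a new site
    return moved

def copy_and_paste(genome, src, dst):
    """Class I style: the donor copy stays in place and a new copy
    (made via an RNA intermediate in vivo) is inserted at index dst."""
    copied = list(genome)
    copied.insert(dst, genome[src])
    return copied

genome = ["geneA", "TE", "geneB", "geneC"]
after_cut = cut_and_paste(genome, 1, 2)    # TE copy number stays at 1
after_copy = copy_and_paste(genome, 1, 3)  # TE copy number rises to 2
```

In this model the cut-and-paste move leaves the copy number unchanged, whereas each copy-and-paste event adds one copy, which is consistent with retrotransposons making up the larger share of the genome.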

Retrotransposons come in a variety of forms: endogenous retroviruses (ERVs), which are distinguished by the presence of long terminal repeats (LTRs), and non-LTR retrotransposons. Non-LTR retrotransposons are further classified into long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), and SINE-VNTR-Alu elements (SVAs) [4, 9]. LINEs are the most abundant retrotransposons in the human genome, accounting for 20.4% of it, followed by SINEs (13.1%), LTR elements (9.1%), and SVAs (0.1%) [10, 11].
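As a quick consistency check, the percentages quoted above can be summed. The sketch below uses only the figures given in the text (including the 3.5% for DNA transposons from the first paragraph); the variable names are illustrative.

```python
# Percent of the human genome per TE group, as quoted in the text.
retro = {"LINE": 20.4, "SINE": 13.1, "LTR": 9.1, "SVA": 0.1}
dna_transposons = 3.5

# Total retrotransposon content and total TE content.
retro_total = round(sum(retro.values()), 1)
te_total = round(retro_total + dna_transposons, 1)

# Share of LINEs among non-LTR retrotransposons (LINE + SINE + SVA).
non_ltr_total = round(retro["LINE"] + retro["SINE"] + retro["SVA"], 1)
line_share_of_non_ltr = round(100 * retro["LINE"] / non_ltr_total, 1)

print(f"retrotransposons: {retro_total}%  all TEs: {te_total}%")
print(f"LINE share of non-LTR retrotransposons: {line_share_of_non_ltr}%")
```

Under these figures retrotransposons sum to 42.7% and all TEs to 46.2% of the genome, which is in line with the roughly 50% cited earlier, and LINEs account for about 61% of the non-LTR retrotransposon content.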

Like any mutational process, TE insertions can have beneficial or harmful consequences for host gene expression. Regardless of their transposition competency, TE regulatory sequences can be co-opted for host regulatory activities. Recent evidence offers new insight into TE mobility and regulatory potential and serves as a vital resource for population history and disease genetics research [12]. Mechanistically, TEs can influence gene expression transcriptionally [13], post-transcriptionally [14], or at the step of translation [2, 15] through their encoded products, which include both proteins and non-coding RNAs (ncRNAs). The mechanisms by which TEs affect host gene-regulatory networks are more complex than originally thought and include the addition of transcription factor binding sites (TFBSs), promoters, and enhancers; alteration of 3D chromatin organization; production of regulatory ncRNAs; co-option/exaptation/domestication of TE-derived coding sequences as new transcriptional effector proteins; and collateral consequences of TE silencing mechanisms [2]. The objective of this review is to discuss TE-mediated gene regulation, with a particular emphasis on the mechanisms, the contributions of various TE types, and their differential roles in various tissue types, based mostly on recent studies in humans.

The roles of transposable elements in the human genome and cell

The evolution of genetic information, as well as DNA duplication, stability, and gene expression, are just a few of the numerous facets of DNA function that TEs may affect. The discovery of TEs' involvement in genome evolution and gene function has altered the previously held belief that TEs are junk, parasitic, colonizing, or selfish DNA [16]. TEs can give rise to new genes with crucial host functions [29]. A recent study established how lineage-specific TEs can promote evolutionary turnover and divergence of innate immune regulatory networks and revealed a novel function for B2 SINEs as inducible enhancer elements that affect immunity in mice [30].

Transposons can be engineered to carry a reporter gene that, when randomly inserted into the bacterial chromosome, can fuse to a gene on the chromosome [31]. Screening such a transposon library for reporter expression under various conditions enables the identification of fusions that respond to stress conditions or to a particular treatment. Characterizing these fusions can provide a genome-wide picture of the organization of the bacterial regulatory network [31]. In addition, one study demonstrated that retrotransposon expression is clearly related to aging in Drosophila [32].

The negative roles of transposable elements in the human genome

Through processes dependent on and independent of transposition, TEs can lead to genomic/epigenomic instability, which may result in different disease conditions, cell death or the development of cancer [20, 26, 33, 73], histone alterations [74], and mRNA editing [72]. The majority of TEs in somatic cells are silenced by one of these processes, DNA methylation [75] (Fig. 1).

Fig. 1

TEs are regulated in both healthy and cancerous cells. Epigenetic mechanisms such as DNA methylation, histone modification, and non-coding RNAs (e.g., circRNA, miRNA, and lncRNA) inhibit the function of TEs in healthy cells (left panel). During cellular transformation, hypomethylation with increased S-adenosyl methionine (SAM), various histone modifications (such as methylation and acetylation), and oncogenic non-coding RNAs, which inhibit the expression of tumor suppressor genes (TSGs), all contribute to the loss of repressive signals and the uncontrolled expression of TEs in cancer cells (right panel). DNA breakage, mutations, and genomic instability result from all of these (arrows indicate increased activity; the crossed circle and the cross sign indicate inhibition; the (+) sign indicates an increase and the (–) sign a decrease) [26].

The interactions between TEs and a large number of non-coding RNAs are the basis of one well-known germline process [76]. The majority of research on regulatory RNAs has focused on the PIWI-interacting RNAs (piRNAs), which interact with TEs at various levels [77,94]. L1s produce hundreds of developmentally regulated and cell-type-specific transcripts, many of which are co-opted as regulatory RNAs or chimeric transcripts. One human-specific transcript expressed only during brain development is LINC01876, an L1-derived lncRNA. CRISPRi silencing of LINC01876 reduces the size of cerebral organoids and causes premature differentiation of neural progenitors, implicating L1s in human-specific developmental processes. L1-derived transcripts thus offer a previously unrecognized layer of transcriptome complexity that is unique to humans and primates and contributes to the functional diversity of the human brain [94,95,96]. Because TEs can play a dual and contradictory role, both in the proper differentiation and development of neuronal mosaicism and in the onset of neurological disease, the expression of TEs in the brain exemplifies this "double-edged sword" phenomenon.

These are intriguing illustrations of how TE evolution has involved both mechanisms that prevent these invasive sequences from harming genome function and systems that allow them to play an active and beneficial part in it. It is becoming apparent that TEs are a crucial component of the genome's regulatory toolbox [97]. RNA translation, alternative splicing, and gene transcription are just a few of the biological processes in which repetitive sequences have shown promise as regulators [98]. Evidence is mounting that TEs can play a crucial role in regulating gene expression through a variety of mechanisms (Fig. 2).

Fig. 2

Different mechanisms by which TEs influence the regulation of gene expression

The role of TEs in epigenetic gene expression control

The role of TEs in 3D genome architecture has been studied only recently. TEs affect 3D chromatin architecture, with a direct effect on the folding of chromosomes [2, 104]. The molecular mechanism underlying the correlation between the suppression of TEs and the rise in repressive histone marks is not yet known [104]. According to recent research in naive murine embryonic stem cells prior to implantation, the KZFP/KAP1 (Krüppel-associated box (KRAB) zinc finger protein/KRAB-associated protein 1) complex plays a crucial role in maintaining heterochromatin, with DNA methylation marks at TEs shielding the loci from TET-mediated demethylation [105]. Ecco et al. [106] discovered that two KRAB/ZFP (zinc finger protein) family members control TE targets in differentiated tissues by histone-based processes that are not necessarily associated with the DNA methylation state of the loci. Additionally, ZFP92 has been shown to control the transcription of particular genes in different tissues by repressing particular TEs [107].

This work demonstrates that the interactions between TEs and their KRAB-ZFP controllers affect the expression of neighboring genes. The connection is even more evident in human brain progenitor cells, where primate-specific ERVs serve as docking sites for the co-repressor protein KAP1 (also known as TRIM28) to produce local heterochromatin [108]. KAP1 binds to the ERVs and represses them, which controls the expression of nearby genes crucial for brain development [108]. Another example of the regulation of neighboring genes by TEs is the interaction of the transcriptional regulators human silencing hub (HUSH) and microrchidia family CW-type zinc finger 2 (MORC2) with evolutionarily young full-length L1s situated in transcriptionally permissive euchromatic regions, which promotes the deposition of H3K9me3, a histone mark specific for transcriptional silencing. This MORC2/HUSH-bound, L1-specific effect can spread to nearby genes, reducing mRNA expression and potentially affecting the RNA polymerase II (Pol II) elongation rate [109].

Certain kinds of TEs, particularly younger LINEs, have been discovered to affect chromatin accessibility in the livers of several inbred mouse strains, serving as a source of chromatin diversity. This demonstrates the ability of TEs to control tissue-specific genes, which may lead to phenotypic variability among populations [110]. Transposable elements can actively reorganize the chromatin structure to regulate gene expression over a lengthy period of time. About 10% of TE families have been discovered to be enriched in active genomic areas generally and across various organs. While L1 LINEs and ERV LTRs are the most often enriched TE classes in the repressed areas targeted with the H3K9me3 epigenetic mark, SINEs and DNA transposons are the most frequently enriched classes in the active chromatin regions [111].

Intriguingly, open euchromatin areas show the strongest epigenetic impact of TEs. A comparison of the epigenomes of two D. melanogaster strains showed, for example, that the enrichment of repressive epigenetic marks around euchromatic TEs is caused by the presence of the TEs rather than by their preferential insertion into genomic regions already enriched with repressive epigenetic marks. This pattern explained why TE-flanking alleles had lower transcript levels and greater histone 3 dimethyl lysine 9 (H3K9me2) enrichment than similar alleles without neighboring TE insertions [112]. Similarly, the analysis of epigenetic marks in flies with and without Bari-Jheh, a natural transposon that affects the expression of nearby genes, revealed significant differences in histone 3 trimethyl lysine 4 (H3K4me3), H3K9me3, and histone 3 trimethyl lysine 27 (H3K27me3) marking in relation to oxidative stress conditions, highlighting that this TE influences gene expression by affecting the local chromatin state. These examples imply that the expression of neighboring genes is significantly influenced by the various TE distributions seen in the germlines of different organisms, strains, and species [113] (Fig. 3A).

Fig. 3

The consequence of TE distribution for the epigenetic control of gene expression, due to changes in the methylation of histones and DNA, across species, tissues of the same organism, and stimuli or circumstances. A The epigenetic control of a particular gene is altered by the varied distribution of TEs in evolution. A TE controls gene expression by influencing the local chromatin state through changes in the methylation of histones and DNA. B The expression of a particular gene is affected by the differential redistribution of TEs in various cells and tissues of the same organism during development. C The expression of a certain gene is influenced by the relocalization of TE sequences in the same cell following a particular stimulus or circumstance.

The hypothesis that TEs have been co-opted and that their distributions have co-evolved with the control of gene expression is supported by the intriguing fact that varied TE distributions resulting from somatic transpositions also affect gene expression within the same individual [114]. Indeed, TE enrichment differs between tissues, and TEs carry binding sites for tissue-specific master transcription regulators [111]. The restriction of integration to open chromatin areas explains one aspect of TE targeting. In mammalian brains, somatic LINE insertions are abundant in the vicinity of neuronal genes.

The non-random and targeted tissue-specific distribution of TEs might be viewed as a way to genetically fix a landmark, which could result in an epigenetic regulation of neighboring gene expression, if we consider that TEs can be the target of epigenetic marks (Fig. 3B). The discovery that various environmental conditions cause L1 transposition through various basic helix-loop-helix PER-ARNT-SIM (bHLH/PAS) proteins raises the prospect of L1 insertions being targeted differently under various forms of stress [115]. Experimental data suggest that TE insertions are targeted in ways that go beyond the ostensible mechanistic need for accessible chromatin [116].

A temporal and functional hierarchy of transcriptional and epigenomic alterations in response to stress is established in Arabidopsis thaliana by the increased DNA methylation that silences TEs near environmental-induced genes [117]. It is feasible to suggest that, in response to particular stimuli, the mobilization and insertion of TEs may also be regulated in adult tissues and post-mitotic cells to drive the epigenetic regulation of particular genes (Fig. 3C).

Transposable elements in long-range regulation

Different families of TEs have acquired many binding sites for transcription factors during the course of evolution, resulting in various transcriptome landscapes [14]. The ENCODE (Encyclopedia of DNA Elements) data comprise roughly 2 million transcription factor binding sites (TFBSs) that coincide with putatively regulation-competent human retrotransposons. For example, these retrotransposons (44% SINEs, 33% LINEs, and 23% LTR/ERVs) are situated in a 5-kb gene promoter neighborhood [118]. According to the findings, SINE-derived TFBSs are more common than LINE-derived TFBSs outside of a 5-kb region close to the transcription start site, but the opposite is true within that region [118].
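A class breakdown of this kind comes down to an interval-overlap count around transcription start sites (TSSs). The following is a minimal sketch of that bookkeeping, not the method used in [118]; the coordinates, the single-chromosome simplification, and the function names are all hypothetical.

```python
from collections import Counter

WINDOW = 5_000  # 5-kb neighborhood on each side of a TSS

def near_tss(te_start, te_end, tss, window=WINDOW):
    """True if the half-open interval [te_start, te_end) overlaps
    the closed window [tss - window, tss + window]."""
    return te_start <= tss + window and te_end > tss - window

def class_shares(tes, tss_positions, window=WINDOW):
    """Percent share of each TE class among TEs near at least one TSS."""
    counts = Counter(
        te_class
        for te_class, start, end in tes
        if any(near_tss(start, end, tss, window) for tss in tss_positions)
    )
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()} if total else {}

# Hypothetical annotations on one chromosome, for illustration only.
tes = [
    ("SINE", 9_000, 9_300),
    ("LINE", 12_000, 18_000),
    ("LTR/ERV", 40_000, 41_000),   # far from both TSSs, so not counted
    ("SINE", 103_000, 103_300),
]
tss_positions = [10_000, 100_000]
shares = class_shares(tes, tss_positions)
```

A real analysis would work per chromosome and per strand on full annotation files, and would typically use an interval index rather than the all-pairs comparison shown here.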

In addition, while it has long been hypothesized that the repeat sequences that TEs disperse across genomes serve as a source of TFBSs that encourage the emergence of new gene regulatory networks [2], it has only recently become clear that the proteins encoded by TEs offer complementary pathways to the same result. The discovery that well-characterized transcription factors, such as the paired box (PAX) proteins, feature DNA-binding domains that appear to have arisen from transposases suggested that transposase capture may be a recurring theme in the formation of transcription factors [119].

Additionally, the pathways most significantly influenced by the various retrotransposon distributions have been connected to crucial processes such as cell stress and immunological responses, ribosome biogenesis, chromatin remodeling, DNA replication, mitotic spindle organization, and cell cycle progression [118]. Nishihara et al. [120] discovered that an evolutionarily conserved genomic region called AS3 9, made up of three TEs inserted side by side, serves as a distal enhancer for Wnt5a expression during the morphogenesis of the mammalian secondary palate.

Functional analyses have demonstrated that the AmnSINE1, X6b DNA, and MER117 elements were co-opted by a retroposition/transposition mechanism during the evolution of mammals. This co-option resulted in the acquisition of a specific Msx1 protein binding site within the X6b DNA sequence; Msx1, together with Wnt5a, is involved in palatogenesis. According to this study, the great variety of cis-regulatory elements (CREs) may have evolved through the combination of several TEs present in the same DNA segment [120].

TEs’ role as cis-regulatory elements in the genome

It is believed that the majority of CREs newly evolved during primate evolution are directly derived from TEs [121, 122]. Transposable elements frequently contribute to cis-regulatory elements, tissue-specific expression, and alternative promoters in zebrafish, according to epigenomic analysis [123]. In mammalian genomes, transposable elements are a significant source of various cis-regulatory sequences (Fig. 2). According to some studies, 20% of the CREs found in the human genome may have been taken from TEs [124, 125]. By offering binding sites for trans-acting factors, TEs significantly contribute to all cis-regulatory regions (promoters, enhancers, silencers, and insulators) in the human genome [122]. TEs serve as a reservoir for a variety of regulatory functions and are crucial to the evolution of many regulatory components. They either offer substitute enhancers and promoters or change the activity of the current promoters [126, 127].

It has been well established that TEs can be adapted as regulatory elements in the human genome and take on non-TE activities [128, 129]. The transcriptional activity of TE-derived sequences in regulatory elements has been empirically verified in several investigations [127, 130, 131]. According to one study, 75% of 35,007 promoters contained TE-derived sequences, with some promoters containing as many as ten TEs [132]. However, the same study found that only 6.8% of the TFBSs in promoters were TE-derived.

Studies have shown that TFs bind to TEs and that these elements contain TF-binding sequence motifs [125, 132, 140]. Early in development, young TE families (often LTR elements with embryonic TFBSs in their ancestral sequence) display highly specific transcriptional patterns [141, 142].

In somatic cells, TEs support cis-regulatory gene networks through the following mechanisms: overlap between the cis-regulatory programs of somatic cells and stem cells, retroviral hijacking of transcription factors expressed in different types of immune cells, or gain of somatic regulatory activity through TE sequence mutations that take place after genomic insertion [139, 143, 144].

Additionally, TE-derived regulatory sites are frequently species- or lineage-specific and contribute innovation and variety to speciation. Future thorough analyses covering all regulatory element types across a wide range of species should offer more information [127, 145].

Mobile element insertion polymorphisms have been found to be the most common structural variations in the human genome. Alu elements, L1s, and SVAs are the three groups of retrotransposons predominantly responsible for producing human TE polymorphisms [146,189].

Current clinical studies frequently target TEs or benefit from TE biology. Clinical studies that use checkpoint inhibitor treatment to direct immune signaling against renal, ovarian, colorectal, and melanoma malignancies, and that involve TE signaling pathways, are currently under way [190].

Both humoral and cell-mediated immunity have been investigated in relation to TEs. Anti-ERV-K antibodies have been linked to several malignancies, including ovarian cancer and melanoma, as well as to teratocarcinoma cell lines [191]. In cancer patients, T-cell-mediated and autologous humoral responses were found to aid adaptive immune activity against TEs, which positions TEs as new therapeutic targets.

Overexpression of transposable elements in different human diseases is due to demethylation of the TE loci [192]. TE transcription, however, is occasionally independent of DNA methylation [193], which raises the possibility of additional TE regulatory mechanisms involving non-coding RNAs and histone alterations. Human disorders are significantly influenced by RNA modification [194], and N6-methyladenosine (m6A) has been reported as one such modification of transposon RNA [195]. Transposons may therefore have a role in certain human disease mechanisms, which makes them attractive targets for treatment. In addition to RNA modifications, the uncharacterized DNA modifications of TEs merit further study. One such modification is m6dA, which is found in the human genome at certain locations, is linked to enhanced transcription activity, and has been implicated in cancer [196, 197]. The link between TE loci and biomarkers elevated in disease states would be an intriguing area for further research.

Limitation of the review

A limitation of this review is that it addresses the role of transposable elements in the regulation of gene expression only in general terms. As a narrative review, it is more descriptive than a systematic review and/or meta-analysis, which is supported by statistical analyses and can objectively answer a particular question. This review therefore presents the authors' own perspectives on a more general topic.

Conclusion and perspective

A number of recent discoveries support the increasingly clear active involvement of TEs in genome function and highlight their impact on the control of gene expression. In addition to providing ready-to-use TFBSs or undergoing mutations that generate binding motifs for TFs, TEs have inherent regulatory mechanisms for controlling their own expression. The regulatory elements of many genes contain TE sequences, which are involved in both short- and long-range regulation of gene expression. By actively taking part in the production of regulatory RNAs, TEs also contribute to the control of genes. There is still much to learn about the function of transposable elements in gene regulation and their therapeutic potential.