Introduction

All organisms on Earth, from bacteria to humans, survive and perpetuate their species through repeated rounds of cell growth and cell division. After a series of biochemical events, the components of a cell are replicated and then divided equally to form two daughter cells. Cell division and the correct segregation of genomic material between daughter cells are the basis of normal development, growth, and reproduction. The cell cycle is the mechanism of cell reproduction and is usually divided into four phases: G1, S, G2, and M. The DNA synthesis phase (S phase) and the mitotic phase (M phase) are separated by two gap phases, G1 and G2 [1, 2]. The division phase accounts for a very small proportion of the whole cell cycle, and the combined duration of the S, G2, and M phases is relatively constant. Therefore, the length of the cell cycle depends mainly on the G1 phase [3]. Cells can also enter a quiescent or dormant state called the G0 phase. Some cells in G0 remain quiescent for a long time until external cues (e.g., growth factors) stimulate them to re-enter the cell cycle and divide [4, 5]. Genome replication is a key step in the S phase. In this process, the replisome (the DNA replication machinery) must overcome many obstacles, such as R-loops, that otherwise lead to replication fork stalling and loss of genomic integrity [6]. The R-loop is a structure produced during transcription that consists of a single-stranded DNA (ssDNA) and an RNA: DNA hybrid [7]. Studies have shown that R-loops exist independently of the replication process, and that an R-loop formed at one stage of the cell cycle can be transmitted to another stage and to the next cell cycle [8]. R-loops are therefore present at all stages of the cell cycle and play crucial roles in many physiological and pathological processes [9, 10].
The physiological roles of the R-loop include immunoglobulin (Ig) class switch recombination (CSR), gene expression, DNA replication, DNA repair, and regulation of transcription initiation and termination [11]. However, the temporal and/or spatial accumulation of unscheduled R-loops in the genome has detrimental consequences, such as transcriptional defects, transcription-replication conflicts (TRCs), cell cycle arrest, and genomic instability [12,13,14,15,16].

It has recently become clear that the generation and resolution of R-loops are regulated by different factors at different cell cycle stages, and that coordination among these factors is essential for cells to maintain R-loop homeostasis and thereby protect their genomes [8, 17,18,19]. These factors cooperate to avoid the accumulation of unscheduled R-loops, either by resolving them or by preventing their formation [11, 20]. Notably, pathological R-loops caused by deficiencies or mutations in R-loop resolution factors may hinder cell cycle progression [21]. Additionally, abnormal accumulation of R-loops has been reported to lead to diseases such as neurological diseases and cancers [22]. Here, we review the relationship between the cell cycle and the R-loop, the R-loop resolution factors acting at different stages of the cell cycle, and the diseases caused by defects in these factors.

The cell cycle

The cell cycle is a highly conserved process. As mentioned above, it can be divided into the G0, G1, S, G2, and M phases [23]. The S phase is the DNA synthesis phase, while the M phase can be further divided into two processes: nuclear division and cytoplasmic division. To ensure the accurate replication and segregation of genetic material, cell cycle progression is tightly regulated by cyclin-dependent kinases (Cdks) and their regulatory cyclin subunits [1, 24, 25]. Different Cdks bind their partner cyclins at different stages to order the events of the cell cycle properly [26, 27]. The cell cycle has three important checkpoints, the G1/S checkpoint, the G2/M checkpoint, and the spindle assembly checkpoint (SAC), which serve as surveillance mechanisms to prevent genetic errors during cell division [26, 28, 29] (Fig. 1). The G1/S checkpoint controls the entry of cells from G1 into the DNA synthesis phase. The G2/M checkpoint is the control point that determines whether the cell proceeds to division. The SAC acts in the middle and late M phase to prevent chromosome separation until the sister chromatids are correctly attached to the mitotic spindle. Activation of checkpoints often results in cell cycle arrest, which provides time to repair DNA damage [15].

Fig. 1

The cell cycle and checkpoints. The cell cycle is an essential mechanism of cell proliferation and can be divided into five stages: stationary phase (G0), pre-DNA synthesis (G1), DNA synthesis (S), post-DNA synthesis (G2), and mitosis (M). The whole cell cycle is regulated by a variety of cyclins and cyclin-dependent kinases (Cdks). In addition, the G1/S checkpoint, the G2/M checkpoint, and the spindle assembly checkpoint play important roles in ensuring the normal progression of the cell cycle. R-loops can form spontaneously throughout the cell cycle and are most common during the S phase

The generation of R-loop

The R-loop is a three-stranded nucleic acid structure formed during transcription; it consists of a displaced single-stranded DNA and an RNA: DNA hybrid [30,31,32] (Fig. 2a). During transcription, the nascent RNA produced by RNA polymerase II (RNA Pol II) hybridizes with its template to form an RNA: DNA hybrid [33]. In highly transcribed genes, R-loops can be longer than 1 kb [34]. R-loops fall into two main categories. One is the physiological R-loop, which plays an active role in many physiological activities. The other is the pathological R-loop, which forms in an unscheduled manner and threatens the stability of the genome [11, 35]. How, then, is the R-loop generated? Under physiological conditions, a short RNA: DNA hybrid of about 8 bp is transiently produced when newly synthesized RNA pairs with the template DNA during transcription. A specialized protein moiety of RNA polymerases (RNAPs) then opens a channel through which the RNA is stripped from the DNA template strand, ensuring that the RNA: DNA hybrid produced during this process does not accumulate [36, 37]. Under pathological conditions, such as topoisomerase defects and RNA damage [38], R-loops can form abnormally and accumulate. Most unmethylated CpG island (CGI) promoters in the human genome show a pronounced strand asymmetry in the distribution of guanine and cytosine, which is called GC skew. Because of the greater thermal stability of G:C pairing, GC skew confers significant potential to form R-loops when the newly transcribed G-rich RNA strand anneals back to the C-rich template DNA strand [39, 40].
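As a rough illustration of GC skew, the sketch below computes a commonly used strand-asymmetry measure, (G − C)/(G + C), over sliding windows of a DNA sequence. The formula, the function names, and the window and step sizes are assumptions chosen for illustration; they are not taken from the studies cited above, which may define or threshold GC skew differently.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def windowed_gc_skew(seq: str, window: int = 100, step: int = 50):
    """Return (start, skew) pairs for sliding windows along the sequence.

    The window and step defaults are illustrative, not recommended values.
    A sequence shorter than the window is scored as a single window.
    """
    last_start = max(len(seq) - window + 1, 1)
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, last_start, step)
    ]


if __name__ == "__main__":
    # Toy non-template strand: a G-rich stretch followed by a C-rich stretch.
    toy = "GGGAGGGTGG" * 5 + "CCCTCCCACC" * 5
    for start, skew in windowed_gc_skew(toy, window=50, step=50):
        print(start, round(skew, 2))
```

A positive value on the non-template strand means that strand, and hence the nascent RNA, is G-rich relative to C, which is the orientation the text above associates with R-loop formation.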

Fig. 2

The structure of the R-loop and the process of transcription-replication conflicts. a The R-loop is a three-stranded nucleic acid structure consisting of an RNA: DNA hybrid and a displaced DNA strand (ssDNA). b In the S phase of the cell cycle, transcription and replication use the same DNA template, so the two processes often collide. Transcription-replication conflicts are of two types: head-on (HO) and co-directional (CD). In an HO conflict, RNA polymerase II collides with the replisome, which leads to the generation of R-loops. The consequences of a CD conflict differ from those of an HO conflict: a CD conflict can avoid the accumulation of R-loops

Repeat sequences in the genome also readily form R-loop structures. These sequences, called microsatellites or short tandem repeats, can generate secondary structures, including hairpins, cruciforms, triplex DNA, and G-quadruplexes (G4), which may obstruct DNA replication [41, 42]. Studies have shown that a variety of disease-related trinucleotide repeats, such as FRAXA (CGG)·(CCG) repeats, SCA1 (CAG)·(CTG) repeats, DM1 (CTG)·(CAG) repeats, and FRDA (GAA) repeats, can form stable R-loops in vitro upon transcription. The formation and maintenance of these R-loops require a negatively supercoiled template [43]. The human fragile X mental retardation 1 gene (FMR1) contains a (CGG)n trinucleotide repeat. Transcription of the GC-rich FMR1 5'UTR favors R-loop formation. In addition, expansion of the CGG repeats and the associated transcription increase the formation of R-loops, which are more likely to fold into complex secondary structures [44].

When R-loops accumulate excessively or form at the wrong position in the genome, they threaten the stability of the genome [12]. Pathological R-loops generally arise from deficiencies or mutations in proteins that are important for essential cellular processes.

The function of R-loop

The R-loop, which is present in all organisms, is a double-edged sword. On the one hand, it is crucial for several physiological processes, such as chromosome segregation during mitosis, CRISPR-Cas9 bacterial defense, immunoglobulin (Ig) class switching, DNA replication and repair, and regulation of transcription initiation and termination [45, 46]. On the other hand, when unscheduled R-loops form and accumulate, they cause a series of damage responses, including DNA double-strand breaks and genomic instability, which in turn lead to hypermutation, cell cycle arrest, and even cell death [47].

For example, telomeres are nucleoprotein structures at the ends of linear chromosomes in heterochromatin regions [48]. Telomeric repeat-containing RNA (TERRA) is a long noncoding RNA that forms physiologically relevant RNA: DNA hybrids at telomeres [49]. Telomere shortening caused by a lack of telomerase activity can lead to premature senescence, whereas the presence of R-loops on short telomeres helps to activate the DNA damage response (DDR) and promotes the recruitment of Rad51 recombinase, thereby preventing the early onset of senescence [50]. In contrast, in Saccharomyces cerevisiae, the accumulation of R-loops caused by hpr1 mutations leads to DNA breaks and activation of DNA damage checkpoints, which in turn leads to defective meiosis [10].

Multiple types of factors involved in R-loop regulation

Because RNA: DNA hybrids are an important source of structural changes in the genome, preventing the formation and accumulation of pathological R-loops is an effective way to maintain genomic stability. In this process, RNase H and a variety of RNA biogenesis factors act as guardians of genomic stability [51]. To reduce the DNA damage caused by R-loop accumulation, cells can either prevent R-loops from forming or resolve those that have already formed, using multiple types of factors [11]. Many factors that act in R-loop resolution have been identified so far, including ribonucleases, helicases, topoisomerases, DNA damage repair factors, transcription and RNA processing factors, chromatin remodeling factors, and RNA modifiers (Table 1). Specifically, ribonucleases degrade the RNA portion of RNA: DNA hybrids; the two common ones are RNase H1 and RNase H2 [52, 53]. RNA helicases such as DDX19, UAP56, and DDX1 have RNA: DNA unwinding activity in vitro and can unwind RNA: DNA hybrids [54]. DNA topoisomerases play an important role in R-loop prevention by relieving the torsional stress generated during transcription and replication. Topoisomerase 1 (Top1) and Topoisomerase 2 (Top2) can relax both positive and negative supercoils in DNA [55]. By relaxing DNA supercoils, Top1 prevents R-loop formation, which makes it an important regulator of R-loop homeostasis that ensures faithful replication and genomic integrity [56]. When Top1 is deficient, R-loop-driven replication stress occurs, which leads to DNA damage [57]. Many factors involved in DNA damage repair, such as BRCA1, BRCA2, and Fanconi anemia factors, also play important roles in R-loop resolution [58]. BRCA1 regulates the G2/M checkpoint by activating Chk1 kinase upon DNA damage [59].
It has been reported that BRCA1 can form a complex with Senataxin (SETX) to counter R-loop-related genomic instability at transcriptional termination pause sites [60]. In addition, many members of the DEAD-box (DDX) family have been reported to resolve R-loops. More than 35 DDX helicases have been identified in humans so far, and they are highly conserved [61]. The THO complex is a common transcription and RNA processing factor with important roles in transcription elongation, RNA processing, and RNA export. Inactivation of the THO complex causes R-loop accumulation, which in turn leads to genomic instability [62]. Some chromatin factors are also important for maintaining R-loop homeostasis and genomic stability. Studies have shown that chromatin remodeling complexes help to resolve R-loop-mediated TRCs. SWI/SNF, ISWI, CHD, and INO80 are the four major chromatin remodeling families [63, 64]. In addition, m6A RNA methylation is involved in regulating R-loop levels [65, 66]. Another enzyme, Flap endonuclease 1 (FEN1), helps maintain genomic stability by using its nucleolytic activity to resolve R-loops [67]. Finally, studies have shown that sterile alpha motif and HD domain-containing protein 1 (SAMHD1) and RNA-specific adenosine deaminase 1 (ADAR1) can resolve R-loops, thereby protecting the genome [68,69,70].

Table 1 Factors regulating R-loop homeostasis

The reproduction of all organisms relies on the cell cycle. Strict regulatory mechanisms operate at each stage to ensure that the cell cycle progresses smoothly, but the frequent emergence of unscheduled R-loops can disrupt it. It is therefore particularly important to understand which factors resolve R-loops at each specific stage of the cell cycle.

The relationship between the cell cycle and R-loop

Formation of R-loops during S phase due to transcription-replication conflicts

Studies have shown that R-loops can form spontaneously throughout the cell cycle and are a source of DNA damage. The S9.6 antibody is a widely used tool for the purification, analysis, and quantification of R-loops. It binds RNA: DNA hybrids with high affinity and in a sequence-independent manner [97]. Using flow cytometry, Barroso et al. measured the S9.6 immunofluorescence (IF) intensity of whole cells pretreated with RNase III to analyze the level of RNA: DNA hybrids at different cell cycle stages. They found that the fluorescence intensity increased significantly from G1 to G2, suggesting that de novo R-loop formation occurs from G1 to G2 [98].

Replication and transcription are normally carried out independently during the S phase of the cell cycle [99]. However, because the two processes share the same template, transcription-replication conflicts (TRCs) often occur under some conditions. TRCs can occur when the replication and transcription machineries meet in a head-on (HO) orientation, moving toward each other, or in a co-directional (CD) orientation, moving in the same direction [100] (Fig. 2b). The S phase of eukaryotic cells is the most fragile period, in which replication and transcription coexist temporally and spatially, so TRCs take place during the S phase [16]. Transcription is thought to proceed more quickly than replication: the transcription rate of RNA Pol II in mammalian cells is roughly 3.8 kb/min, whereas the average replication rate in human cells is 1.5-2 kb/min [101]. TRCs may result in replication fork blockage, premature termination of transcription, DNA damage, and recombination intermediates, all of which endanger the integrity of the genome. Further studies have demonstrated that the level of R-loops depends on the orientation of the TRC. HO collisions aggravate R-loop production, whereas CD collisions avoid R-loop accumulation. The explanation is that HO collisions, rather than CD collisions, may block transcription, thereby confining nascent RNA strands near the DNA template and promoting the formation of RNA: DNA hybrids [102, 103]. Furthermore, HO collisions can pause replication and cause high levels of recombination, whereas CD collisions do not. This difference may be due to the termination of transcription by RNA polymerase after a TRC; compared with head-on encounters, co-directional encounters can be avoided to some extent [104]. The degree of topological complexity caused by TRCs differs between bacteria and eukaryotes.
Bacteria have a single origin of replication, and their transcription-replication conflicts are mostly CD collisions [105]. In eukaryotes, chromosome replication initiates from multiple replication origins, and TRCs often occur as HO collisions, increasing the topological complexity [106]. Therefore, the genomic integrity of eukaryotic cells is easily threatened by conflicts between transcription and replication.
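To put the quoted rates in perspective, the following sketch works through the simple kinematics of a head-on versus a co-directional encounter. It assumes constant speeds with no pausing, and the 50 kb separation and the function names are hypothetical values introduced only for illustration; real polymerases and forks pause and change speed, so these numbers are not predictions.

```python
# Rates quoted in the text [101]; everything else below is an assumption.
TRANSCRIPTION_KB_PER_MIN = 3.8
REPLICATION_KB_PER_MIN = (1.5, 2.0)  # lower and upper bound


def head_on_encounter_minutes(gap_kb: float, v_txn: float, v_rep: float) -> float:
    """Time until a polymerase and a fork moving toward each other meet.

    Head-on: the gap closes at the sum of the two speeds.
    """
    return gap_kb / (v_txn + v_rep)


def codirectional_gap_change(v_txn: float, v_rep: float) -> float:
    """Rate (kb/min) at which the gap changes when a fork follows a polymerase.

    Positive means the polymerase pulls away from the fork behind it;
    negative means the fork catches up.
    """
    return v_txn - v_rep


if __name__ == "__main__":
    gap_kb = 50.0  # hypothetical starting separation
    for v_rep in REPLICATION_KB_PER_MIN:
        t = head_on_encounter_minutes(gap_kb, TRANSCRIPTION_KB_PER_MIN, v_rep)
        drift = codirectional_gap_change(TRANSCRIPTION_KB_PER_MIN, v_rep)
        print(f"replication {v_rep} kb/min: head-on meeting after {t:.1f} min; "
              f"co-directional gap changes by {drift:+.1f} kb/min")
```

Under these idealized assumptions a head-on pair always meets, whereas a fork trailing an unpaused polymerase falls further behind, so in this simplified picture a co-directional conflict would require the polymerase to slow or stall.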

Activation of cell-cycle checkpoints by R-loops

As described above, the cell cycle has several important checkpoints, including the G1/S checkpoint, the G2/M checkpoint, and the SAC. When cells encounter replication stress or other stresses, checkpoints can block the transition between cell cycle stages and protect the genome. For example, when replication is abruptly halted, cell cycle checkpoints tightly control the stability of stalled forks [107]. DNA damage activates checkpoints, which delay cell cycle progression and allow the damage to be repaired. In general, activation of DNA damage checkpoints means cell cycle arrest (Fig. 3). When the G1 and G2 checkpoints are activated, the cell initiates a mechanism that suspends the cell cycle. The response of S-phase cells to DNA damage is quite different: it does not induce an immediate cell cycle arrest but instead slows progression before the cell enters the subsequent stage [15]. Additionally, R-loop-mediated DNA damage triggers the activation of cell cycle checkpoints, which are indispensable for cell survival during replication stress [108]. Studies have shown that the mcm2DENQ mutation (an allele affecting Mcm2-7, the catalytic core of the replicative helicase and a part of the DNA replication checkpoint signaling cascade) leads to the formation of RNA: DNA hybrids in the S phase, which persist until the cell passes the spindle assembly checkpoint. These results indicate that the Mcm replicative helicase can prevent the accumulation of RNA: DNA hybrids and plays a crucial role in avoiding conflicts between transcription and replication [109, 110].

Fig. 3

Increased R-loop levels lead to DNA double-strand breaks and cell cycle arrest. When cells encounter replication stress or other stresses, checkpoints can block the transition between cell cycle stages and protect the genome. R-loop-mediated DNA damage triggers the activation of cell cycle checkpoints, which are indispensable for cell survival during replication stress

Factors that resolve the R-loop in the G1/S phase

The decision to transition from the G1 phase of the cell cycle to the S phase is crucial for normal development. During the G1 phase, cells decide whether to enter the cell cycle, initiating DNA replication and division, or to exit from it and enter stasis, senescence, or differentiation [111]. The maintenance of genomic integrity and the stable transmission of genetic information depend on many DNA repair processes. Failure to carry out these processes faithfully can lead to mutations and the development of genetic diseases [112]. In addition to DNA damage repair factors, RNA binding factors play a key role in preventing the genomic instability caused by R-loop accumulation. A common RNA binding factor is the THO complex, and other factors related to RNA processing, such as SETX/Sen1, DDX19, and DDX23, have similar functions in avoiding R-loop accumulation [17]. A recent study found that the THO complex effectively prevents R-loop accumulation in the G1 and S phases. The factors involved in resolving R-loops in the S phase are SETX/Sen1 and primase–polymerase (PrimPol) [8, 82] (Fig. 4).

Fig. 4

Cell-cycle-dependent R-loop resolution factors. Among the cell-cycle-dependent factors, the THO complex acts in both the G1 phase and the S phase. In addition, Sen1 and PrimPol can resolve R-loops in the S phase. In the G2/M phase, RNase H2 is the principal factor that degrades the RNA portion of the RNA: DNA hybrid.

The THO complex is a conserved eukaryotic complex that acts in transcriptional elongation and in RNA processing and export. It is very important for the formation of optimal messenger ribonucleoproteins (mRNPs) during transcriptional elongation. By ensuring the optimal packaging of mRNPs and by interacting with histone deacetylases (such as Sin3A), the THO complex may facilitate the transient closure of chromatin, thereby preventing R-loop formation [62]. In yeast, the complex consists of five interacting subunits: Tho2, Hpr1, Mft1, Thp2, and Tex1 [113]. In higher organisms, it has been found to contain six members (THOC1-THOC6) [114]. When Hpr1, one of the subunits of the THO complex, was knocked down, R-loops accumulated strongly in G1-arrested cells and after entry into the S phase, which suggests that the THO complex prevents R-loop formation in the G1 and S phases [8]. Deletion of THOC1 also leads to R-loop-dependent genomic instability [115]. Transcription has been reported to increase mutation and recombination in bacteria, yeast, and humans [116]. Transcription-associated recombination occurs mainly in the S phase and is related to the replication fork damage caused by TRCs [117]. The replication fork is a relatively fragile structure that often encounters obstacles that cause it to pause [118]. The increase in co-transcriptional R-loops caused by THO mutants is one of the factors that hinder replication fork progression [8]. R-loop accumulation produces excessive ssDNA, which triggers activation of the S phase checkpoint. In summary, R-loop-mediated DNA damage caused by THO mutants can activate the S phase checkpoint, which is necessary for maintaining the stability of genetic material under replication stress [108].

In the past few years, many RNA helicases have been reported to be involved in the dynamics and stability of R-loops. SETX is one of the earliest reported examples. It is the human homolog of yeast Sen1p and is involved in RNA maturation and termination [119]. SETX is an RNA/DNA helicase that is highly conserved throughout evolution and participates in biological processes ranging from transcription termination to the completion of meiosis and the maintenance of genome integrity [120]. It has been found that SETX retains the function of its yeast homolog in transcription termination but has also acquired properties specific to mammalian RNAPII [121]. Resolution by SETX of the R-loops formed at G-rich pause sites is a key step in transcription termination [122]. Studies have demonstrated that Sen1 protein expression is regulated by the cell cycle. By comparing Sen1 levels at different stages of the cell cycle, Mischo et al. observed that its expression increased in the S and G2 phases. They proposed that Sen1 levels are adjusted adaptively according to cellular requirements. At certain points in the cell cycle, such as when transcription encounters replication in the S phase, RNA polymerase II pauses and R-loops are more likely to form. Consequently, a higher level of Sen1 is required to resolve R-loops and maintain genome stability [123, 124]. SETX can resolve RNA: DNA hybrids, and its loss causes the accumulation of R-loops and DNA double-strand breaks, which in turn leads to genomic instability [125]. The conserved Sen1 helicase not only terminates non-coding transcription but also interacts with the replication machinery and is reported to resolve genotoxic R-loops [126].
Martin-Alonso et al. found that the THO transcription complex prevents R-loop formation in both the G1 and S phases, whereas the Sen1 RNA/DNA helicase prevents R-loop formation specifically in the S phase of the cell cycle [8].

Additionally, primase–polymerase (PrimPol) is an important player in R-loop resolution. Studies have shown that sequences prone to forming secondary structures, such as purine-rich (GAA)10 repeats, G-quadruplexes, and H-DNA motifs, can hinder replication, mainly because of the formation of RNA: DNA hybrids. Replication of these sequences requires PrimPol. Loss of PrimPol increases the level of unscheduled R-loops around these sequences, which then act as replication barriers. Thus, PrimPol can use its repriming function to prevent excessive exposure of single-stranded DNA in the S phase, thereby limiting R-loop formation in the S phase [82].

Factors that resolve the R-loop in the G2/M phase

At present, most studies of the R-loop rely on the S9.6 antibody. To test the specificity of this antibody, researchers usually carry out RNase H treatment, which cleaves the RNA in RNA: DNA hybrids. When RNase H is added, the R-loop signal is greatly reduced [127]. There are two types of RNase H: RNase H1 and RNase H2. Both localize to the nucleus and reduce the accumulation of R-loops in the genome by resolving RNA: DNA hybrids formed during transcription. In addition, RNase H2 plays a role in ribonucleotide excision repair (RER) [128]. In yeast, RNase H1 is encoded by RNH1 and RNase H2 is encoded by RNH201 [129]. The two enzymes differ in structure: RNase H2 is a heterotrimeric enzyme, whereas RNase H1 is monomeric. Compared with the monomeric enzyme, it is more difficult for the multi-subunit enzyme to exert its function. Studies have shown that the mRNA expression levels and activities of the two enzymes differ throughout the cell cycle. The mRNA level and enzyme activity of RNase H1 are constant across the cell cycle, whereas those of RNase H2 peak twice, in the S phase and in the G2/M phase. For this reason, RNase H1 is often used to verify the specificity of the S9.6 antibody in current R-loop research [19, 54, 71, 88, 94, 130]. The activity of RNase H2 increases when the RNH1 gene is deleted, suggesting that it plays a complementary role in resolving RNA: DNA hybrids [131]. A recent study showed that RNase H1 and RNase H2 differ in their regulation of RNA: DNA hybrids across the phases of the cell cycle. RNase H1 acts on RNA: DNA hybrids at all phases of the cell cycle, whereas RNase H2 exerts its function specifically in G2/M. These functional differences can also be attributed to the different chromatin association of the two enzymes.
When cells enter the S phase and the G2/M phase, the affinity of the RNase H2 subunits for chromatin increases, whereas RNase H1 is only weakly associated with chromatin throughout the cell cycle. In addition, RNase H1 can serve as an excellent stress sensor and respond to the stress caused by R-loop accumulation at any stage of the cell cycle [19] (Fig. 4). At telomeres, RNase H2 interacts with the telomere binding factor Rif2 and is recruited to telomeres to degrade TERRA R-loops in the late S phase, in coordination with telomere replication [50].

Moreover, Zimmer et al. demonstrated that RNase H1 and RNase H2 protect different regions of the genome from R-loop-mediated damage. Although both enzymes are evolutionarily conserved, RNase H2 has a global function in preventing the chromosome instability caused by RNA: DNA hybrids, whereas RNase H1 has a region-specific function [132]. Taken together, RNase H1 and RNase H2 show significant spatio-temporal differences in R-loop resolution.

Factors that resolve the R-loop throughout the cell cycle

The R-loop is ubiquitous throughout the cell cycle, that is, in the G1, S, G2, and M phases. In addition to the above-mentioned factors that prevent R-loop formation or promote R-loop resolution at a specific stage, some other factors are expressed and function in a cell cycle-independent manner. Examples include RNase H1 and UAP56/DDX39B [19, 62] (Fig. 5a, b).

Fig. 5

Cell-cycle-independent R-loop resolution factors. a Regardless of cell cycle stage, RNase H1 can reduce R-loop accumulation by cleaving the RNA portion of the hybrid, which is attributed to the constant high expression level of RNase H1 throughout the cell cycle. b UAP56/DDX39B is an effective helicase that prevents and/or eliminates co-transcriptional R-loops. Its role is to release the nascent RNA from the DNA, making it available for further processing

It has been shown that RNase H1 is very important for replication fork movement because it resolves R-loops. Depletion of RNH1 leads to the accumulation of RNA: DNA hybrids, slower replication fork movement, and increased DNA damage [133]. RNase H1 is considered a stress sensor that responds to R-loop accumulation when it reaches a toxic level, regardless of the cell cycle stage. Nguyen et al. showed that replication protein A (RPA) can recognize ssDNA and interact with RNase H1, enhancing the binding of RNase H1 to RNA: DNA hybrids and stimulating its activity [134]. Another factor, UAP56/DDX39B, is a partner of the THO complex and is involved in preventing the accumulation of transcription-induced R-loops [62]. UAP56 has been shown to have strong RNA: DNA helicase activity, which separates RNA: DNA hybrids and releases the nascent RNA molecules to ensure the smooth progress of RNA processing and export [54]. Pérez-Calero et al. detected R-loop accumulation and DNA damage throughout the cell cycle after UAP56 depletion, and overexpression of wild-type UAP56 rescued the accumulation of RNA: DNA hybrids and the R-loop-related genomic instability. This indicates that UAP56 regulates R-loop homeostasis throughout the cell cycle. Furthermore, by measuring the distribution of RNA: DNA hybrids after UAP56 depletion, the authors found that R-loops accumulated in the promoter region of antisense RNAs, throughout the gene body, and in the transcriptional termination region, confirming the global function of UAP56 in preventing R-loop accumulation [54, 135].

Relationship between R-loop and diseases

As mentioned above, the R-loop plays an important role in many physiological processes, but its aberrant presence in the genome is associated with several diseases. Some trinucleotide repeat-associated diseases, neurological diseases, and cancers have been reported to be related to the R-loop [136] (Table 2). Many properly functioning proteins are essential for preventing R-loop-induced DNA damage and genomic instability. When these proteins are mutated or deficient, R-loop homeostasis is perturbed and disease can result.

Table 2 R-loop and Links to Human Disease

Trinucleotide repeats readily form RNA: DNA hybrids in vivo, and the expansion of these repeats results in human diseases. For example, CGG repeat expansion is associated with type 1 myotonic dystrophy (DM1), type 2 myotonic dystrophy (DM2), spinocerebellar ataxia type 8 (SCA8), and other diseases [152].

Aicardi-Goutières syndrome (AGS) is an autosomal recessive genetic disease with typical clinical manifestations of neurological dysfunction. It has been reported that AGS is caused by mutations in genes encoding intracellular nucleic acid metabolic enzymes, including the 3'→5' DNA exonuclease TREX1, RNase H2, SAMHD1, and ADAR1 [153]. In humans, the three RNase H2 subunits, RNASEH2A, RNASEH2B, and RNASEH2C, are all required for enzymatic activity. Crow et al. found that mutations in any of the genes encoding the three subunits of the RNase H2 complex can lead to AGS, in which the level of RNA: DNA hybrids is increased, suggesting that the formation of abnormal RNA: DNA hybrids contributes to AGS [139, 154]. RNase H2 deficiency in humans can also cause other autoimmune diseases and cancers, such as systemic lupus erythematosus, skin cancer, and colorectal cancer [155].

Amyotrophic lateral sclerosis (ALS) is a common neurodegenerative disease. Type 4 amyotrophic lateral sclerosis (ALS4) is a rare autosomal dominant form of ALS that begins in childhood or adolescence. Its clinical features are similar to those of other forms of ALS, including limb weakness and muscle atrophy, but the disease progresses slowly [156, 157]. The pathogenesis of ALS4 is still unclear, but studies have shown that it is caused by mutations in the SETX gene. Mutations in SETX also cause ataxia with oculomotor apraxia type 2 (AOA2), another neurological disease [119, 156].

Genomic instability is an important hallmark of cancers, and the unscheduled accumulation of the R-loop is associated with genomic instability. Therefore, there is a potential association between the R-loop and the development of cancers [22, 136]. BRCA1 and BRCA2 are two breast cancer susceptibility genes, and both are tumor suppressor genes. Mutations in either gene increase the risk of cancers [137]. Studies have also shown that BRCA1 and BRCA2 play an important role in preventing the accumulation of the R-loop in the genome, and that their mutation or deletion can lead to increased R-loop levels and DNA damage [83, 158]. In general, the accumulation of the R-loop is one of the sources of genomic instability in cancer cells.

Recent studies have shown that RNA: DNA hybrids not only exist in the nucleus but also accumulate in the cytoplasm. When SETX or BRCA1 mutations cause nuclear R-loop dysregulation, the hybrids that accumulate in the cytoplasm are sensed by the immune receptors cGAS and TLR3, which in turn activate IRF3-mediated immune signaling and apoptosis. Studies have also shown that cytoplasmic RNA: DNA hybrids derived from the R-loop are associated with human diseases, such as ataxia with oculomotor apraxia type 2 (AOA2), Aicardi-Goutières syndrome, and cancers [158, 159].

Conclusions and perspectives

The regulation of the cell cycle is a complex process that requires the interaction of many proteins, cytokines, and cell cycle signaling pathways to ensure its proper progression. The hallmark event of cell proliferation is DNA replication, which mainly occurs in the S phase. At this stage, the transcription and replication machineries share the same DNA template strand, so collisions between them occur more frequently and result in the generation of the R-loop, which in turn affects the progression of the replication fork. Although the physiological existence of the R-loop plays an important role in normal cell activities, its accumulation can threaten the stability of the genome, which is related to the occurrence and development of many diseases [160].

Under physiological conditions, the regulatory factors responsible for the formation and resolution of the R-loop need to work in coordination to maintain R-loop homeostasis throughout the cell cycle. At the same time, in response to R-loop-induced DNA damage, cell cycle checkpoints delay cell cycle progression or induce cells to withdraw from the cell cycle, thereby avoiding gene deletion, mutation, or genomic instability.

Although the R-loop is involved in a variety of disease processes, cell cycle-based dysregulation of the R-loop appears to have a greater impact on cancer. Cancer is a malignant disease characterized by unlimited proliferation and DNA replication, which increase the use of the genome as a template and raise the frequency of TRCs [161]. It has been reported that TRCs promote the production of the R-loop. Studies have shown that the R-loop accumulated in the nucleus can, to a certain extent, be released into the cytoplasm and sensed by immune receptors, which provokes anti-tumor immunity, namely the production of type I interferon (IFN-I) [158, 162]. In addition, cancer genomes carry numerous somatic mutations, including mutations in genes involved in R-loop resolution, such as BRCA1. BRCA1 deficiency increases not only the risk of breast and ovarian cancers but also the level of the R-loop. Because BRCA1 plays a key role in R-loop resolution, its deficiency causes the accumulation of the R-loop in cancer cells, which in turn leads to genomic instability, replication stress, and loss of viability. Therefore, the R-loop may be a potential target for cancer therapy [163, 164]. Together, it is conceivable that the combination of TRCs and BRCA1 deficiency synergistically elicits potent anti-tumor immunity and leads to cell cycle arrest and even cell death, which may represent a therapeutic vulnerability of BRCA1-defective cancers.

Among the aforementioned factors implicated in R-loop formation and resolution, only a few have been shown to exert their functions at specific cell cycle stages. Further work is needed to identify more factors that act at each stage of the cell cycle, especially in S phase. Because human tumors carry a large number of mutations [161], it is important to investigate how function-altering mutations in a specific R-loop-related factor influence cell cycle progression. Many R-loop-related factors exert their function in a context-dependent manner [81]. Because the context changes constantly during cell cycle progression, it will be challenging to elucidate how these factors adapt to the changing context to ensure that the cell cycle progresses smoothly.