Background

Transposable elements (TEs) are mobile endogenous genetic elements predominantly transmitted through vertical transfer (i.e., from one generation to the next) like any other gene. In most metazoan germlines, transposition is repressed during gametogenesis by specialized 23–29-nt small RNAs associated with PIWI proteins [1,2,3]. These small RNAs have been named PIWI-interacting RNAs or piRNAs. They are produced from numerous heterochromatic loci, called piRNA clusters, that mainly contain TE fragments coming from ancient insertions serving as libraries of mobile sequences to repress. Among the 140 ovarian piRNA clusters of Drosophila melanogaster, only a few of them (the flamenco somatic and the 42AB, 38C, 80F, 20A germline clusters) have been extensively studied to identify factors involved in piRNA-dependent silencing [4,5,6,7,8,9]. It is well established that germline piRNA clusters are transcribed by a specialized RNA polymerase II complex that contains Moonshiner (Moon) recruited to the locus through its interaction with a complex containing the HP1 homolog Rhino (Rhi), Deadlock (Del), and Cutoff (Cuff) forming the RDC complex [6, 10]. Furthermore, only maternally inherited piRNAs participate in the transgenerational memory of TE sequences to repress as no paternally inherited piRNAs have been found [11, 12]. Although our understanding of the molecular events involved in the maintenance of active piRNA clusters through generations has expanded substantially, major gaps still exist especially in the early events of functional piRNA cluster establishment.

In gonads, 21-nt siRNAs can also be synthesized from dual-strand piRNA clusters [13]. They are usually produced from double-stranded transcripts that are recognized by the Dcr-2 nuclease, and once loaded onto the Ago2 protein, they can induce the cleavage of complementary RNA targets [14]. However, their germline function is not clear, as their loss does not significantly affect viability, fertility, TE repression, or piRNA cluster maintenance [15].

Rarely, TEs are also transmitted through horizontal transfer (HT), that is, DNA transmission between unrelated individuals. It has, however, been noted that HTs are more frequent than originally thought [16,17,18], raising the question of how and with what dynamics new piRNAs are produced by naïve genomes in the absence of maternal inheritance of complementary piRNAs. One of the best documented HTs is that of the P element, which successfully invaded the genomes of natural populations of D. melanogaster within two decades during the twentieth century and of D. simulans since the beginning of the twenty-first century [19,20,21]. In D. melanogaster, the subtelomere of the X chromosome (cytological site 1A) is a hot spot of P insertions [22,23,24,25]. This locus, also known as telomeric-associated sequences (X-TAS), is one of the Drosophila piRNA clusters (hereafter called cluster 1A) and can be dispensable in laboratory environments [26]. Cluster 1A contains repeats with regions sharing homologies with the autosomal subtelomeric piRNA clusters of the 2R and 3R chromosomes (clusters 60F and 100F) and a 0.9-kb region called T3 that is not present elsewhere in the genome, a unique feature among all known piRNA clusters [26]. From a P copy inserted in cluster 1A, P-derived piRNAs capable of repressing euchromatic active P elements are produced in the female germline [11, 27,28,29,30,31]. Moreover, lacZ-encoding P transgenes inserted in cluster 1A (e.g., P(lArB) in the P-1152 strain) have been shown to silence female germline expression of another P-lacZ transgene located in euchromatin [32, 33]. This euchromatic P-lacZ has served as a reliable reporter system (or “piRNA sensor”) for studying functional piRNA biology, as its silencing depends on piRNA biogenesis factors [5, 12, 34,35,36,37,38] but not on siRNA biogenesis factors [15, 34]. Indeed, using this sensor, we have shown that repression follows an ON/OFF mode, where egg chambers show either strong (ON) or no (OFF) lacZ silencing [34]. We have also shown that when subtelomeric P(lArB) transgenes are paternally inherited, the number of fully repressed egg chambers in the first generation is low and increases progressively to reach a maximum level of repression after five generations [34].

We have also found that a naïve locus made of seven tandemly repeated P(lacW) transgenes in the BX2 strain, which has been maintained as a non-piRNA producer over the years, can be fully converted into a stable piRNA cluster in one generation by maternally inherited piRNAs matching the whole length of the transgenes, uncovering a stable case of epigenetic conversion called paramutation [12]. The switch from a naïve locus to an active one able to produce piRNAs, hereafter referred to as “conversion,” is associated with an enrichment of H3K9me3 [5, 39]. Such functional conversion can occur when the locus producing maternal piRNAs is located on a different chromosome and is only partially homologous to P(lacW) [15]. Moreover, ovarian small RNA analyses revealed that conversion of the full length of P(lacW) is complete when tested after the third generation [15]. These results, along with others, suggested that the piRNA machinery is eventually able to co-opt a sequence absent from the maternal piRNA repertoire and to produce de novo piRNAs from this new sequence [15, 40,41,42]. A similar scenario could happen when a naïve genome faces HT of new TEs that insert into piRNA clusters. At first, the TE copy newly integrated into a piRNA cluster is surrounded by sequences that are targeted by maternal piRNAs (Additional file 1: Fig. S1). Ultimately, new piRNAs against this TE will be synthesized and will be able to repress active euchromatic copies. The rarity of such events and the repetitive nature of piRNA clusters have made it difficult to directly address the precise latency of these co-option processes and to identify the early molecular events involved (Additional file 1: Fig. S1C).

We report here a study in which we have modeled a TE neo-insertion into a piRNA cluster in a naïve genome to examine the kinetics of production of new specific piRNAs and their capacity to repress from the first generation, which is necessary to protect genome integrity. To model such an event, we have used several transgenes derived from the P transposon, either inserted in different piRNA clusters or inserted in euchromatin and working as piRNA sensors. Using the paternal origin of transgenes inserted in piRNA clusters, we have been able to correlate the emergence of new piRNA production with silencing capacity using functional assays from the very first generation. We have also shown that the kinetics of co-option by a piRNA cluster, leading to the conversion of a sequence, could depend on intrinsic properties of that sequence such as its length. We have found that all regions of the transgene are converted concomitantly with the same efficiency at each generation, but that this conversion is restricted to sequences nested in piRNA clusters, as previously shown [43], revealing an active mechanism preventing cis-propagation of piRNA clusters to their flanking regions [44]. By studying more specifically a germline subtelomeric piRNA cluster, cluster 1A, from the P-1152 strain containing P(lArB) and T3 sequences, we have found that piRNA clusters can be internally heterogeneous, as their domains can exhibit different rates of conversion and different piRNA profiles (symmetrical and asymmetrical dual-strand clusters) associated with chromatin and transcription variations along the locus. Altogether, this study brings new insights into piRNA cluster dynamics.

Results

Functional conversion of paternally inherited subtelomeric transgenes completed within four generations is associated with piRNA synthesis

It was previously observed that silencing of a lacZ sensor induced by subtelomeric P(lArB) transgenes inserted in cluster 1A was female germline-specific, showed a maternal effect with variegated ON/OFF repression of lacZ in egg chambers (between 80 and 100% of repressed egg chambers), and depended on the piRNA biogenesis pathway [12, 34, 36, 38, 45]. By contrast, paternally inherited P(lArB) induced lacZ silencing in only a few ovarian egg chambers in the first generation (between 5 and 35%), a proportion that increased in subsequent generations [34]. The progressive increase in the number of repressed egg chambers per ovary suggested that the amount of lacZ piRNAs per ovary produced by the subtelomeric P(lArB) was increasing proportionally at each generation.

To test this model, we set up reciprocal crosses between the P-1152 and the Canton strains. P-1152 contains two P(lArB) transgenes inserted in cluster 1A (Fig. 1A, B). Canton lacks cluster 1A (Δ-1A strain) and is devoid of P-derived transgenes (Additional file 1: Fig. S2A, Table 1, Additional file 2: Table S1 [26]). However, Canton carries the autosomal subtelomeric piRNA clusters (clusters 60F and 100F) that produce piRNAs targeting the regions shared between cluster 1A and the autosomal clusters (T1, T2, T4, and INV-4, Fig. 1A [26]). For these crosses, P(lArB) were first paternally or maternally inherited and then maternally maintained as hemizygous in the successive generations to establish the paternal and the maternal lineages (PI and MI, respectively) (Fig. 1C). Four independent replicate lines were generated for each lineage. Ovarian lacZ silencing was assayed at each generation by crossing PI and MI females with males containing the lacZ sensor (Additional file 1: Fig. S2B). Ovarian X-gal staining of the progenies allowed us to quantify the level of conversion of P(lArB) into an active piRNA-producing locus. When P(lArB) were paternally inherited (PI), the first generation showed a limited proportion of repressed egg chambers (G1; 9.25%) that progressively reached the same level of repression as the maternal lineage after four generations (> 90%, Fig. 1D and Additional file 2: Table S2), reminiscent of previous results with another lacZ sensor [34]. In parallel, to correlate the level of repression with the production of subtelomeric P(lArB) piRNAs, ovarian small RNAs were sequenced, and the normalized 23–29-nt P(lArB) small RNAs were analyzed at each generation. At G1, the number of 23–29-nt small RNAs in PI females was low compared to MI females; it increased progressively at each generation and reached a plateau at G4, corresponding to the same amount of 23–29-nt small RNAs as in MI females (Fig. 1E, F, Additional file 1: Fig. S3A and Additional file 2: Table S3). This correlates with the lacZ repression (Fig. 1D). The enrichment in uridine as the first nucleotide of the 23–29-nt small RNAs (1U bias) (Fig. 1F) and an enrichment of ping-pong pairs (Fig. 1E), two signatures of germline piRNA biogenesis, together with previous mutant analyses, reinforced the assumption that these small RNAs are genuine piRNAs [5, 12, 34, 36, 38]. Therefore, these results correlate the lacZ silencing efficiency with piRNA amount. Furthermore, the distribution of sense and antisense piRNAs along P(lArB) corresponds to a dual-strand piRNA cluster profile (Fig. 1E). These results were reproduced in different genetic backgrounds (Canton and w1118, Additional file 1: Fig. S4) and with another P-derived transgene inserted in the autosomal subtelomeric piRNA cluster 100F (Additional file 1: Fig. S5). Interestingly, in PI, piRNA synthesis increased homogeneously for all regions, whatever their position within P(lArB), with roughly the same kinetics over generations. The most distal parts of the transgene, adjacent to sequences targeted by maternally inherited piRNAs, do not show a quicker conversion process than the internal parts. These results support the hypothesis that the kinetics of conversion are roughly the same for all sequences, independently of their position within the transgene, their sequence origin (Drosophila or E. coli), or their nature (genes or P-derived sequences) (Fig. 2, Additional file 1: Fig. S3C-G). This is in accordance with the capacity of Rhino and its partners to erase internal transcriptional signals, leading most likely to a uniform production of piRNAs independently of the sequence origin [4, 6, …].
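The 1U bias used above as a piRNA signature can be illustrated with a minimal Python sketch. It assumes that reads are available as plain sequence strings (with T standing for U, as in DNA-encoded sequencing output); the function name `fraction_1u` and the toy reads are hypothetical and do not come from the analysis pipeline of this study.

```python
def fraction_1u(reads, min_len=23, max_len=29):
    """Fraction of size-selected reads whose first nucleotide is U (or T)."""
    # Keep only reads in the piRNA size range used in the text (23-29 nt).
    selected = [r.upper() for r in reads if min_len <= len(r) <= max_len]
    if not selected:
        return 0.0
    with_u = sum(1 for r in selected if r[0] in ("U", "T"))
    return with_u / len(selected)


# Hypothetical toy reads, for illustration only.
toy_reads = [
    "T" + "A" * 24,  # 25 nt, starts with T: counted as 1U
    "U" + "C" * 25,  # 26 nt, starts with U: counted as 1U
    "G" + "A" * 23,  # 24 nt, does not start with U
    "T" + "A" * 20,  # 21 nt, outside the 23-29-nt window, ignored
]
print(fraction_1u(toy_reads))
```

In this toy set, two of the three size-selected reads start with U or T; the 21-nt read is excluded before the fraction is computed.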

Table 1 Strains used in this study
Fig. 2

Normalized 23–29-nt reads on P(lArB) regions in MI and PI lineages. Normalized ovarian 23–29-nt reads mapping to different regions of P(lArB): plasmid (A), rosy (B), Adh (C), lacZ (D) genes, and the P-derived sequences (E) in the MI H and PI D sublines (Additional file 2: Table S3). All the sequences, either exogenous (lacZ, plasmid, and P) or endogenous (rosy, Adh) from the Drosophila genome, show similar kinetics, producing a progressively increasing number of 23–29-nt small RNAs over generations

Consistent with an HT of TE into a naïve genome, G1 males of P(lArB)-PI contribute only to DNA transgenic copies without contributing to complementary piRNAs inheritance. Moreover, our results suggest that maternally inherited piRNAs produced by the autosomal subtelomeres in Δ-1A females can target and convert progressively across generations the subtelomeric repeats surrounding P(lArB) insertions (Fig. 1E, F and Additional file 1: Fig. S1).

P(lArB) and T3 show distinct conversion dynamics

Cluster 1A is composed of repeats shared with other piRNA clusters in autosomal subtelomeres and of a unique 0.9-kb T3 domain not found outside of cluster 1A (Fig. 1A) [26]. In view of the progressively increasing number of P(lArB) piRNAs in the paternal lineage (Fig. 1F), we hypothesized the same dynamics over four generations for a paternally inherited T3 domain. To test this, ovarian small RNA libraries were reanalyzed by aligning the 23–29-nt reads to the T3 domain. Unexpectedly, the same amount of normalized T3 23–29-nt small RNAs was found in both paternal and maternal lineages from the first generation (Fig. 1E, G, Additional file 1: Fig. S3B and Additional file 2: Table S3). This was observed using either the Canton or the w1118 Δ-1A genetic background (Additional file 1: Fig. S4D). As in our previous work, the 23–29-nt T3 small RNAs were mostly produced from one strand, leading to an asymmetrical dual-strand piRNA cluster, and showed an enrichment in uridine at the first nucleotide (1U bias) [26] and a bias toward a 10-nucleotide overlap among small RNA pairs (a ping-pong signature, Fig. 1E, G).
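The ping-pong signature mentioned above, a 10-nucleotide overlap between the 5′ ends of small RNAs from opposite strands, can be sketched in Python as follows. This is only an illustration under stated assumptions: reads are assumed to be already mapped, with 0-based half-open coordinates, so that the 5′ end of a plus-strand read is its start and the 5′ end of a minus-strand read is the base just before its end coordinate. The function name and the toy coordinates are hypothetical.

```python
from collections import Counter


def five_prime_overlap_histogram(plus_starts, minus_ends, max_overlap=29):
    """Count plus/minus read pairs for each 5'-to-5' overlap length.

    plus_starts: start coordinates of plus-strand reads (0-based).
    minus_ends:  end coordinates (exclusive) of minus-strand reads.
    A pair overlaps by k nt at its 5' ends when minus_end - plus_start == k.
    """
    minus_counts = Counter(minus_ends)
    histogram = Counter()
    for start in plus_starts:
        for k in range(1, max_overlap + 1):
            histogram[k] += minus_counts[start + k]
    return histogram


# Hypothetical mapped reads, for illustration only.
plus_starts = [100, 100, 250]
minus_ends = [110, 110, 260, 255]
hist = five_prime_overlap_histogram(plus_starts, minus_ends)
print(hist[10], hist[5])
```

A ping-pong signature corresponds to an excess of pairs at an overlap of 10 relative to the other overlap lengths; how that excess is scored (for example as a z-score) is a separate choice that is not made here.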

To test if T3 small RNAs were functional for repression from the first generation, we designed a P-derived transgenic sensor using the red fluorescent protein (RFP) reporter gene transcriptionally fused to the T3 domain and expressed under the control of the UASp sequences (pRFP-T3). RFP expression was induced in female germline by the Gal4 protein expressed under the control of the nanos (nos) promoter (Fig. 3A). Female germline expression of pRFP-T3 was observed in the Δ-1A w1118 background confirming the absence of other T3 piRNA sources (Fig. 3B). When the P-1152 strain was used as a donor of T3 small RNAs, almost complete silencing of pRFP-T3 was observed induced by small RNAs produced by both the T3 domain and P-derived sequences of P(lArB) (Additional file 1: Fig. S6A and B). To avoid this, we used the Oregon strain, as a donor of T3 small RNAs [26], because this strain is devoid of P-derived sequences (P transgene or natural P element) and carries cluster 1A [1, 26]. Ovarian pRFP-T3 expression was repressed when cluster 1A of the Oregon strain was maternally inherited (100% of RFP repressed egg chambers (n = 1008), Fig. 3C and Additional file 1: Fig. S6C) suggesting that small RNAs produced from T3 are indeed fully functional germline repressors. Moreover, the paternally inherited T3 locus from Oregon was also able to strongly repress pRFP-T3 from the first generation (99.6% of RFP repressed egg chambers (n = 748), Fig. 3D). Unlike P(lArB), T3 paternal allele is functionally converted in a single generation in all cells, and in direct correlation with the amount of T3 small RNAs detected in PI G1 females (Fig. 1E, G and Additional file 1: Fig. S3B). 
To confirm that the pRFP-T3 silencing was piRNA mediated, we knocked down germline expression of Piwi and Nxf2, two co-transcriptional silencing factors of the piRNA pathway; Bootlegger (Boot), which recruits nuclear export factors like Nxf3-Nxt1 to piRNA cluster loci [1, 47]; and Moon, a subunit specific to the germline piRNA cluster RNA polymerase [10]. Figure 3E shows that RFP silencing was strongly affected by the knockdown of these factors, supporting the notion that 23–29-nt T3 small RNAs are functional piRNAs targeting the pRFP-T3 reporter in the female germline. These knockdowns also affected lacZ sensor silencing induced by subtelomeric P(lArB) (Additional file 1: Fig. S7).

Fig. 3

Cluster 1A relies on the germline piRNA pathway. A Schematic representation of the experimental cross: Oregon flies containing cluster 1A carrying T3 and the nos-Gal4 germline driver were crossed with w1118 flies devoid of cluster 1A (Δ-1A) but expressing the pRFP-T3 sensor. B Strong ovarian germline RFP expression of progenies from control nos-Gal4 females crossed with males encoding the pRFP-T3 sensor in the absence of cluster 1A. Maternally (C) or paternally (D) inherited T3 strongly represses ovarian germline expression of the pRFP-T3 piRNA sensor. E Ovarian pRFP-T3 repression of maternally inherited T3 is strongly affected by germline knockdown of piwi, nxf2, boot, and moon (piwi-KD, nxf2-KD, boot-KD, moon-KD). Knockdown for white served as control. Repression was assayed by counting the percentage of RFP-silenced egg chambers at stages 8–10. The total numbers of counted egg chambers are indicated in parenthesis. RFP expression is in red, and DAPI staining, indicating nuclei, is in white. Parental crosses are indicated above micrographs

Thus, paternally inherited P(lArB) and T3 are piRNA-producing sequences that display different rates of conversion as well as different piRNA distribution profiles (symmetric dual-strand for P(lArB) and asymmetrical dual-strand for T3), although they are located in the same locus and dependent on the same piRNA pathway. Functionally, these results might indicate that piRNA cluster activation is dependent on maternal piRNA inheritance at each generation that targets the flanking subtelomeric regions (Additional file 1: Fig. S1), but also on properties of sequences present within the locus.

Sequence length could influence the conversion efficiency

The contrasting conversion rate observed in G1 between the paternally inherited 0.9-kb T3 domain and the 18-kb P(lArB) suggests that short sequences could be converted more efficiently than longer ones. To test whether the sequence length could influence the frequency of conversion, we used several strains: the seven tandemly repeated P(lacW) transgenes in the BX2 strain that can be converted into an active piRNA cluster by complementary maternal piRNAs [12, 15], the P-1152 strain, and the RS3 strain that carries the P(RS3) transgene inserted in the autosomal 3R subtelomeric piRNA cluster, cluster 100F (Fig. 4, Table 1 and Additional file 2: Table S1). Both the P(lArB) and P(lacW) transgenes encode the 3.5-kb E. coli lacZ gene and the 1.8-kb bacterial plasmid backbone (Fig. 4A, C). The P(RS3) and P(lacW) transgenes both encode the 4.1-kb white gene (Fig. 4B, D). In addition, P(lArB), P(RS3), and P(lacW) have in common the 5′ (0.58 kb) and 3′ (0.23 kb) distal regions of the P element.

Fig. 4

Conversion of the P(lacW) transgenes by P(lArB) or P(RS3). Diagrams of P(lArB) inserted in cluster 1A (A) and P(RS3) inserted in cluster 100F (B). piRNAs produced by both clusters are represented by small colored lines below the transgenes. Crosses to convert the seven P(lacW) transgenes inserted in tandem by P(lArB) (C) or by P(RS3) (D). Complementary maternal piRNAs produced by either P(lArB) or P(RS3) are indicated above the P(lacW) scheme. Normalized 23–29-nt reads mapping to lacZ (E, H), white (F, I), and plasmid sequence (G, J). When P(lacW) transgenes are activated by P(lArB), the density of 23–29-nt small RNAs between G1 and G4 is similar for lacZ (E) and the plasmid sequence (G) and 2.25-fold higher for white (red box, F). When P(lacW) transgenes are activated by P(RS3), the density ratio of 23–29-nt small RNAs is close to 1 between G1 and G7 for all the domains of P(lacW) (1.3 for lacZ (H), 0.8 for white (I) and 1.3 for the 1.8-kb plasmid region (red box, J) that is not targeted by maternal piRNAs in G1). The density of normalized 23–29-nt reads per kb (reads/kb) and the fraction of 1U bias at 5′ are indicated in each panel. The P(lArB) (18 kb), P(RS3) (6 kb), and P(lacW) (10.7 kb) transgenes are not drawn to scale

In this set of experiments, hemizygous P(lArB) and P(RS3) females (donors of piRNAs) were crossed to BX2 males hemizygous for P(lacW) (Fig. 4 and Additional file 1: Fig. S8). The G1 progenies that paternally inherited the P(lacW) transgenes and maternally inherited the piRNAs from either P(lArB) or P(RS3), but not the subtelomeric transgenes, were then crossed to each other for several generations (Fig. 4C, D and Additional file 1: Fig. S8). Previously, ovarian 23–29-nt RNAs mapping to the different regions of P(lacW) were identified from G3 up to G10, at a time when complete conversion had been reached, that is to say when stable paramutation had already occurred [15]. Here, to examine the establishment of conversion, ovarian 23–29-nt RNAs were analyzed as early as G1 and up to G4 or G7. piRNAs matching lacZ and plasmid sequences maternally inherited from P(lArB) were able to convert the complementary regions in P(lacW) in G1 (Fig. 4E, G). The lacZ gene, expressed as a transcriptional fusion with the first exons of P, and the white gene of P(lacW) were also converted from the first generation by complementary piRNAs synthesized by the maternal P(RS3) allele (density ratio of 1.3 and 0.8 between G1 and G7, respectively, Fig. 4H, I, 100% of lacZ sensor silencing Fig. 4H and Additional file 1: Fig. S8). Strikingly, in the G1 progeny, the piRNA population was limited for the 4.1-kb white sequence absent from P(lArB) (658 reads/kb) and increased significantly in G4 (1480 reads/kb) (ratio of 2.25 between G1 and G4, Fig. 4F), while piRNAs for the 1.8-kb plasmid sequence not included in P(RS3) were already detected at G1 (ratio of 1.3 between G1 and G7, Fig. 4J).
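The density ratios quoted above follow from simple arithmetic on reads/kb values, as the short Python sketch below illustrates. Only the two densities for white (658 and 1480 reads/kb) are taken from the text; the helper names are hypothetical, and the normalization of raw read counts applied upstream in the study is not reproduced here.

```python
def reads_per_kb(normalized_reads, region_length_bp):
    """Density of normalized reads per kilobase of a region."""
    return normalized_reads * 1000 / region_length_bp


def density_ratio(later_density, earlier_density):
    """Ratio of small RNA density between a later and an earlier generation."""
    return later_density / earlier_density


# Densities reported in the text for the white sequence (reads/kb).
white_g1 = 658
white_g4 = 1480
print(round(density_ratio(white_g4, white_g1), 2))  # 2.25
```

A ratio close to 1 between G1 and a later generation indicates that the region already produced piRNAs at its plateau level in G1, whereas a ratio well above 1 indicates that conversion was still incomplete in G1.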

Therefore, combining these results with the fact that the 18-kb P(lArB) transgene requires several generations to be fully converted when it is paternally inherited (Fig. 1) and that conversion of the 0.9-kb T3 occurs in only one generation, we suggest that the efficiency of conversion of a new sequence inserted into a piRNA cluster targeted by maternally inherited piRNAs, but not targeted itself, could depend on its length.

We have also tested whether the conversion rate could be influenced by the nucleotide composition of the sequences. One hypothesis is that the T3 and plasmid sequences might be enriched in some dinucleotides compared to the P(lArB) and white sequences. To address this, we computed the dinucleotide content of the four sequences. We found that the dinucleotide content is quite consistent between sequences of different lengths. We only noticed a slight bias toward AT/TA dinucleotides for the T3 sequence, but not for the others, including the plasmid, which is converted with the same efficiency as T3 (Additional file 1: Fig. S9A, Additional file 2: Table S5). To understand the potential role of this bias in the sequence conversion rate, we included it in a linear regression model (piRNA.density ~ sequence.length + AT or TA), but the p-values obtained (0.665 and 0.199, respectively) were too high to support a conclusion. Therefore, it is unlikely that dinucleotide content, in itself, explains the difference in conversion efficiency between the considered sequences.
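As an illustration of the dinucleotide content analysis described above, the following Python sketch computes overlapping dinucleotide frequencies for a DNA sequence. It is a minimal example, not the script used in this study: the function name is hypothetical, the toy sequence is invented, and the regression step is not reproduced.

```python
from collections import Counter


def dinucleotide_frequencies(sequence):
    """Return the frequency of each overlapping dinucleotide in a DNA sequence."""
    seq = sequence.upper()
    # Slide a 2-nt window along the sequence, one base at a time.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip windows containing ambiguous bases such as N.
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


# Hypothetical toy sequence, for illustration only.
freqs = dinucleotide_frequencies("ATATGC")
print(freqs)
```

For the toy sequence, the five windows are AT, TA, AT, TG, and GC, so AT accounts for 2/5 of all dinucleotides. Comparing such frequency tables between sequences (here T3, plasmid, white, and P(lArB)) is what allows a composition bias such as AT/TA enrichment to be detected.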

Finally, we tested whether small RNAs mapping imperfectly (i.e., with 3 mismatches) to the reference sequences could exist in the parental strains and participate in the one-generation conversion of T3 and the plasmid, while being absent or less abundant for P(lArB) and the white gene. The parental strains are Canton and w1118 for T3 and P(lArB) (Fig. 1 and Additional file 1: Fig. S4), P-1152 for the white gene included in the P(lacW) transgene (Fig. 4F), and RS3 for the plasmid included in P(lacW) (Fig. 4J). For T3, 3-mismatch piRNAs were identified in the Canton and w1118 parental strains (1563 and 1378 reads/kb, respectively) (Additional file 1: Fig. S9B and C, Additional file 2: Table S6). This result was expected, as the L subtelomeric repeats (left arms of the autosomal chromosomes shown to be piRNA clusters [1, 26]) contain a small domain with similarities to T3 [26]. It has been shown that the Piwi protein from a sponge species can tolerate mismatches for piRNA target binding but requires extensive pairing for the endonuclease activity, preventing unwanted mRNA targeting […] (Additional file 1: Fig. S9F and G) and are not functional for the silencing of the pRFP-T3 piRNA sensor (Fig. 3B, Additional file 1: Fig. S6B). It can also be noted that the region sharing similarities with the L subtelomeres does not show a high density of 0-mismatch T3 piRNAs (see the read counts around position 600 in Fig. 1E, Additional file 1: Fig. S4D, S9B, C). This suggests that the L piRNAs do not participate in the conversion of the T3 domain. Few 3-mismatch piRNAs mapping to the plasmid were identified in RS3 (8.8 reads/kb), with the same order of magnitude for the white gene in P-1152 (5.8 reads/kb) and for P(lArB) in Canton and w1118 (8.1 and 9.5 reads/kb, respectively; Additional file 1: Fig. S9B, C, D, E and Additional file 2: Table S6). Therefore, no common feature concerning the role of 3-mismatch small RNAs was observed between T3 and the plasmid. Altogether, these results strongly suggest that the conversion efficiency of a sequence that is not targeted in G1 but is flanked by sequences targeted by the maternal piRNA pool depends, at least in part, on its length: in one generation, a low frequency of conversion can occur for sequences longer than 4 kb (i.e., white or P(lArB)), whereas a high frequency of conversion occurs for shorter sequences (i.e., plasmid sequence or T3).
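The notion of a read mapping with up to 3 mismatches can be made concrete with a naive Python sketch. This is not the aligner used in the study, which is not specified in this section; it is a brute-force scan that reports every position of a reference where a read matches with at most a given number of substitutions. All names and sequences below are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(1 for x, y in zip(a, b) if x != y)


def map_with_mismatches(read, reference, max_mismatches=3):
    """Return (position, mismatches) for every ungapped hit within the limit."""
    hits = []
    n = len(read)
    for pos in range(len(reference) - n + 1):
        mismatches = hamming(read, reference[pos:pos + n])
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


# Hypothetical 25-nt core embedded in a toy reference.
core = "CGTGCCTGGTCTGCGTCCGTGTCGC"
reference = "AAAAA" + core + "AAAAA"

# Build a read that differs from the core at exactly three positions.
read_bases = list(core)
for i in (0, 10, 20):
    read_bases[i] = "A"  # the core has no A, so each change is a mismatch
read = "".join(read_bases)

print(map_with_mismatches(read, reference))
print(map_with_mismatches(core, reference, max_mismatches=0))
```

With the default limit of 3, the mutated read is reported at position 5 with 3 mismatches, whereas lowering the limit to 2 removes that hit. Dedicated short-read aligners implement the same idea far more efficiently and also handle both strands, which this sketch does not.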

Conversion is restricted to sequences embedded within pre-existing piRNA clusters

The above results concern the conversion of loci surrounded on both sides by sequences targeted by maternal piRNAs. We then examined whether such conversion could also spread onto adjacent genomic sequences. Few 23–29-nt small RNAs matching the sequences flanking the insertion site of P(lacW) were detected, and their number did not increase between G1 and G4 (Additional file 1: Fig. S10A and B). The majority of them correspond to the transcribed strand of Ago1, where the array of P(lacW) is inserted, suggesting that they were produced primarily by phasing without amplification [49]. The same analysis was performed on CG17636, the first gene on the X chromosome, close to cluster 1A. Few 23–29-nt small RNAs matching CG17636 were identified, and their number was unchanged between G1 and G4 (Additional file 1: Fig. S10C). Similar results were observed on endogenous loci homologous to sequences present in the P(lArB) transgenes (Additional file 1: Fig. S11A, B, C). We conclude from these results that spreading of conversion from transgene sequences, either in cis, as observed here and in earlier studies [43], or in trans outside the piRNA clusters, is very limited, suggesting the existence of a tight control that restricts piRNA cluster spreading and precisely defines their borders [44], like the transcription of genes flanking cluster 42AB or cluster 80F [10].

Cluster 1A is a heterogeneous piRNA cluster

Although P(lArB) and T3 are located in the same cluster 1A of the P-1152 strain, their piRNA profiles and kinetics of conversion when paternally inherited are different (Fig. 1E–G). To understand early molecular events occurring in G1, we first analyzed the ovarian heterochromatin throughout cluster 1A. Using chromatin immunoprecipitation followed by quantitative PCR (ChIP-qPCR, with primers shown in Fig. 5A, Additional file 1: Table S7), a high trimethylated Lysine 9 of Histone 3 (H3K9me3) enrichment was found on P(lArB), when maternally inherited as compared to paternally inherited (Fig. 5B), confirming previous observations [5]. On T3, high H3K9me3 enrichment was observed in both PI and MI, with the overall level of H3K9me3 on PI being higher than on P(lArB) (Fig. 5B). Therefore, maternally inherited piRNAs can induce H3K9me3 enrichment on all sequences of cluster 1A. However, in PI G1, H3K9me3 enrichment is heterogeneous along the 1A locus, from weak on lacZ to high on T3, consistent with their piRNA productions and silencing of lacZ and pRFP-T3 piRNA sensors (Figs. 1, 2, and 3).

Fig. 5

Chromatin state and steady-state transcription of cluster 1A. A Schematic representation of one of the subtelomeric repeats of cluster 1A of the P-1152 strain, containing the P(lArB) transgenes (indicated by an asterisk). Yellow arrows indicate the position of qPCR primers. The X subtelomeric repeats (1.8 kb) and the P(lArB) (18 kb) are not drawn to scale. B H3K9me3 ovarian enrichment on three different regions of P(lArB) and of T3 in maternal and paternal inheritance (MI, red; PI, blue) measured by ChIP-qPCR in G1. The signal was normalized to the 42AB region highly enriched in H3K9me3 marks. ChIP experiments were performed on three independent biological samples. P-values were calculated using a bilateral t-test (n = 3). C Ovarian RNA accumulation of G1 P(lArB) and T3 in P-1152 (T3(P-1152)) and T3 in Oregon (T3(Oregon)) was measured by RT-qPCR in control KD (white-KD) and moon-KD and normalized to the expression of the RpL32 gene. P-values were calculated using a one-way ANOVA test followed by a Tukey HSD test (n = 3). ns, not significant. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 (Additional file 2: Table S8)

We next asked whether the chromatin variations observed in P(lArB) and T3 and piRNA synthesis were correlated with the RNA steady state of cluster 1A. For this, we analyzed ovarian RNA accumulation across the locus by RT-qPCR. Using the same primers (Fig. 5A), RNA accumulation of P(lArB) and T3 in the P-1152 strain was found to be higher in PI than in MI (see white-KD control in Fig. 5C). After four generations, differences between PI and MI were no longer detected, as the system reached an equilibrium similar to the maternal lineage (Fig. 1F, G, and Additional file 1: Fig. S12A). One explanation is that, in MI, high amounts of P(lArB) and T3 piRNAs direct the complementary P(lArB) and T3 transcripts to piRNA biogenesis, inducing their degradation. In PI, low amounts of P(lArB) piRNAs (Fig. 1E, F) could instead prevent their recognition as piRNA precursors, allowing accumulation of P(lArB) transcripts (Fig. 5C). Surprisingly, the high amount of T3 piRNAs in PI (Fig. 1E, G) was not correlated with low accumulation of T3-containing transcripts (Fig. 5C). We therefore wondered whether the presence of P(lArB) could affect T3 RNA accumulation. As cluster 1A of Oregon was used to obtain the P(lArB) insertion in the P-1152 strain [50], we measured T3 RNA accumulation in the Oregon strain (“T3(Oregon)”) and unexpectedly found that the T3 RNA steady state was unchanged between MI and PI, contrary to T3 in the vicinity of P(lArB) (“T3(P-1152)”) (Fig. 5C). The size and structure of germline piRNA cluster transcripts are still unknown; however, we assumed that chimeric transcripts can exist between different domains of piRNA clusters. Based on this assumption, in T3(Oregon) PI, chimeric transcripts containing T3 and the other subtelomeric domains (INV-4, T1, T2, T4) can be targeted by piRNAs produced by the autosomal subtelomeric piRNA clusters. These transcripts are then directed to piRNA biogenesis, and thus degraded, with the same efficiency as in MI (Fig. 5C). In T3(P-1152) PI, the chimeric transcripts containing T3 and subtelomeres are processed as described above, leading to the production of the piRNAs observed in Fig. 1G. In addition, the chimeric transcripts containing T3 and P(lArB) can accumulate because they are not efficiently targeted by maternally inherited piRNAs, which lack T3 and P(lArB) sequences (Fig. 5C).
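As an illustration of how RT-qPCR signals normalized to a reference gene such as RpL32 can be compared between PI and MI, the following minimal sketch uses the common 2^-ΔCt convention. This convention is an assumption made for the example; the text does not specify the quantification method used. The function names and all Ct values below are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference) for one sample.

    Assumes the 2^-dCt convention with ~100% primer efficiency
    (an assumption for this sketch, not stated in the text).
    """
    return 2.0 ** -(ct_target - ct_reference)


def mean_relative_expression(ct_pairs):
    """Average the normalized expression over biological replicates.

    ct_pairs is a list of (Ct_target, Ct_reference) tuples.
    """
    return mean(relative_expression(t, r) for t, r in ct_pairs)


# Hypothetical Ct values (target, RpL32) for three replicates each.
# These numbers are invented purely to show the arithmetic.
mi_samples = [(28.0, 18.0), (28.2, 18.1), (27.9, 18.0)]
pi_samples = [(26.0, 18.0), (26.1, 18.1), (25.9, 18.0)]

# Ratio of mean normalized expression, PI over MI.
fold_pi_over_mi = mean_relative_expression(pi_samples) / mean_relative_expression(mi_samples)
print(f"PI/MI fold difference: {fold_pi_over_mi:.2f}")
```

A lower Ct for the target at a constant reference Ct gives a higher normalized value, so a ratio above 1 in this sketch would correspond to higher RNA accumulation in PI than in MI.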

We then tested whether the non-canonical transcription specific to germline piRNA clusters was directly required for transcription of P(lArB) and T3 in both genomic contexts (T3(P-1152) and T3(Oregon)). Germline knockdown of moonshiner (moon-KD, Additional file 1: Fig. S12B and C), which is involved in heterochromatic transcription of dual-strand germline clusters [10], resulted in a clear increase in the P(lArB) and T3 RNA steady state compared to the white-KD MI control (Fig. 5C). These results indicate that cluster 1A transcription is Moon dependent, that these transcripts are funneled to the piRNA machinery, and that the presence of P(lArB) affects the RNA steady state of T3 in T3(P-1152) (Figs. 3E and 5C). One explanation is that transcriptional signals (initiation, termination) of P(lArB) transgenes might not be totally erased by the piRNA cluster chromatin and could therefore influence adjacent T3 RNA accumulation.

Given the dual-strand piRNA synthesis (Fig. 1E), accumulation of P(lArB) transcripts in G1 PI (Fig. 5C) could lead to the production of double-stranded RNAs that could potentially be processed into siRNAs in the absence of maternally inherited piRNAs. In line with a bioRxiv preprint from Luo et al. [51], we asked whether siRNAs were produced in parallel with piRNAs. We compared the kinetics of appearance of siRNAs and piRNAs during the P(lArB) conversion process in PI over the four generations (Fig. 1F). Indeed, P(lArB) siRNAs accumulated to high levels in G1 PI and persisted across the first four generations, whereas piRNAs required four generations to reach a plateau (Fig. 6A and Additional file 1: Fig. S13). The same profile of small RNA distribution was detected for all regions of the transgene in the Canton background and in the w1118 genetic background (Additional file 1: Fig. S13A, B, and D). No comparable siRNA amount was found in the MI lineage, where P(lArB) was converted a long time ago (Fig. 6B and Additional file 1: Fig. S13C). Importantly, functional assays indicate that these transgenic siRNAs are not functional for the silencing of the P(lacZ) reporter (Fig. 1D, Additional file 2: Table S2). To complete this observation, we also looked at siRNAs corresponding to the white gene when activated by P(lArB) (Fig. 4F). In this context, white siRNAs were produced from the first generation and accumulated in G4, but their abundance relative to white piRNAs was less striking than that observed for P(lArB) in G1 (Fig. 6C). Thus, the presence of siRNAs could precede the production of piRNAs, but this is not a general phenomenon. Their emergence could also result from the accumulation in G1 PI of transcripts that are not targeted by maternal piRNAs and become substrates of the Dcr-2 endonuclease, in accordance with the fact that siRNAs were shown to be dispensable for germline piRNA cluster maintenance, silencing of a piRNA sensor, and paramutation [15, 34].

Fig. 6

siRNA and piRNA abundance during conversion. Size distribution of ovarian small RNAs isolated from the P(lArB)-PI subline D (A) or the P(lArB)-MI subline H (B) matching T3 and P(lArB) in G1, G2, G3, and G4. When paternally inherited, P(lArB) is converted for piRNA synthesis progressively across generations, while T3 is converted from the first generation (G1). C Size distribution of small RNAs isolated from the G1 and G4 progenies of females containing P(lArB) crossed with males containing P(lacW) and mapped on the white sequence. In this context, the white gene is progressively converted (Fig. 4F). The ratio of normalized 23–29-nt to 21-nt small RNAs is indicated for the P(lArB) panel
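The ratio mentioned in the legend can be illustrated with a minimal sketch that tallies the lengths of small RNA reads mapped to a region and divides the 23–29-nt class (piRNA-sized) by the 21-nt class (siRNA-sized). The function names and the toy read lengths are hypothetical; the actual mapping and normalization steps are not described here. Within a single library, the normalization factor cancels in the ratio, so raw counts are used.

```python
from collections import Counter


def size_distribution(read_lengths, min_len=18, max_len=30):
    """Count mapped reads per length over a fixed window (inclusive)."""
    counts = Counter(read_lengths)
    return {n: counts.get(n, 0) for n in range(min_len, max_len + 1)}


def pirna_to_sirna_ratio(read_lengths):
    """Return (number of 23-29-nt reads) / (number of 21-nt reads).

    Returns None when no 21-nt read is present, since the ratio
    is undefined in that case.
    """
    counts = Counter(read_lengths)
    pirna_sized = sum(counts[n] for n in range(23, 30))  # 23..29 inclusive
    sirna_sized = counts[21]
    if sirna_sized == 0:
        return None
    return pirna_sized / sirna_sized


# Toy example with invented read lengths (not data from this study).
toy_lengths = [21, 21, 23, 25, 29, 30]
print(size_distribution(toy_lengths))
print(pirna_to_sirna_ratio(toy_lengths))
```

A ratio below 1 would indicate that siRNA-sized reads dominate, as described for P(lArB) in G1 PI, whereas a ratio above 1 would indicate a predominance of piRNA-sized reads.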