Background

Understanding the regulatory mechanisms underlying HSPC fate determination at different developmental stages is a primary goal of hematopoiesis biology, and would help improve the generation of functional HSPCs in vitro. HSPC development is highly conserved between zebrafish and mammals, and a series of important findings on HSPC ontogeny are based on zebrafish [1, 2]. For example, HSPC generation through endothelial-to-hematopoietic transition (EHT) was directly observed in zebrafish embryos [3]. There are three waves of hematopoiesis during zebrafish or mammalian development, with nascent HSPCs arising from the ventral wall of the dorsal aorta (DA) in zebrafish or the aorta-gonad-mesonephros (AGM) region in mammals through EHT, acquiring the ability to self-renew and reconstitute all blood lineages [4]. These cells then move to the caudal hematopoietic tissue (CHT) in zebrafish or the fetal liver in mammals to become fetal HSPCs, which can rapidly expand and differentiate [5, 6]. Finally, these cells seed the kidney marrow (KM) in zebrafish or the bone marrow in mammals, where they become adult HSPCs and support adult hematopoiesis [7].

Although significant progress has been made in understanding this process, a comprehensive understanding of the dynamic regulatory mechanisms governing HSPC development is still lacking. Recent studies showed that, in addition to the critical role of transcription factors (TFs), epigenetic modifications are also important in HSPC fate decision [8, 9]. Chromatin conformation is fundamental for transcriptional regulation via multiple mechanisms, from long-distance interactions between enhancers and promoters to higher-order chromosome compartments and topologically associated domains (TADs) that can act as units constraining transcription [10, 27]. FeatureCounts was used to quantify gene expression and obtain read counts. Fold changes in gene transcription levels were estimated using DESeq2 [28]. Enrichment analysis of gene function was performed on the Metascape platform (http://metascape.org/gp/index.html). Active promoters were defined as the 6-kb region centered on the transcription start site (TSS) of genes with FPKM > 1.

Hi-C data processing and visualization

HiC-Pro (v.3.1.0) was used to process the Hi-C data [29]. Only uniquely mapped read pairs with a mapping quality of at least 10 were retained for further analysis, and dangling-end reads, self-circled reads, and religated reads were all removed. Non-duplicated reads were used to generate Hi-C contact matrices at binning resolutions of 10 kb, 50 kb and 100 kb. To validate the reproducibility of the data, we calculated the GenomeDisco score between pairs of libraries [30]. Contact heatmaps were generated from matrices at different resolutions with fanc (v.0.9.25) [31]. The P(s) curves were calculated for genomic distances from 20 kb to 50 Mb, divided logarithmically into 500 bins. We applied the Von Neumann Entropy (VNE) approach to quantify the disorder of chromatin structure for intra-chromosomal matrices at 100-kb resolution [32]. The Hi-C matrix M was converted to a correlation matrix C using corr(log2[M]). Then, the eigenvalues (λi) of C were obtained by eigen-decomposition and normalized as \(\lambda_i = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}\). VNE was calculated as \(-\sum_{i=1}^{n} \lambda_i \ln(\lambda_i)\).
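The VNE calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the pipeline used in the study: the function name, the pseudocount added before the log transform, and the removal of empty bins are assumptions introduced here so that the logarithm and the correlation are well defined.

```python
import numpy as np


def von_neumann_entropy(contact_matrix, pseudocount=1.0):
    """Von Neumann Entropy of one intra-chromosomal Hi-C contact matrix."""
    m = np.asarray(contact_matrix, dtype=float)

    # Assumption: bins without any contacts are dropped before the analysis.
    keep = m.sum(axis=1) > 0
    m = m[keep][:, keep]

    # Assumption: a pseudocount keeps log2 finite for zero entries.
    log_m = np.log2(m + pseudocount)

    # Correlation matrix C = corr(log2[M]); rows are treated as variables.
    c = np.corrcoef(log_m)

    # C is symmetric, so eigvalsh applies; clip tiny negative round-off values.
    eigvals = np.clip(np.linalg.eigvalsh(c), 0.0, None)

    # Normalize the eigenvalues so that they sum to 1.
    lam = eigvals / eigvals.sum()

    # Zero eigenvalues contribute nothing to the entropy (0 * ln 0 := 0).
    lam = lam[lam > 0]
    return float(-np.sum(lam * np.log(lam)))
```

With n retained bins the value lies between 0 (all rows perfectly correlated) and ln(n) (no correlation structure), so higher values correspond to a more disordered contact map.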

Compartments were called by analyzing the first eigenvector of the KR-normalized contact maps at 100 kb resolution. Compartments with higher gene density were assigned as type A, while those with lower gene density were assigned as type B. Compartment strength was calculated using AB/AA + BB. Saddle plots were calculated as previously described [33]: Hi-C matrix bins were sorted according to their PC1 values, and the sorted contact frequencies were aggregated into 50 groups and averaged to obtain a compartmentalization saddle plot. The numbers of ATAC-seq peaks overlapping compartments were determined with BEDtools, requiring at least 1 bp of overlap [34].
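The saddle-plot aggregation can be illustrated with the following sketch. It assumes that the input is a distance-normalized (observed/expected) matrix and that bins are split into groups of equal size after sorting by PC1; both points, as well as the function name, are assumptions made here for illustration and are not taken from the cited procedure.

```python
import numpy as np


def saddle_matrix(obs_exp, pc1, n_groups=50):
    """Average contact signal between groups of bins ranked by PC1."""
    obs_exp = np.asarray(obs_exp, dtype=float)
    pc1 = np.asarray(pc1, dtype=float)

    # Sort bins from the lowest (B-like) to the highest (A-like) PC1 value.
    order = np.argsort(pc1)
    sorted_mat = obs_exp[np.ix_(order, order)]

    # Assumption: groups hold (nearly) equal numbers of bins.
    groups = np.array_split(np.arange(len(order)), n_groups)

    saddle = np.empty((n_groups, n_groups))
    for i, rows in enumerate(groups):
        for j, cols in enumerate(groups):
            # nanmean ignores bins masked as NaN during normalization.
            saddle[i, j] = np.nanmean(sorted_mat[np.ix_(rows, cols)])
    return saddle
```

In the resulting matrix, the top-left and bottom-right corners summarize B-B and A-A contacts, and the off-diagonal corners summarize A-B contacts. The number of bins should be at least `n_groups`, otherwise some groups are empty.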

TADs and TAD boundaries were identified at 50 kb resolution as described [35]. Shared TADs between samples were defined as TADs whose overlapping area was larger than 75% of the TAD in both samples. We calculated the standard deviation of the insulation score of each TAD boundary across the three cell stages and ranked the boundaries by standard deviation. The top 1000 most variable TAD boundaries were then selected. Clustering of boundaries was carried out using the pheatmap package in R.
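Two of the rules in this paragraph, the 75% reciprocal-overlap criterion for shared TADs and the ranking of boundaries by the standard deviation of their insulation scores, can be sketched as follows. The function names and the data layout (TADs as `(start, end)` tuples on the same chromosome, insulation scores as a boundaries × stages array) are hypothetical, and the use of the sample standard deviation is an assumption that does not affect the ranking.

```python
import numpy as np


def is_shared_tad(tad_a, tad_b, min_fraction=0.75):
    """True if the overlap exceeds min_fraction of BOTH TADs."""
    start_a, end_a = tad_a
    start_b, end_b = tad_b
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return False
    return (overlap > min_fraction * (end_a - start_a)
            and overlap > min_fraction * (end_b - start_b))


def top_variable_boundaries(insulation, top_n=1000):
    """Indices of the top_n boundaries with the most variable insulation.

    insulation: array of shape (n_boundaries, n_stages).
    """
    scores = np.asarray(insulation, dtype=float)
    # Assumption: sample standard deviation (ddof=1) across stages.
    sd = scores.std(axis=1, ddof=1)
    # Rank boundaries from the most to the least variable.
    order = np.argsort(sd)[::-1]
    return order[:top_n], sd
```

The selected rows of the insulation matrix could then be passed to a clustering routine, as was done with pheatmap in the original analysis.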

Loops and interactions were detected with HiCCUPS in Juicer Tools at 10 kb resolution [36]. Enhancer- or promoter-involved loops were those with at least one anchor overlapping enhancer regions (distal H3K27ac ChIP-seq peaks) or promoter regions (TSS ± 3 kb). Enhancer–promoter (E–P), promoter–promoter (P–P), and enhancer–enhancer (E–E) interactions were also identified. Shared loops were defined as loops in which neither anchor shifted by more than one bin. Aggregate peak analysis was performed with ‘apa’ in Juicer Tools, which generated aggregate heatmaps and average contact signals.
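The shared-loop criterion can be expressed as a simple comparison of anchor positions. The sketch below assumes that each loop is an intra-chromosomal record of the form `(chromosome, anchor1_start, anchor2_start)` with bin-aligned coordinates; this representation and the function names are hypothetical.

```python
def loops_shared(loop_a, loop_b, resolution=10_000, max_shift_bins=1):
    """True if both anchors differ by at most max_shift_bins bins."""
    chrom_a, a1, a2 = loop_a
    chrom_b, b1, b2 = loop_b
    if chrom_a != chrom_b:
        return False
    max_shift = resolution * max_shift_bins
    return abs(a1 - b1) <= max_shift and abs(a2 - b2) <= max_shift


def count_shared_loops(loops_a, loops_b, resolution=10_000):
    """Number of loops in loops_a with at least one match in loops_b."""
    # Quadratic scan; adequate for a few thousand loops per sample.
    return sum(
        any(loops_shared(a, b, resolution) for b in loops_b)
        for a in loops_a
    )
```

For larger loop sets, indexing the loops by chromosome and anchor bin would avoid the pairwise scan.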

TF motif analysis

Motifs were identified in H3K27ac ChIP-seq peaks located in loop anchors and in differential ATAC-seq peak regions using findMotifsGenome.pl in Homer [37] with the parameter “-size given”. Only motifs with q-values smaller than 0.01 were considered significantly enriched.

HPC7 analysis

ChIP-seq peaks of transcription factors, FPKM values from RNA-seq and loops identified by pcHi-C were downloaded from public datasets (GSE48086, GSE22178, E-MTAB-3954) [38,39,40]. The fraction of ChIP-seq peaks overlapping loop anchors was compared with the fraction of peaks overlapping an equal number of randomly chosen regions of the same length as the loop anchors. Genes were classified as having PU.1 occupancy on both anchors if at least one loop connecting the gene promoter had both anchors bound by PU.1. Genes with FPKM values ≥ 2 were considered expressed.
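The gene classification by PU.1 occupancy can be sketched as below. The text defines only the "both anchors" group explicitly; treating the remaining genes as "one anchor" when at least one loop has a single bound anchor, and as "no anchor" otherwise, is an assumption made here. The function names and input layout are hypothetical.

```python
def classify_gene_by_occupancy(loop_anchor_binding):
    """Classify a gene from the loops that connect its promoter.

    loop_anchor_binding: list of (anchor1_bound, anchor2_bound) booleans,
    one pair per loop. Returns "both", "one", "none", or None if the
    promoter is not connected by any loop.
    """
    bound_counts = [int(a) + int(b) for a, b in loop_anchor_binding]
    if not bound_counts:
        return None
    # A single loop with both anchors bound is enough for the "both" group.
    best = max(bound_counts)
    return {2: "both", 1: "one", 0: "none"}[best]


def fraction_expressed(fpkm_values, threshold=2.0):
    """Fraction of genes with FPKM >= threshold (considered expressed)."""
    values = list(fpkm_values)
    if not values:
        return 0.0
    return sum(v >= threshold for v in values) / len(values)
```

Applying `fraction_expressed` to the FPKM values of each group gives the kind of per-group comparison of expressed genes described in the text.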

Results

Adult HSPC of zebrafish exhibits hierarchical chromatin structure similar to mammalian cells

To reveal the chromatin structure of zebrafish HSPCs for the first time, we performed four replicates of sisHi-C on FACS-sorted cd41+gata1- adult HSPCs from three-month-old transgenic zebrafish Tg(cd41:GFP gata1:DsRed). RNA-seq, ATAC-seq, H3K4me3 ChIP-seq and H3K27ac ChIP-seq were also conducted to characterize HSPC chromatin folding (Fig. 1A). A total of 172842225 valid pairs were obtained from the four Hi-C replicates. GenomeDisco analysis showed that the reproducibility score between any two replicates was higher than 0.85 at both 50 kb and 100 kb resolution, so we combined the four replicates for the following analyses (Fig. S1A). The Hi-C contact map of zebrafish adult HSPC showed canonical hierarchical chromatin organization at different resolutions, including compartments, TADs and loops, similar to mammalian cells (Figs. 1B and S2).

Fig. 1

Characteristics of zebrafish HSPC 3D genome organization. A Schematic diagram of the experimental design. B Hi-C contact matrices of chromosome 7 at 100 kb, 50 kb and 10 kb resolution are shown as an example. C Chromatin compartmentalization of chromosome 7. The autocorrelation matrices and the first eigenvector profiles are shown. In the first eigenvector, compartment B is colored blue and A orange. D Boxplot showing the distribution of gene density and RNA-seq read density in the A/B compartments. E Genome-wide insulation score profiles around TAD boundaries. F TADs detected in the 8–12 Mb region of chromosome 7 in both HSPC and brain are shown as an example. G Aggregate loop plots showing the strength of interactions between HSPC loop anchors. H The distribution of identified interactions on functional elements. E, enhancer; P, promoter; None, neither enhancer nor promoter. I Distribution of transcripts per million (TPM) expression values of genes involved and not involved in loops. ***p < 0.001

At the compartment level, the genome was divided roughly equally between A and B compartments (Fig. 1C). We found that A compartments contained more expressed genes (p < 2.22e−16, wilcox.test) and that the expression level of the encompassed genes was also higher than in B compartments (p < 2.22e−16, wilcox.test, Fig. 1D). A compartments also showed greater enrichment of H3K27ac and H3K4me3 ChIP-seq peaks and ATAC-seq open regions compared with B compartments (p < 2.22e−16 and < 2.22e−16, wilcox.test, respectively, Fig. S1B). These results showed that A compartments are more active in zebrafish HSPC. At 50 kb resolution, a total of 1643 TADs with a median size of 800 kb were identified. The accuracy of the called TADs was supported by the strong insulation at aggregated boundaries (Fig. 1E). Similar to mammalian cells, the TAD boundaries of zebrafish HSPC were enriched for transcribed TSSs (Fig. S1C). To illustrate the conservation of TAD structures between tissues, we analyzed publicly available Hi-C data from zebrafish brain (GSE134055) [41] and detected 1595 TADs. Of these, 1175 TADs were shared between brain and HSPC (Fig. 1F). The overlap ratio almost approached that of biological replicates of HSPC (Fig. S1D). Although the vast majority of TADs were conserved, there were 420 brain-specific and 468 adult HSPC-specific TAD boundaries. We analyzed the function of genes located in tissue-specific TAD boundaries and found that the enriched pathways were related to the function of the specific tissues. For example, phospholipid metabolic process, which is important for brain function, was the most enriched pathway in brain-specific boundaries (Fig. S1E) [42]. Finally, at the loop level, a total of 4189 loops were detected in adult HSPC. The aggregated peak analysis (APA) showed high confidence in the identified loops (Fig. 1G). We identified distal enhancers genome-wide using H3K27ac and H3K4me3 ChIP-seq data, and loops were assigned to functional elements. We found that the vast majority of identified loops connected enhancers (E) or promoters (P) (Fig. 1H). In addition, the expression of genes involved in loops was significantly higher than that of genes not connected by loops, implying that the detected loops are functional (p < 2.22e−16, wilcox.test, Fig. 1I).

In summary, we conducted the first investigation into the chromatin conformation of zebrafish HSPC and discovered a hierarchical organization with features similar to those of mammalian cells.

Dispersed chromatin structure in zebrafish nascent HSPC

We next sought to reveal the reprogramming of chromatin structure and its contribution to HSPC development. Nascent HSPCs from the AGM region at 36 hpf and fetal HSPCs from the CHT region at 3 dpf were collected and subjected to sisHi-C with at least two replicates each (Fig. 2A). A total of 26350389 and 118555641 valid pairs were obtained for nascent and fetal HSPC, respectively (Table S1). High reproducibility of the Hi-C experiments was validated by a median GenomeDisco score of nearly 0.8 across all replicates (Fig. S3A). To make the Hi-C data of different stages comparable, we downsampled the pooled valid pairs of each stage to the number obtained for nascent HSPC.

Fig. 2

Global reorganization of chromatin structure during zebrafish HSPC development. A Schematic representation of chromatin conformation detection for nascent HSPCs in the AGM region at 36 hpf and fetal HSPCs in the CHT region at 3 dpf. B A 45 Mb region of chromosome 7 is shown at 50-kb resolution as an example of the contact maps during HSPC development. C Contact heatmaps of fetal and adult HSPC after subtraction of that of nascent HSPC for the same region as in B. D Contact frequency decay curves at different stages of HSPC development. E Quantification of the intra-chromosomal disorder in chromatin structure using Von Neumann Entropy (VNE)

Hi-C contact maps were more similar between fetal and adult HSPC than between either of them and nascent HSPC (Fig. 2B). First, the median GenomeDisco scores of Hi-C contact maps were 0.844 and 0.904 when comparing fetal HSPC with nascent and adult HSPC, respectively (Fig. S3B). Second, interactions in nascent HSPC were concentrated near the diagonal, whereas more long-range interactions spanning dozens of megabases were visible in fetal and adult HSPC. This trend was obvious when the contact matrix of the nascent stage was subtracted from those of fetal and adult HSPC (Fig. 2C). Third, the contact frequency decay curves of fetal and adult HSPC were more similar (Fig. 2D): the Jensen–Shannon divergence (JSD) was 0.0017 and 0.0012 when comparing fetal HSPC with nascent and adult HSPC, respectively. In particular, the decay curves showed a depletion of contacts at distances of ~ 10 Mb and a slower decrease at distances of ~ 30 Mb in nascent HSPC, reminiscent of a more relaxed chromatin organization [12]. In addition, a higher proportion of inter-chromosomal interactions in nascent HSPC also indicated a loose structure (Table S1).
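A Jensen–Shannon divergence between two decay curves can be computed as in the sketch below. It assumes that both curves are sampled on the same distance bins and are rescaled to sum to 1 before comparison; the logarithm base (2 here, which bounds the value between 0 and 1) is also an assumption, since the text does not specify it.

```python
import numpy as np


def jensen_shannon_divergence(p, q, base=2.0):
    """JSD between two non-negative curves defined on the same bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)

    # Rescale each curve to a probability distribution.
    p = p / p.sum()
    q = q / q.sum()
    mix = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute 0; b > 0 wherever a > 0.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    jsd_nats = 0.5 * kl(p, mix) + 0.5 * kl(q, mix)
    return float(jsd_nats / np.log(base))
```

Identical curves give 0 and curves with no shared support give 1 in base 2, so values such as those reported above indicate highly similar decay profiles overall.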

Chromatin structure, the transcriptome and chromatin accessibility all indicated a more relaxed structure in nascent HSPC. A chromosome-level Von Neumann Entropy (VNE) index was calculated to quantify chromatin disorder (Fig. 2E), and the entropy of nascent HSPC was significantly higher than that of fetal and adult HSPC, indicating a more disordered organization. We also compared the transcriptome and chromatin accessibility of the three stages (CRA001858) [43]. Clustering analysis showed that transcriptome changes were more pronounced from the nascent to the fetal stage (Fig. S3C). Importantly, the number of significantly downregulated genes was more than twice that of upregulated genes from nascent to fetal HSPC (Fig. S3D). Chromatin openness exhibited a similar pattern, with a more pronounced difference between nascent and fetal HSPC, and nearly twice as many regions became closed as became accessible (Fig. S3E and F). These results showed that transcription and chromatin accessibility were consistent with chromatin structure and support a more relaxed conformation in nascent HSPC.

In conclusion, gene transcription, chromatin accessibility and 3D genome structure changed dynamically during zebrafish HSPC development, especially from the nascent to the fetal stage, and the chromatin of nascent HSPC was more relaxed.

Coordination of compartments and chromatin accessibility in transcriptional regulation

We subsequently explored the reprogramming of the 3D genome at the sub-chromosome level. Using contact maps at 100 kb resolution, we identified 47.7–49% of the genome as accessible A compartments (Fig. 3A). These regions exhibited higher gene density and transcriptional activity compared with B compartments at all stages (Fig. S4A). We found that switching of A/B compartments affects gene expression and cell function. About 18% and 12% of genomic regions underwent compartment changes from nascent to fetal HSPC and from fetal to adult HSPC, respectively (Fig. S4B). Genes within regions changing from the B to the A compartment tended to be upregulated, while those within regions changing from A to B tended to be downregulated (Fig. 3B). In addition, the function of these genes was correlated with HSPC stage-specific characteristics. For example, genes that switched from the B to the A compartment and were transcriptionally upregulated from nascent to fetal HSPC were enriched in the pathways ‘RNA processing’ and ‘ribosome biogenesis’ (Fig. 3C), while from fetal to adult HSPC, ‘lipid biosynthetic’ and ‘phagocytosis’ related pathways were enriched (Fig. S4C), in accordance with the rapid proliferation of fetal HSPC and adaptive immunity of adult HSPC [44, 5A). Of these, 1218 loops were shared (with both anchors shifting no more than one bin) between fetal and adult HSPC (Fig. 5B). Further quantitative analysis showed that the alteration of loops is actually minor. We plotted the APA profile of the 2971 adult HSPC-specific loops using the nascent and fetal contact matrices, and found that interactions between the anchors of these loops were also higher than in neighboring regions at both the nascent and fetal stages (Fig. S6A). In addition, the cosine similarity of the contact frequencies of these 2971 loops was as high as 0.87 and 0.93 when comparing nascent with fetal HSPC and fetal with adult HSPC, respectively. These observations indicated that adult HSPC-specific contacts were already concentrated at the nascent and fetal stages and were further strengthened in adult HSPC.
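The cosine similarity used to compare loop contact frequencies between stages can be written as a short function. The sketch assumes that each stage is represented by one vector holding the contact frequency of every loop in the same order; the function name and this layout are illustrative.

```python
import numpy as np


def cosine_similarity(x, y):
    """Cosine of the angle between two contact-frequency vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Undefined if either vector is all zeros; callers should filter those.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

Because the measure is insensitive to a uniform scaling of one vector, it captures whether the same loops are relatively strong in both stages, not whether the absolute strengths match.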

Fig. 5

PU.1 mediates promoter-involved chromatin looping in adult HSPC. A Aggregate loop plots showing the strength of interactions between fetal HSPC loop anchors. B Overlap of loops between fetal and adult HSPC. C Bubble plots showing gene expression and TF motif enrichment identified at H3K27ac peak regions in adult HSPC loop anchors. The enrichment p-value was calculated by HOMER. D Bar chart showing the proportion of TF-binding peaks overlapping loop anchors (blue) or randomly selected background regions (orange) in HPC7 cells. E Interaction score for different groups of loops based on PU.1 occupancy on loop anchors. F Proportion of expressed genes for different groups of genes based on PU.1 occupancy on loops connecting the gene promoter. G Transcriptional level of expressed genes in different groups as in E. H Gene ontology analysis of genes having loops that connect the gene promoter with both anchors occupied by PU.1. I Promoter capture Hi-C loops, PU.1 peaks and signal, as well as RNA-seq signal near the Akt2 gene in the HPC7 cell line

We attempted to identify transcription factors with a role in mediating chromatin interactions and transcriptional activation in zebrafish HSPC. Focusing on transcriptional regulatory loops, we performed motif analysis on H3K27ac ChIP-seq peak regions located in loop anchors of adult HSPC. The results showed that, in addition to members of the ETS-domain transcription factor family such as PU.1, the motifs of 31 other TFs were enriched (p < 0.01). As our results showed that the loop interactions of adult HSPC already existed at the nascent and fetal stages, we focused on TFs that are expressed at all three stages. The zebrafish homologs of 15 TFs were expressed at all three stages based on RNA-seq data (Fig. 5C). Several TFs with known functions in HSPC development were identified, such as TAL1, RUNX1, GATA1 and PU.1 [38, 49]. The most enriched TFs were YY1 and PU.1. In addition, the expression of PU.1 gradually increased, which may contribute to the enhanced interaction strength in adult HSPC, whereas the transcriptional level of YY1 remained relatively stable during HSPC development (Fig. 5C). These results indicated that PU.1 may mediate loop interactions in zebrafish HSPC.

To clarify the potential of the identified TFs to mediate loop structures and regulate transcription, we utilized a common blood stem/progenitor cell model, HPC7, for which ChIP-seq data of 5 (PU.1, GATA2, RUNX1, SCL and MEIS1) of the 15 candidate TFs are available, as well as high-resolution promoter-capture Hi-C (pcHi-C) and RNA-seq data (GSE48086, GSE22178, E-MTAB-3954) [38,39,40]. We calculated the frequency with which peaks of the 5 TFs overlap the interacting fragments identified by pcHi-C, and compared it with randomly picked non-interacting control regions. All 5 TFs showed significant enrichment at interacting regions, indicating their potential role in genomic looping (Fig. 5D). We also calculated the number of loops bound by each TF and found that PU.1 was most frequently present on loop anchors (Fig. S6B). Nearly half of the promoter-involved loops had PU.1 binding on one or both anchors. To investigate the function of PU.1 in mediating chromatin loops, we directly compared the pcHi-C scores of loops with PU.1 binding on both anchors, one anchor or no anchor. Although all these interactions were called as loops, the interaction score was significantly higher for loops with PU.1 binding on both anchors than for loops with binding on one anchor. The difference between loops with PU.1 binding on one anchor and loops without binding was also significant (Fig. 5E). This result indicated that PU.1 may mediate, or at least strengthen, the looping interactions. We further analyzed the influence of PU.1 binding on transcription. Genes were classified into three groups based on PU.1 occupancy on both anchors, one anchor or no anchor of the loops connecting their promoters. We found that a higher proportion of genes were expressed, and that the expression level was significantly higher, among genes with both anchors bound by PU.1 than among genes with one anchor bound by PU.1 (Fig. 5F and G). This difference was also evident when comparing genes with one anchor bound by PU.1 and genes with no anchor bound. Gene ontology analysis showed that genes with two loop anchors occupied by PU.1 were enriched in ‘cell cycle’ and ‘immune system’ related pathways, in accordance with the rapid proliferation and multipotent hematopoietic differentiation of HPC7 cells (Fig. 5H). Several well-known genes important for the cell cycle and immune response, such as Akt2 and Cdk2, had PU.1 binding on both anchors of loops connecting their promoters (Figs. 5I and S6C) [50].

Taken together, our results implied that PU.1 may mediate 3D genome looping interactions and potentially regulate gene expression of HSPC.

Discussion

By integrating multi-omics datasets covering 3D genome structure, the transcriptome, chromatin accessibility and histone modifications, this study reports for the first time the structural dynamics of the multi-layered 3D genome and its contribution to shaping HSPC ontogeny in zebrafish. In particular, PU.1 was identified, and supported by public data, as a factor that potentially mediates chromatin loop formation and regulates gene expression as well as HSPC characteristics.

We found that the chromatin of zebrafish HSPC is organized into a hierarchical structure with features similar to those of mammalian cells. During HSPC development, the indistinct 3D genome structure in nascent HSPC was strengthened in fetal and adult HSPC at all layers, including compartments, TADs and loops. Taken together with studies in mouse, the reprogramming of chromatin structure during HSPC development shows both commonalities and species-specific features. Murine fetal and adult HSPCs preserved large-scale compartment and TAD structure, while intra-TAD interactions were more dynamic [15]. This is in accordance with our results, which showed highly similar contact frequency decay curves as well as conserved positions and comparable strength for both compartments and TADs between zebrafish fetal and adult HSPC. The loop structure was more variable, with fewer and weaker loops in fetal HSPC compared with adult HSPC. However, although murine nascent HSPC did not show impaired chromatin structure [16], our results in zebrafish nascent HSPC revealed a more relaxed chromatin organization and compromised strength of compartmentalization and TADs. The disordered structure in zebrafish nascent HSPC was supported by changes in chromatin accessibility, gene expression and chromatin entropy. This may underlie the substantial molecular and phenotypic differences between nascent and fetal HSPCs shown by previous studies [5, 51]. In addition, in agreement with the observation that 3D structure in early mammalian embryos is obscure but gradually enhanced during development [52], the relatively relaxed structure highlights a highly plastic state at the early stages of HSPC development and may be important for the transition from endothelial to hematopoietic properties.

The ETS-family transcription factor PU.1 is a key regulator of hematopoiesis. PU.1 is activated in HSPC and is expressed in mast cells, B cells, granulocytes, and macrophages but is switched off in T cells. Previous studies illustrated that PU.1 plays crucial roles in the development of both myeloid and lymphoid lineages as well as lymphoid-primed multipotent progenitors [53,54,55]. For HSPC, PU.1 is important for the maintenance or expansion of HSPC number in the murine fetal liver [56], and for homing and long-term engraftment in the bone marrow [57]. In addition, bone marrow HSCs in which PU.1 was disrupted in situ could not maintain hematopoiesis and were outcompeted by normal HSCs. PU.1 also limits hematopoietic stem cell expansion and prevents exhaustion of adult HSPC [58]. These results illustrate multiple functions of PU.1 in HSPC development, maintenance and differentiation [59]. We provided evidence that PU.1 may regulate HSPC gene expression by mediating chromatin loops. Some studies have illustrated that PU.1 can function as a loop mediator at specific loci or genes [60, 61]. As far as we know, our results propose for the first time a genome-wide structural function of PU.1 in mediating enhancer-promoter interactions in HSPC. In addition, the evidence from murine HPC7 cells supports the conservation of the structural role of PU.1 between species. One recent study in murine HSPC highlighted that RUNX1 engages in chromatin interactions and promotes hematopoiesis. Interestingly, RUNX1 and PU.1 were shown to interact physically [62], and the relationship between these two proteins, as well as other interaction partners, in mediating chromatin interactions in HSPC needs further investigation. In addition, more direct evidence, such as PU.1 HiChIP [63] or ChIP-loop [64], is needed to validate the function of PU.1 in mediating chromatin interactions. The rarity of HSPC in vivo, especially at the nascent and fetal stages, is a limitation, and such analyses may be achieved with the development of low-input detection methods in the future.

In summary, zebrafish is a widely used model system for HSPC research and, to our knowledge, this is the first study of the features of chromatin conformation and its dynamics during HSPC development in zebrafish. We revealed the contribution of 3D genome reprogramming to transcriptional regulation and HSPC fate transition. Zebrafish nascent HSPC is characterized by a loose structure that is not observed in mouse, which emphasizes species specificity. In addition, runx1-engaged enhancer-promoter interactions were found to promote hematopoiesis during the emergence of nascent HSPC in mouse. Our study paid more attention to the regulatory role of the 3D genome in HSPC development and revealed that PU.1 mediates chromatin loop formation and potentially regulates gene expression during HSPC development. We believe research from different species will expand our understanding of the regulatory mechanisms of HSPC fate determination.

Conclusions

Our findings demonstrate that the chromatin organization of zebrafish HSPC resembles that of mammalian cells, with a similar hierarchical structure. Nascent HSPC is characterized by a loose conformation with indistinct structure at all layers. Notably, PU.1 was identified as a potential factor mediating the formation of promoter-involved loops and regulating gene expression in HSPC. Our results provide a global view of the chromatin structure dynamics associated with the development of zebrafish HSPC and identify a key transcription factor involved in HSPC chromatin interactions, which will provide new insights into the epigenetic regulatory mechanisms underlying vertebrate HSPC fate decision.