Abstract
The development of Hi-C technology has generated terabytes of chromatin interaction data, which open up possibilities for in-depth study of chromatin structure. Several studies have revealed that mammalian chromosomes are folded into topologically associating domains (TADs), which are conserved across cell types. Accurate detection of TADs is now a vital step in revealing the relationship between the structure and function of genome organization. Unfortunately, current TAD detection methods require massive computing resources or careful parameter adjustment, and/or yield inconsistent results. In this paper, we propose a novel method, Spectral-Based TAD Detector (SBTD), and evaluate its performance with a set of widely accepted statistical methods. We treat the chromatin interaction matrix as a graph and first introduce cosine similarity as a measure of the interaction patterns between bins. The results show that SBTD identifies higher-quality TADs than popular methods (DomainCaller, TopDom and SpectralTAD), and that the internal bins of TADs identified by SBTD are more highly correlated. In addition, the TADs identified by SBTD show a histone modification signal enrichment pattern at their boundaries that is highly similar to the pattern reported in the previous literature. Finally, motif enrichment analysis shows that, compared with background regions, the DNA motifs of known insulator proteins are significantly enriched in the TAD boundary regions identified by our method, which further supports the performance of the proposed method. Overall, SBTD is much more effective than existing methods and has only one easy-to-adjust parameter, the cluster number, for which we provide optimization guidelines.
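To make the idea of treating the interaction matrix as a graph more concrete, the following is a minimal sketch of cosine similarity between bin interaction profiles followed by a two-way spectral split. It is not the SBTD implementation: the function names `cosine_similarity_matrix` and `fiedler_split`, the toy matrix, and the restriction to two clusters are all assumptions made for illustration. The sketch also does not enforce that clusters are contiguous along the chromosome, which a real TAD caller would need to handle.

```python
import numpy as np


def cosine_similarity_matrix(contacts):
    """Cosine similarity between the interaction profiles (rows) of all bins."""
    contacts = np.asarray(contacts, dtype=float)
    norms = np.linalg.norm(contacts, axis=1)
    norms[norms == 0] = 1.0  # empty bins end up with similarity 0 to every bin
    unit = contacts / norms[:, None]
    return unit @ unit.T


def fiedler_split(similarity):
    """Split bins into two groups by the sign of the Fiedler vector.

    Uses the symmetric normalized graph Laplacian of the similarity graph.
    This is a generic two-way spectral cut (hypothetical helper), not the
    clustering procedure of any specific published tool.
    """
    similarity = np.asarray(similarity, dtype=float)
    degree = similarity.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    positive = degree > 0
    inv_sqrt[positive] = 1.0 / np.sqrt(degree[positive])
    laplacian = np.eye(len(degree)) - inv_sqrt[:, None] * similarity * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector
    _, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors[:, 1] > 0).astype(int)


# Toy contact matrix (made-up values): two blocks of four bins each,
# with strong contacts inside a block and weak contacts between blocks.
toy = np.ones((8, 8))
toy[:4, :4] = 10.0
toy[4:, 4:] = 10.0

labels = fiedler_split(cosine_similarity_matrix(toy))
```

On this toy input the first four bins receive one label and the last four the other; which group is labelled 0 or 1 depends on the arbitrary sign of the eigenvector. Extending the sketch to more than two clusters would typically mean clustering several leading eigenvectors, with the cluster number as the tunable parameter.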
Availability of data and materials
The Hi-C data sets of human cell types (hESC and IMR90) and mouse cell types (mCortex and mESC) were downloaded from the GEO database under the GEO accession number GSE35156. They can also be downloaded from http://chromosome.sdsc.edu/mouse/hi-c/download.html.
Code availability
The source code is available here: https://github.com/chunlin-long/SBTD.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 21775107 and 21675114) and by the National Science and Technology Major Project of the Ministry of Science and Technology under Grant 2018ZX10201002-002-004 (2018/1-2020/12).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Long, C., Liao, Y., Li, Y. et al. SBTD: A Novel Method for Detecting Topological Associated Domains from Hi-C Data. Interdiscip Sci Comput Life Sci 13, 638–651 (2021). https://doi.org/10.1007/s12539-021-00453-4
DOI: https://doi.org/10.1007/s12539-021-00453-4