ImputeCC Enhances Integrative Hi-C-Based Metagenomic Binning Through Constrained Random-Walk-Based Imputation

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14758)

Abstract

Metagenomic Hi-C (metaHi-C) enables the recognition of relationships between contigs in terms of their physical proximity within the same cell, facilitating the reconstruction of high-quality metagenome-assembled genomes (MAGs) from complex microbial communities. However, current Hi-C-based contig binning methods solely depend on Hi-C interactions between contigs to group them, ignoring invaluable biological information, including the presence of single-copy marker genes. Here, we introduce ImputeCC, an integrative contig binning tool tailored for metaHi-C datasets. ImputeCC integrates Hi-C interactions with the inherent discriminative power of single-copy marker genes, initially clustering them as preliminary bins, and develops a new constrained random walk with restart (CRWR) algorithm to improve Hi-C connectivity among these contigs. Extensive evaluations on mock and real metaHi-C datasets from diverse environments, including the human gut, wastewater, cow rumen, and sheep gut, demonstrate that ImputeCC consistently outperforms other Hi-C-based contig binning tools. ImputeCC’s genus-level analysis of the sheep gut microbiota further reveals its ability and potential to recover essential species from dominant genera such as Bacteroides, detect previously unrecognized genera, and shed light on the characteristics and functional roles of genera such as Alistipes within the sheep gut ecosystem.

Availability: ImputeCC is implemented in Python and available at https://github.com/dyxstat/ImputeCC. The Supplementary Information is available at https://doi.org/10.5281/zenodo.10776604.
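
The abstract names a constrained random walk with restart (CRWR) but does not spell out its details. As a rough illustration of the general idea only, the sketch below runs a plain random walk with restart over a toy Hi-C contact matrix, starting only from a chosen set of seed contigs (standing in for contigs that carry single-copy marker genes). The function name `rwr_impute`, the `restart` probability, the convergence settings, and the way the seed constraint is applied are all assumptions made for this example; this is not the ImputeCC implementation.

```python
import numpy as np


def rwr_impute(contacts, seeds, restart=0.5, tol=1e-6, max_iter=100):
    """Random walk with restart from each seed contig (illustrative only).

    contacts: square, symmetric matrix of Hi-C contact counts between contigs.
    seeds:    indices of the contigs the walks start from (assumed here to be
              contigs containing single-copy marker genes).
    Returns a (len(seeds), n_contigs) matrix of stationary visit probabilities,
    which can be read as imputed contact strengths.
    """
    A = np.asarray(contacts, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    # Row-normalise into transition probabilities; isolated contigs keep zeros.
    W = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

    E = np.eye(A.shape[0])[seeds]  # one restart vector per seed contig
    P = E.copy()
    for _ in range(max_iter):
        # Walk one step with probability (1 - restart), else jump back to the seed.
        P_next = (1.0 - restart) * P @ W + restart * E
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P


# Toy chain of four contigs: 0-1 and 2-3 are strongly linked, 1-2 weakly.
toy_contacts = [
    [0, 5, 0, 0],
    [5, 0, 1, 0],
    [0, 1, 0, 5],
    [0, 0, 5, 0],
]
imputed = rwr_impute(toy_contacts, seeds=[0, 3])
print(np.round(imputed, 3))
```

In this toy example, contigs 0 and 2 share no observed Hi-C contact, yet the walk started at contig 0 assigns contig 2 a small nonzero score because the two are connected through contig 1. That is the sense in which a random walk can fill in missing connectivity.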

Acknowledgments

Y.D. and F.S. conceived the ideas and designed the study. Y.D. implemented the methods, carried out the computational analyses, and drafted the manuscript. Y.D. and W.Z. developed the software. All authors modified and finalized the paper. The research is partially funded by NSF grant EF-2125142. The authors declare no competing interests.

Corresponding author

Correspondence to Fengzhu Sun.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Du, Y., Zuo, W., Sun, F. (2024). ImputeCC Enhances Integrative Hi-C-Based Metagenomic Binning Through Constrained Random-Walk-Based Imputation. In: Ma, J. (eds) Research in Computational Molecular Biology. RECOMB 2024. Lecture Notes in Computer Science, vol 14758. Springer, Cham. https://doi.org/10.1007/978-1-0716-3989-4_7

  • DOI: https://doi.org/10.1007/978-1-0716-3989-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-1-0716-3988-7

  • Online ISBN: 978-1-0716-3989-4

  • eBook Packages: Computer Science, Computer Science (R0)
