Abstract
The SARS-CoV-2 pandemic has highlighted the need to better define in-hospital transmissions, a need that extends to all other common infectious diseases encountered in clinical settings. To evaluate how whole viral genome sequencing can contribute to deciphering nosocomial SARS-CoV-2 transmission, 926 SARS-CoV-2 viral genomes from 622 staff members and patients were collected between February 2020 and January 2021 at a university hospital in Munich, Germany, and analysed along with the place of work, duration of hospital stay, and ward transfers. Bioinformatically defined transmission clusters inferred from viral genome sequencing were compared to those inferred from interview-based contact tracing. An additional dataset collected at the same time at another university hospital in the same city was used to account for multiple independent introductions. Clustering analysis of 619 viral genomes generated 19 clusters ranging from 3 to 31 individuals. Sequencing-based transmission clusters showed little overlap with those based on contact tracing data. The viral genomes were significantly more closely related to each other than comparable genomes collected simultaneously at other hospitals in the same city (n = 829), suggesting nosocomial transmission. Longitudinal sampling from individual patients suggested possible cross-infection events during the hospital stay in 19.2% of individuals (14 of 73 individuals). Clustering analysis of SARS-CoV-2 whole genome sequences can reveal cryptic transmission events missed by classical, interview-based contact tracing, helping to decipher in-hospital transmissions. These results, in line with other studies, advocate for viral genome sequencing as a pathogen transmission surveillance tool in hospitals.
Introduction
In January 2020, the first infections with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) were detected in Germany. Starting with these very first infections, viral sequencing data was used to better understand the disease transmission1. With the help of viral sequencing data, it has also been possible to determine the origin and approximate time of the introduction of the virus to e.g. New York City, California, Iceland, or Bavaria2,3,4,5. Similarly, genomic analysis has been applied to trace local infection chains in hospitals and care facilities6,7,8. SARS-CoV-2 remains an important challenge to healthcare facilities daily, highlighting the importance of defining transmission pathways to improve the safety of patients and healthcare workers alike. The need to better understand transmission chains advocates for the use of viral genome sequencing. However, so far it is not well-understood how state-of-the-art viral genome sequencing combined with bioinformatics approaches for transmission tracing compares to conventional, interview-based, contact tracing in the setting of healthcare institutions. This extends far beyond the current SARS-CoV-2 pandemic to all common infectious diseases encountered in clinical and hospital settings.
This analysis aimed to understand in-hospital transmission clusters of SARS-CoV-2 at the university medical center of the Technical University of Munich (TUM) (Klinikum rechts der Isar), in Munich, Germany, by SARS-CoV-2 whole viral genome sequencing in combination with two bioinformatically defined clustering approaches compared to interview-based contact tracing.
Methods
Viral whole genome sequences were obtained from residual diagnostic material positive for SARS-CoV-2 by PCR. Samples were collected at Klinikum rechts der Isar in Munich, Germany, from February 3, 2020, to January 10, 2021 (“TUM samples”, Suppl Fig. 1, cf. Suppl Methods). CleanPlex®9 or Artic10 SARS-CoV-2 sequencing panels were used for library preparation. Sequencing was performed on Illumina platforms for a total of 926 samples from 622 probands at three sequencing sites across Germany (cf. Suppl Methods). A proband is defined as any SARS-CoV-2-positive individual included in the study.
Bwa-mem17,18,19 that demonstrate the utility of viral genome sequencing in the identification of transmission clusters within healthcare institutions and beyond and identified divergent clusters between viral genome sequencing and interview-based contact tracing. We found that viral genome clusters derived using two different computational approaches tended to be smaller, to be more closely related genetically, and to span spatially larger portions of the hospital. Similarly, a study using SARS-CoV-2 whole genome sequencing analyses in a tertiary referral hospital in Madrid, Spain, showed that five independently introduced SARS-CoV-2 strains were responsible for what interview-based contact tracing had assumed to be a homogeneous outbreak caused by a single transmission chain17. In addition, local viral genome sequencing data covering the same time period from outside the hospital made it possible to rule out the involvement of two cases in the outbreak, because these infections were highly likely to have been community-acquired17. Czech-Sioli et al.8, who investigated 284 samples, came to the similar conclusion that temporally preceding index cases and transmission routes can be missed when using only interview-based contact tracing. Through alignment with GISAID data15, they also showed that placing sequences in a local context is essential to distinguish independent entries from in-hospital transmission. Additionally, as interview-based contact tracing cannot identify cross-infections, the efficacy of containment procedures cannot be assessed.
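The comparison between sequencing-based and interview-based clusters can be illustrated with a minimal sketch. It is not the clustering procedure used in this study: it assumes pre-aligned genomes, groups them by single linkage on pairwise nucleotide differences, and scores agreement with contact tracing clusters using the Jaccard index. The sample identifiers, the toy sequences, the distance threshold `max_dist` and the contact tracing groups are all hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count differing positions, skipping ambiguous bases (e.g. N) and gaps."""
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def single_linkage_clusters(seqs: dict, max_dist: int) -> list:
    """Group samples linked by chains of pairs that differ by <= max_dist bases."""
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent links to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= max_dist:
            parent[find(a)] = find(b)

    groups = {}
    for s in seqs:
        groups.setdefault(find(s), set()).add(s)
    return list(groups.values())


def jaccard(a: set, b: set) -> float:
    """Overlap of two sets of individuals: |intersection| / |union|."""
    return len(a & b) / len(a | b)


# Hypothetical aligned toy genomes (real genomes are ~30,000 bases long).
toy_genomes = {
    "P1": "ACGTACGTAC",
    "P2": "ACGTACGTAC",
    "P3": "ACGTACGTAT",
    "P4": "TTGTACCTAC",
    "P5": "TTGTACCTAN",
}
# Hypothetical clusters from interview-based contact tracing.
contact_clusters = [{"P1", "P2", "P4"}, {"P3", "P5"}]

genomic_clusters = single_linkage_clusters(toy_genomes, max_dist=1)
for cluster in genomic_clusters:
    best = max(jaccard(cluster, c) for c in contact_clusters)
    print(sorted(cluster), "best overlap with contact tracing:", round(best, 2))
```

A low best-overlap score for a genomic cluster indicates that no single contact tracing cluster accounts for it, which is the kind of divergence described above.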
The bioinformatic analysis of sequencing data provides information on transmission pathways that were not previously suspected. This includes, for example, staff members from service areas, which are not involved in direct patient care, for whom no connection to transmission clusters was expected. Further, by providing largely unbiased, depersonalized information, transmission chain tracing using viral genome sequencing will eliminate the (perceived) denunciation involved in personal contact tracing while at the same time providing more accurate results.
Virus genome sequencing also harbours the potential to better understand cryptic transmission events at micro-scale. Several studies describing patterns of within-host diversity found evidence of co-infections20,21,22 and suggested that co-infection with certain strains might be driven by infection from two different sources20,21. This could be similar to our observations, where the predominant virus strain changed during the hospital stay in almost 20% of the individuals for whom these data were available. While we cannot fully exclude the possibility of sample mix-ups or cross-contaminations, this surprisingly high number of individuals with more than one viral strain during their hospital treatment highlights the continued importance of individual isolation measures in SARS-CoV-2-infected individuals to limit in-hospital viral persistence and spread.
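How a change of predominant strain might be flagged from longitudinal samples can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: each sample is assumed to carry one lineage label for its predominant strain, and the proband identifiers, collection days and lineage labels below are hypothetical toy values. The final line reproduces the proportion reported in this article (14 of 73 individuals).

```python
from collections import defaultdict


def strain_switchers(samples):
    """Return (switchers, longitudinal) sets of proband IDs.

    samples: iterable of (proband_id, collection_day, lineage) tuples.
    A proband is 'longitudinal' with >= 2 samples and a 'switcher' if the
    lineage label differs between two consecutive samples.
    """
    by_proband = defaultdict(list)
    for proband_id, day, lineage in samples:
        by_proband[proband_id].append((day, lineage))

    # Keep only probands sampled at least twice, ordered by collection day.
    longitudinal = {p: sorted(v) for p, v in by_proband.items() if len(v) >= 2}
    switchers = {
        p
        for p, v in longitudinal.items()
        if any(earlier[1] != later[1] for earlier, later in zip(v, v[1:]))
    }
    return switchers, set(longitudinal)


# Hypothetical toy data: (proband, day of collection, lineage label).
toy_samples = [
    ("A", 1, "B.1"), ("A", 8, "B.1"),
    ("B", 2, "B.1"), ("B", 9, "B.1.177"),
    ("C", 3, "B.1.1"),  # single sample, so not part of the denominator
]
switchers, longitudinal = strain_switchers(toy_samples)
print(sorted(switchers), "of", sorted(longitudinal))

# Proportion reported in the article: 14 of 73 individuals.
print(f"{100 * 14 / 73:.1f}%")
```

Individuals flagged this way would still need manual review, since sample mix-ups or cross-contamination can produce the same pattern.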
While likely more precise in identifying in-hospital clusters of transmission for SARS-CoV-2, the viral genome sequencing approach still has important limitations that challenge its widespread use. It is not always possible to obtain high-quality genomes with quick turn-around times, due to, for instance, low viral load, technical limitations or high costs. This can result in incomplete datasets and, consequently, incomplete transmission chains. Also, the necessary technical and computational resources are often only available at larger-scale academic institutions, hampering widespread implementation in routine clinical practice. Lastly, genetic tracing is challenging for newly emerging pathogens with low genetic diversity, as illustrated by our data during the first wave of SARS-CoV-2 infections (January 2020 to June 2020), where it is difficult to distinguish between actual transmission and incidental genetic similarity.
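One way to ask whether a set of hospital genomes is more closely related than background diversity alone would produce is a permutation test. The sketch below is an assumption-laden illustration and not necessarily the test applied in this study: it compares the mean pairwise distance among hospital genomes with that of random, equally sized draws from the pooled hospital and background genomes, and reports the p-value as (b + 1)/(m + 1), so that it can never be zero (cf. Phipson & Smyth in the reference list). All genomes, identifiers and parameter values are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def hamming(a: str, b: str) -> int:
    """Count differing positions, skipping ambiguous bases (e.g. N) and gaps."""
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def mean_pairwise(ids, dist) -> float:
    """Mean distance over all unordered pairs of the given sample IDs."""
    return mean(dist(a, b) for a, b in combinations(ids, 2))


def permutation_p(hospital, background, dist, n_perm=999, seed=0) -> float:
    """One-sided p-value that hospital genomes are closer than random draws."""
    rng = random.Random(seed)
    pooled = list(hospital) + list(background)
    observed = mean_pairwise(hospital, dist)
    # b = number of random draws at least as closely related as the observed set.
    b = sum(
        mean_pairwise(rng.sample(pooled, len(hospital)), dist) <= observed
        for _ in range(n_perm)
    )
    return (b + 1) / (n_perm + 1)


# Hypothetical toy data: 5 identical hospital genomes, 10 distinct background genomes.
ref = "A" * 20
genomes = {f"h{i}": ref for i in range(5)}
for i in range(10):
    g = list(ref)
    g[i] = "C"
    g[i + 10] = "C"
    genomes[f"b{i}"] = "".join(g)

hospital_ids = [k for k in genomes if k.startswith("h")]
background_ids = [k for k in genomes if k.startswith("b")]


def dist(a, b):
    return hamming(genomes[a], genomes[b])


p_value = permutation_p(hospital_ids, background_ids, dist)
print("permutation p-value:", p_value)
```

When overall genetic diversity is low, as during the first wave, random draws are themselves closely related, so such a test loses power; this mirrors the limitation described above.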
In our experience, interview-based contact tracing alone is not sufficient to fully map transmission pathways and clusters in a large university hospital. Complementing it with viral genome sequencing data proved very beneficial, especially by highlighting in-hospital transmission chains that were spatially more expansive than expected. This information is of paramount importance for efficiently containing transmission chains.
While the SARS-CoV-2 pandemic provided an impetus to the implementation of viral genome sequencing in clinical practice, the same advantages will also apply to transmission tracing of nearly all other pathogens, and metagenomic approaches can further broaden the scope of pathogens detected.
Data availability
All consensus sequences that met the quality requirements were uploaded to the GISAID repository, https://gisaid.org/, with the corresponding metadata. Genome sequences and associated metadata on GISAID are accessible at https://doi.org/10.55876/gis8.231229gn. The GISAID supplemental table can be found in the supplementary material.
References
Wölfel, R. et al. Virological assessment of hospitalized patients with COVID-2019. Nature 581(7809), 465–469. https://doi.org/10.1038/s41586-020-2196-x (2020).
Gonzalez-Reiche, A. S. et al. Introductions and early spread of SARS-CoV-2 in the New York City area. Science 369(6501), 297–301. https://doi.org/10.1126/science.abc1917 (2020).
Deng, X. et al. Genomic surveillance reveals multiple introductions of SARS-CoV-2 into Northern California. Science 369(6503), 582–587. https://doi.org/10.1126/science.abb9263 (2020).
Gudbjartsson, D. F. et al. Spread of SARS-CoV-2 in the Icelandic population. N. Engl. J. Med. 382(24), 2302–2315. https://doi.org/10.1056/NEJMoa2006100 (2020).
Muenchhoff, M. et al. Genomic epidemiology reveals multiple introductions of SARS-CoV-2 followed by community and nosocomial spread, Germany, February to May 2020. Euro Surveill. 26(43), 2002066. https://doi.org/10.2807/1560-7917.ES.2021.26.43.2002066 (2021).
Meredith, L. W. et al. Rapid implementation of SARS-CoV-2 sequencing to investigate cases of health-care associated COVID-19: A prospective genomic surveillance study. Lancet Infect. Dis. 20(11), 1263–1271. https://doi.org/10.1016/S1473-3099(20)30562-4 (2020).
Lucey, M. et al. Whole-genome sequencing to track SARS-CoV-2 transmission in nosocomial outbreaks. Clin. Infect. Dis. https://doi.org/10.1093/cid/ciaa1433 (2021).
Czech-Sioli, M. et al. Integration of sequencing and epidemiological data for surveillance of SARS-CoV-2 infections in a tertiary-care hospital. Clin. Infect. Dis. https://doi.org/10.1093/cid/ciac484 (2022).
UG4001-04-CleanPlex-SARS-CoV-2-Panel-User-guide.pdf. Accessed: Jun. 17, 2021. [Online]. Available: https://www.paragongenomics.com/wp-content/uploads/2021/02/UG4001-04-CleanPlex-SARS-CoV-2-Panel-User-guide.pdf
Mwakibete, H., et al., ARTIC-NEB: SARS-CoV-2 Library PrepV.4. protocols.io, Feb. 2021. [Online]. Available: https://www.protocols.io/view/artic-neb-sars-cov-2-library-prep-br77m9rn.pdf#page=1&zoom=auto,-23,848
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM, arXiv e-prints, arXiv:1303.3997, (2013).
Garrison, E., & Marth, G. Haplotype-based variant detection from short-read sequencing, arXiv e-prints, arXiv:1207.3907, (2012).
Hadfield, J. et al. Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics 34(23), 4121–4123. https://doi.org/10.1093/bioinformatics/bty407 (2018).
Release v13 nextstrain/ncov, GitHub. Accessed: Jan. 13, 2024. [Online]. Available: https://github.com/nextstrain/ncov/releases/tag/v13
Khare, S. et al. GISAID’s role in pandemic response. CCDCW 3(49), 1049–1051. https://doi.org/10.46234/ccdcw2021.255 (2021).
Phipson, B. & Smyth, G. K. Permutation P-values should never be zero: Calculating exact P-values when permutations are randomly drawn. Stat. Appl. Genetics Mol. Biol. https://doi.org/10.2202/1544-6115.1585 (2010).
Pérez-Lago, L. et al. Overlapping of independent SARS-CoV-2 nosocomial transmissions in a complex outbreak. mSphere 6(4), e00389-e421. https://doi.org/10.1128/mSphere.00389-21 (2021).
Baumgarte, S. et al. Investigation of a limited but explosive COVID-19 outbreak in a German secondary school. Viruses 14(1), 87. https://doi.org/10.3390/v14010087 (2022).
Haanappel, C. P. et al. Combining epidemiological data and whole genome sequencing to understand SARS-CoV-2 transmission dynamics in a large tertiary care hospital during the first COVID-19 wave in The Netherlands focusing on healthcare workers. Antimicrob. Resist. Infect. Control 12(1), 46. https://doi.org/10.1186/s13756-023-01247-7 (2023).
Tonkin-Hill, G. et al. Patterns of within-host genetic diversity in SARS-CoV-2. eLife 10, e66857. https://doi.org/10.7554/eLife.66857 (2021).
Pérez-Lago, L. et al. SARS-CoV-2 superinfection and reinfection with three different strains. Transbound. Emerg. Dis. https://doi.org/10.1111/tbed.14352 (2021).
Dezordi, F. Z. et al. Unusual SARS-CoV-2 intrahost diversity reveals lineage superinfection. Microb. Genom 8(3), 000751. https://doi.org/10.1099/mgen.0.000751 (2022).
Acknowledgements
We gratefully acknowledge all data contributors, i.e., the Authors and their Originating laboratories responsible for obtaining the specimens, and their submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based. We thank the whole teams of hospital hygiene and virus diagnostics, and in particular Valentina Beka, Till Bunse, Dieter Hoffmann and Hedwig Roggendorf for their assistance in the project.
Funding
Open Access funding enabled and organized by Projekt DEAL. The study was supported via the National Network of University Medicine (NUM) funded by the German Federal Ministry of Education and Research (BMBF). ECS was supported by the Munich Clinician Scientist Program (MCSP) and the German Research Foundation (DFG, SCHU 2419/2-1). MvK acknowledges funding from the BMBF grant number 01KI2016. AG, HB, UP and OK report support from the Free State of Bavaria via the research initiative Bay-VOC. MM, OK and UP report funding from the Bavarian Ministry of Science and Culture via the FOR-COVID consortium.
Author information
Contributions
E.E., E.C.S., J.G. and U.P. were responsible for the conception and design of the project. The acquisition and analysis of the data was carried out by A.G., A.K., N.H.S., T.M., S.D., A.A., M.S., S.P., C.E., A.R., A.T., M.V.K., F.G., C.P.C., D.H.B., M.M., H.B., O.T.K. and J.G. E.E., E.C.S., A.G., A.K., N.H.S., J.G. and U.P. interpreted the data. All authors were involved in substantial revisions and read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Esser, E., Schulte, E.C., Graf, A. et al. Viral genome sequencing to decipher in-hospital SARS-CoV-2 transmission events. Sci Rep 14, 5768 (2024). https://doi.org/10.1038/s41598-024-56162-7
DOI: https://doi.org/10.1038/s41598-024-56162-7