Recurrent neural network for predicting absence of heterozygosity from low pass WGS with ultra-low depth

Tang, Fei; Wang, Zhonghua; Sun, Yan; Fan, Linlin; Yang, Yun; Guo, Xueqin; Wang, Yaoshen; Yan, Saiying; Qiao, Zhihong; Li, Yun; Jiang, Ting; Wang, Xiaoli; Man, Jianfen; Wang, Lina; Wang, Shunyao; Peng, Huanhuan; Peng, Zhiyu; Xie, Xiaoyuan; Song, Lijie

doi:10.1186/s12864-024-10400-4

Research
Open access
Published: 14 May 2024

Volume 25, article number 470 (2024)

Abstract

Background

The absence of heterozygosity (AOH) is a kind of genomic change characterized by a long contiguous region of homozygous alleles in a chromosome, which may cause human genetic disorders. However, no low-pass whole genome sequencing (LP-WGS) method has been reported for detecting AOH at a depth of less than onefold. We developed a method, termed CNVseq-AOH, for predicting AOH from ultra-low-depth LP-WGS data, which overcomes the sparse nature of typical LP-WGS data by combining population-based haplotype information, adjustable sliding windows, and a recurrent neural network (RNN). We tested the feasibility of CNVseq-AOH for the detection of AOH in 409 cases (11 AOH regions for model training and 863 AOH regions for validation) from the 1000 Genomes Project (1KGP). AOH detection using CNVseq-AOH was also performed on 6 clinical cases with AOHs previously ascertained by whole exome sequencing (WES).

Results

Using SNP-based microarray results as the reference (an AOH detected by CNVseq-AOH was considered concordant when it overlapped an AOH detected by chromosomal microarray analysis by at least 50%), 409 samples (863 AOH regions) in the 1KGP were used for concordance analysis. For 784 AOHs on autosomes and 79 AOHs on the X chromosome, CNVseq-AOH predicted AOHs with concordance rates of 96.23% and 59.49%, respectively, based on 0.1-fold LP-WGS data, a depth far lower than the current standard in the field. Using 0.1-fold LP-WGS data, CNVseq-AOH revealed 5 additional AOHs (larger than 10 Mb in size) in the 409 samples. We further analyzed AOHs larger than 10 Mb, the size recommended for reporting the possibility of uniparental disomy (UPD). For the 291 AOH regions larger than 10 Mb, CNVseq-AOH predicted AOHs with a concordance rate of 99.66% using only 0.1-fold LP-WGS data. In the 6 clinical cases, CNVseq-AOH revealed all 15 known AOH regions.

Conclusions

Here we report a method for analyzing LP-WGS data to accurately identify regions of AOH, which has great potential to improve genetic testing for AOH.

Background

The absence of heterozygosity (AOH) is a kind of genomic change characterized by a long contiguous region of homozygous alleles in a chromosome [1]. Several underlying mechanisms of AOH have been reported, such as meiotic segregation errors [2], parental consanguinity [3], or complex chromosomal rearrangements [4]. AOHs do not necessarily have clinical consequences; however, they may cause serious pathogenic effects when they are related to imprinting effects [5] or autosomal recessive disease mechanisms [3]. For example, more than 25% of Prader–Willi syndrome cases are caused by isodisomy (the inheritance of both homologs from a single parent, where only one homolog of that parent is present) or heterodisomy (the inheritance of both homologs from a single parent, where both homologs of that parent are present) [6]. Sahoo et al. found that whole-genome uniparental isodisomy (UPD) caused pregnancy loss in ~ 1% of cases [7]. In a study of rare autosomal trisomy by genome-wide noninvasive prenatal testing, the authors found that 4.16% of cases with rare autosomal trisomies originate from uniparental disomy [15], while variants on the X chromosome were phased by Eagle2 (without the pedigree-based correction) [16]. This inconsistency in variant phasing may influence the probability calculation of CNVseq-AOH. We therefore calculated the concordance rate separately for autosomes and the X chromosome.

For the 784 AOHs on autosomes, the prediction sensitivity of CNVseq-AOH generally increased with depth (Fig. 2a). As expected, the sensitivity of CNVseq-AOH was 100% (784/784) when the depth was ≥ 1-fold (Supplementary Table 1). At a depth of 0.5-fold, the sensitivity reached 99.9%: only one AOH, with an overlap of 47%, was missed by CNVseq-AOH (Supplementary Table 1). The sensitivity of CNVseq-AOH still reached 96.23% at a depth of 0.1-fold, which is far lower than the 4- to 5-fold depth required in current studies [11, 17]. For the 79 AOHs on the X chromosome, the sensitivity of CNVseq-AOH was 59.49% (47/79) at a depth of 0.1-fold. The prediction sensitivity of CNVseq-AOH also increased with depth (Fig. 2b). However, even at a depth of 3-fold, the prediction sensitivity was still not 100%. Six AOHs were missed by CNVseq-AOH, with overlaps ranging from 18% to 44%. These 6 AOHs were located in similar regions on the X chromosome (Supplementary Table 1) and were also missed by CNVseq-AOH at 0.5-fold and 1-fold depth. We further calculated the number of SNPs per 1 Mb on all the chromosomes in the 1KGP. The number of SNPs per 1 Mb on the X chromosome (mean of 18,439.9) was significantly lower than that on the autosomes (mean of 26,839.9) (t-test, P-value of 2.94E-12). One reasonable explanation for the relatively low sensitivity for AOHs on the X chromosome is that, compared with the autosomes, the variant information in the phasing results of the X chromosome in the 1KGP was insufficient to calculate the probabilities for resampled reads.
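The SNP-density comparison above can be illustrated with a small sketch. It bins SNP positions into 1 Mb windows and computes a two-sample t statistic. The function names, the toy positions, and the choice of Welch's unequal-variance form are assumptions made for illustration; the text only states that a t-test was used.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance

MB = 1_000_000


def snps_per_mb(positions, chrom_length):
    """Count SNPs in consecutive 1 Mb bins along one chromosome.

    `positions` are 1-based coordinates, as in a VCF file.
    """
    n_bins = (chrom_length + MB - 1) // MB
    counts = Counter((pos - 1) // MB for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def welch_t(a, b):
    """Welch's t statistic for two samples (assumed test variant)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Toy data only: two 3 Mb "chromosomes" with different SNP densities.
toy_autosome = snps_per_mb(
    [5, 10, 999_999, 1_000_001, 1_500_000, 2_000_001], 3 * MB
)
toy_chrx = snps_per_mb([7, 1_200_000], 3 * MB)
t_statistic = welch_t(toy_autosome, toy_chrx)
```

A positive `t_statistic` here means the first sample has the higher mean density per bin; turning it into a P-value would additionally require the t distribution, which is left out of this sketch.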

For SNP-based microarrays, a threshold of ≥ 10 Mb has been suggested for reporting AOH [18]. In the real clinical setting, AOH larger than 10 Mb in one chromosome is recommended for reporting the possibility of UPD [19, 20]. There were 291 AOH regions larger than 10 Mb in the 1KGP. For these AOHs, CNVseq-AOH predicted AOHs with a sensitivity of 100% (291/291) when the depth was ≥ 0.5-fold (Supplementary Table 1). At a depth of 0.1-fold, the sensitivity reached 99.66% (290/291). Even at a depth of 0.05-fold, CNVseq-AOH provided a prediction sensitivity of 94.5% (275/291).
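As a rough illustration of how such sensitivities can be tallied, the sketch below counts a reference (CMA) AOH as detected when at least 50% of its length is covered by predicted AOHs, and optionally restricts the evaluation to large regions. The interval representation (half-open `(start, end)` pairs on one chromosome), the function names, and the one-sided definition of overlap are assumptions; the exact overlap definition is not spelled out in this text.

```python
def covered_fraction(ref, calls):
    """Fraction of the reference interval covered by the union of calls."""
    ref_start, ref_end = ref
    # Clip every overlapping call to the reference interval.
    clipped = sorted(
        (max(start, ref_start), min(end, ref_end))
        for start, end in calls
        if start < ref_end and end > ref_start
    )
    covered, cursor = 0, ref_start
    for start, end in clipped:
        start = max(start, cursor)  # skip the part already counted
        if end > start:
            covered += end - start
            cursor = end
    return covered / (ref_end - ref_start)


def sensitivity(ref_regions, calls, min_overlap=0.5, min_size=0):
    """Return (detected, total) over reference regions of at least min_size."""
    kept = [r for r in ref_regions if r[1] - r[0] >= min_size]
    detected = sum(covered_fraction(r, calls) >= min_overlap for r in kept)
    return detected, len(kept)


# Toy example: three reference AOHs and four predicted AOHs.
reference = [(0, 20_000_000), (30_000_000, 35_000_000), (50_000_000, 62_000_000)]
predicted = [
    (1_000_000, 12_000_000),
    (11_000_000, 15_000_000),
    (33_000_000, 34_000_000),
    (50_000_000, 62_000_000),
]
all_sizes = sensitivity(reference, predicted)
large_only = sensitivity(reference, predicted, min_size=10_000_000)
```

In the toy data the 5 Mb reference region is only 20% covered and counts as missed, so restricting to regions of at least 10 Mb raises the detected fraction, mirroring the pattern reported above.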

For 0.1-fold LP-WGS data, it takes an average of 11 min to process a single sample using an 8-core CPU with 8 GB of RAM (from data alignment to reporting), including an average of 10 min for alignment, 25 s for feature learning, and 10 s for AOH prediction and reporting.

Additional AOHs detected by CNVseq-AOH

Compared with CMA, CNVseq-AOH detected additional AOHs. We analyzed the additional AOHs detected by CNVseq-AOH at a depth of 0.1-fold. A total of 267 additional AOHs were detected in the 409 samples by CNVseq-AOH, approximately 0.65 AOHs per sample. The number of additionally detected AOHs decreased with AOH length (Supplementary Fig. 1). Using high-coverage data, we further validated these AOHs by visualization with an in-house script. The results showed that 50.56% (135/267) of the additional AOHs were true positives (Supplementary Table 2; Supplementary Fig. 2). In the clinical setting, a threshold of > 10 Mb is recommended for reporting the possibility of UPD [19, 20]. Using a threshold of > 10 Mb, only 5 additional AOHs were detected by CNVseq-AOH in the 409 samples at 0.1-fold depth (Supplementary Table 2).

Interestingly, using CNVseq-AOH we found an AOH region (seq[GRCh38] hmz(6)(p12.3q12) chr6:g. 47568317_64568317hmz) that crossed the centromeric region of chromosome 6 in one case (Fig. 3c, d). Although there were sufficient markers in this region (Fig. 3a), no AOH was reported there by CMA, which indirectly reflects the detection performance of CNVseq-AOH for regions crossing the centromere. This AOH was further validated using high-coverage data, which also showed positive signals in this region (Fig. 3b).

RNN vs. hidden Markov model

RNNs and hidden Markov models (HMMs) are both widely used for processing sequential data. The HMM, a probabilistic model, is particularly effective for problems involving time series data. To our knowledge, no published study has used an HMM for the detection of AOH, so no reference method could be cited. In this study, an HMM with Gaussian emissions (the “hmmlearn.hmm.GaussianHMM” module in Python) was used for AOH prediction. We established an HMM with 5 hidden states and a full covariance matrix, and compared it with CNVseq-AOH for AOH prediction. The prediction sensitivity of CNVseq-AOH was better than that of the HMM-based method across the tested depths (Fig. 4).
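To make the idea of HMM-based segmentation concrete, here is a standard-library sketch of Viterbi decoding for an HMM with Gaussian emissions. It is deliberately simpler than the baseline described above: two states instead of five, one-dimensional emissions, and hand-set parameters rather than parameters fitted with hmmlearn. The state meanings, the parameter values, and the toy per-window signal are all hypothetical.

```python
from math import log, pi


def gaussian_logpdf(x, mu, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * log(2 * pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def viterbi(obs, start_p, trans_p, means, sigmas):
    """Most likely hidden-state path for a Gaussian-emission HMM."""
    n_states = len(start_p)
    score = [
        log(start_p[k]) + gaussian_logpdf(obs[0], means[k], sigmas[k])
        for k in range(n_states)
    ]
    back = []  # back[t][k]: best predecessor of state k at step t + 1
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for k in range(n_states):
            best = max(range(n_states), key=lambda j: prev[j] + log(trans_p[j][k]))
            pointers.append(best)
            score.append(
                prev[best]
                + log(trans_p[best][k])
                + gaussian_logpdf(x, means[k], sigmas[k])
            )
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical per-window signal: values near 1 suggest an AOH-like window.
signal = [0.1, 0.0, 0.2, 0.9, 1.1, 1.0, 0.95, 0.1, 0.0]
states = viterbi(
    signal,
    start_p=[0.5, 0.5],
    trans_p=[[0.9, 0.1], [0.1, 0.9]],  # "sticky" states favour long segments
    means=[0.0, 1.0],                  # state 0: background, state 1: AOH-like
    sigmas=[0.2, 0.2],
)
```

The sticky transition matrix is what makes the decoder prefer long contiguous segments over window-by-window flips, which is the property that makes HMMs a natural baseline for region calling.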

Validation of CNVseq-AOH with 6 clinical cases

We further applied CNVseq-AOH to 6 clinical cases with previously detected AOHs (Table 1). A mean depth of 0.573-fold (raw reads) was obtained for each sample. Uniquely aligned high-quality reads (UAHRs) were used for the detection of AOH. A UAHR was defined as a read that was uniquely aligned to the human reference genome with a quality value of more than 20 per base, contained no partial adapter sequences, and had no more than 5% undetermined bases over the read length.
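The UAHR definition can be read as a simple per-read filter. The sketch below is one possible interpretation, not the authors' pipeline: the `Read` fields are hypothetical, "quality value of more than 20 per base" is taken to mean that every base has a Phred score above 20, and undetermined bases are taken to be `N` calls.

```python
from dataclasses import dataclass


@dataclass
class Read:
    sequence: str
    base_qualities: list  # Phred scores, one per base (assumed representation)
    n_alignments: int     # number of genome locations the read aligns to
    has_adapter: bool     # True if any partial adapter sequence was found


def is_uahr(read, min_quality=20, max_n_fraction=0.05):
    """Return True if the read passes all UAHR criteria as interpreted here."""
    if read.n_alignments != 1:  # must be uniquely aligned
        return False
    if read.has_adapter:        # no partial adapter sequence allowed
        return False
    if not read.sequence or min(read.base_qualities) <= min_quality:
        return False            # every base must exceed the quality cutoff
    n_fraction = read.sequence.upper().count("N") / len(read.sequence)
    return n_fraction <= max_n_fraction  # at most 5% undetermined bases


good = Read("ACGT" * 10, [30] * 40, n_alignments=1, has_adapter=False)
multi_mapped = Read("ACGT" * 10, [30] * 40, n_alignments=2, has_adapter=False)
```

If the quality criterion instead refers to mapping quality or to a mean base quality, only the third check would need to change.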

Table 1 Validation of CNVseq-AOH with 6 clinical cases

CNVseq-AOH detected all 15 AOH regions (Table 1). In some cases with multiple known AOHs (Case 1, Case 2, and Case 5), a greater number of AOH regions were detected by CMA, probably because several AOH regions were split into sub-regions by CMA.

Discussion

The recurrent neural network (RNN) is a very popular class of neural network that is especially useful for sequential data. A neuron in an RNN uses its internal state to “memorize” previous inputs and combines this with the current input to determine the next output state. RNNs have been widely used in natural language processing (NLP) [21]. However, the application of RNNs in human genomic research is still rare. In this study, we described an RNN-based method, CNVseq-AOH, for predicting the absence of heterozygosity using LP-WGS. To the best of our knowledge, CNVseq-AOH is the first application combining population-based haplotype information, adjustable sliding windows, and an RNN in genetic testing. CNVseq-AOH shows the feasibility of using ultra-low sequencing depth for the detection of clinically significant AOHs and demonstrates its potential in genetic testing.
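The "memory" described above can be seen in a minimal recurrent cell. The sketch below implements a plain Elman-style update, h_t = tanh(W_x x_t + W_h h_(t-1) + b), followed by a sigmoid output per step. The architecture, weights, and inputs are illustrative assumptions; the actual network used by CNVseq-AOH is not specified in this section.

```python
from math import exp, tanh


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent update: the new state mixes the input and the old state."""
    return [
        tanh(
            sum(w * xi for w, xi in zip(w_x[k], x))
            + sum(w * hj for w, hj in zip(w_h[k], h_prev))
            + b[k]
        )
        for k in range(len(b))
    ]


def run_rnn(inputs, w_x, w_h, b, w_out, b_out):
    """Run the cell over a sequence and emit one probability per step."""
    h = [0.0] * len(b)  # initial hidden state
    outputs = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        logit = sum(w * hk for w, hk in zip(w_out, h)) + b_out
        outputs.append(1.0 / (1.0 + exp(-logit)))  # sigmoid
    return outputs


# The same input is fed twice; only the hidden state differs between steps.
same_input_twice = [[1.0], [1.0]]
with_memory = run_rnn(same_input_twice, [[1.0]], [[0.5]], [0.0], [1.0], 0.0)
without_memory = run_rnn(same_input_twice, [[1.0]], [[0.0]], [0.0], [1.0], 0.0)
```

With a non-zero recurrent weight the second output differs from the first even though the inputs are identical, because the hidden state carries information forward; setting the recurrent weight to zero removes that effect.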

One of the key innovations of CNVseq-AOH is the use of population-based haplotype information. Based on our testing, population-based haplotype information greatly influenced the feasibility of CNVseq-AOH. For the 409 samples in the current study, ancestry-matched populations (or genetically similar populations) were used for analysis. We further compared, at 0.1-fold depth, the sensitivity obtained when using ancestry-matched populations for feature learning with that obtained when using all available haplotype information from multiple ethnicities. Using a threshold of 50% overlap with the AOHs detected by CMA, the sensitivity of CNVseq-AOH reached 93.40% (806/863) when ancestry-matched populations were used for feature learning. When all available haplotype information from multiple ethnicities was used instead, the sensitivity was only 60.95% (526/863), and the accuracy was also significantly affected (Supplementary Fig. 3). Not all populations are captured in the 1KGP, and the number of samples per population varies considerably. This may influence the accuracy of our method and impede the wide application of CNVseq-AOH. Expanding the data collection to include new populations and samples may solve the problem.

One limitation of CNVseq-AOH is that it cannot currently be used for the detection of mosaic AOH, so we excluded the 4 cases with mosaic AOH from testing. Nevertheless, based on the signals for these 4 cases (Supplementary Fig. 4), CNVseq-AOH has the potential to predict mosaic AOH. This may require a different model and a large number of ascertained positive cases with mosaic AOHs for model training, which is an interesting topic but beyond the scope of this study. Another limitation of the current study is the performance of CNVseq-AOH for the detection of AOHs on the X chromosome. At a depth of 0.1-fold, a detection sensitivity of only 59.49% was achieved for the 79 AOHs on the X chromosome in the 1KGP. In phase three of the 1KGP, variants on the X chromosome were phased using Eagle2 without the pedigree-based correction (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20201028_3202_phased/README_SNV_INDEL_phasing_111822.pdf), resulting in fewer SNPs per 1 Mb. The information on biallelic SNPs in the VCF file for the X chromosome is therefore insufficient to calculate the probabilities for resampled reads. This is a limitation of the reference panel rather than of CNVseq-AOH itself: with sufficient information in the reference panel, CNVseq-AOH also has the potential to provide high prediction sensitivity for AOHs located on the X chromosome. Next, we plan to reanalyze these samples to optimize the performance of CNVseq-AOH for the detection of AOHs on the X chromosome.

In this study, we investigated the effect of sequencing depth on model performance in the 1KGP. In general, the prediction sensitivity of CNVseq-AOH increased with sequencing depth. However, data in the 1KGP were generated using various sequencing parameters (different sample types, library construction protocols, sequencing platforms, etc.), so the evaluation of sequencing depth may be biased. For clinical laboratories, depth evaluation using real clinical samples and uniform sequencing parameters is necessary before clinical application.

Conclusions

In summary, we developed a method for predicting the absence of heterozygosity using LP-WGS data, which overcomes the sparse nature of typical LP-WGS by combining population-based haplotype information, adjustable sliding windows, and an RNN. Next, we plan to apply our method to pregnant women undergoing prenatal diagnosis, thereby further evaluating the performance and potential utility of CNVseq-AOH under realistic clinical scenarios.

Availability of data and materials

The raw data of the cases with previously identified AOH events based on SNP-based microarrays and the high-coverage WGS data from the 1KGP are available in the European Nucleotide Archive (https://www.ebi.ac.uk/ena/browser/view/PRJEB31736?show=reads) under the accession number PRJEB31736. The data of the 6 clinical cases generated and analyzed during the current study are not publicly available, as they are patient samples and sharing them could compromise research participant privacy.

References

1. Liu J, He Z, Lin S, Wang Y, Huang L, Huang X, Luo Y. Absence of heterozygosity detected by single-nucleotide polymorphism array in prenatal diagnosis. Ultrasound Obstet Gynecol. 2021;57(2):314–23.
2. Potapova T, Gorbsky GJ. The consequences of chromosome segregation errors in mitosis and meiosis. Biology (Basel). 2017;6(1):12.
3. Rehder CW, David KL, Hirsch B, Toriello HV, Wilson CM, Kearney HM. American College of Medical Genetics and Genomics: standards and guidelines for documenting suspected consanguinity as an incidental finding of genomic testing. Genet Med. 2013;15(2):150–2.
4. Carvalho CM, Pfundt R, King DA, Lindsay SJ, Zuccherato LW, Macville MV, Liu P, Johnson D, Stankiewicz P, Brown CW, et al. Absence of heterozygosity due to template switching during replicative rearrangements. Am J Hum Genet. 2015;96(4):555–64.
5. Yauy K, de Leeuw N, Yntema HG, Pfundt R, Gilissen C. Accurate detection of clinically relevant uniparental disomy from exome sequencing data. Genet Med. 2020;22(4):803–8.
6. Dong Z, Zhang J, Hu P, Chen H, Xu J, Tian Q, Meng L, Ye Y, Wang J, Zhang M, et al. Low-pass whole-genome sequencing in clinical cytogenetics: a validated approach. Genet Med. 2016;18(9):940–8.
7. Sahoo T, Dzidic N, Strecker MN, Commander S, Travis MK, Doherty C, Tyson RW, Mendoza AE, Stephenson M, Dise CA, et al. Comprehensive genetic analysis of pregnancy loss by chromosomal microarrays: outcomes, benefits, and challenges. Genet Med. 2017;19(1):83–9.
8. Xiang J, Li R, He J, Wang X, Yao L, Song N, Fu F, Zhou S, Wang J, Gao X, et al. Clinical impacts of genome-wide noninvasive prenatal testing for rare autosomal trisomy. Am J Obstet Gynecol MFM. 2023;5(1):100790.
9. Wang H, Dong Z, Zhang R, Chau MHK, Yang Z, Tsang KYC, Wong HK, Gui B, Meng Z, Xiao K, et al. Low-pass genome sequencing versus chromosomal microarray analysis: implementation in prenatal diagnosis. Genet Med. 2020;22(3):500–10.
10. Chau MHK, Wang H, Lai Y, Zhang Y, Xu F, Tang Y, Wang Y, Chen Z, Leung TY, Chung JPW, et al. Low-pass genome sequencing: a validated method in clinical cytogenetics. Hum Genet. 2020;139(11):1403–15.
11. Dong Z, Chau MHK, Zhang Y, Yang Z, Shi M, Wah YM, Kwok YK, Leung TY, Morton CC, Choy KW. Low-pass genome sequencing-based detection of absence of heterozygosity: validation in clinical cytogenetics. Genet Med. 2021;23(7):1225–33.
12. Qian Y, Sun Y, Guo X, Song L, Sun Y, Gao X, Liu B, Xu Y, Chen N, Chen M, et al. Validation and depth evaluation of low-pass genome sequencing in prenatal diagnosis using 387 amniotic fluid samples. J Med Genet. 2023;60(10):933–8.
13. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
14. Ariad D, Yan SM, Victor AR, Barnes FL, Zouves CG, Viotti M, McCoy RC. Haplotype-aware inference of human chromosome abnormalities. Proc Natl Acad Sci USA. 2021;118(46):e2109307118.
15. Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nat Methods. 2011;9(2):179–81.
16. Loh PR, Danecek P, Palamara PF, Fuchsberger C, Reshef YA, Finucane HK, Schoenherr S, Forer L, McCarthy S, Abecasis GR, et al. Reference-based phasing using the Haplotype reference consortium panel. Nat Genet. 2016;48(11):1443–8.
17. Lu Y, Jiang Y, Zhou X, Hao N, Lu G, Guo X, Guo R, Liu W, Xu C, Chang J, et al. Evaluation and analysis of Absence of Homozygosity (AOH) using chromosome analysis by medium coverage whole genome sequencing (CMA-seq) in prenatal diagnosis. Diagnostics (Basel). 2023;13(3):560.
18. Papenhausen P, Schwartz S, Risheg H, Keitges E, Gadi I, Burnside RD, Jaswaney V, Pappas J, Pasion R, Friedman K, et al. UPD detection using homozygosity profiling with a SNP genotyping microarray. Am J Med Genet A. 2011;155A(4):757–68.
19. Armour CM, Dougan SD, Brock JA, Chari R, Chodirker BN, DeBie I, Evans JA, Gibson WT, Kolomietz E, Nelson TN, et al. Practice guideline: joint CCMG-SOGC recommendations for the use of chromosomal microarray analysis for prenatal diagnosis and assessment of fetal loss in Canada. J Med Genet. 2018;55(4):215–21.
20. Liu W, Lu J, Zhang J, Li R, Lin S, Zhang Y, Wang Y, Yin A. A consensus recommendation for the interpretation and reporting of copy number variation and regions of homozygosity in prenatal genetic diagnosis. Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 2020;37(7):701–8.
21. Rezaeenour J, Ahmadi M, Jelodar H, Shahrooei R. Systematic review of content analysis algorithms based on deep neural networks. Multimed Tools Appl. 2023;82(12):17879–903.

Acknowledgements

Not applicable.

Funding

This study was supported by the National Key R&D Program of China (No. 2023YFC2705600). This program is a non-profit government research project, and the funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Fei Tang, Zhonghua Wang, Yan Sun, Linlin Fan, and Yun Yang contributed equally to this work.

Authors and Affiliations

Clin Lab, BGI Genomics, Tianjin, 300308, China
Fei Tang, Zhonghua Wang, Linlin Fan, Yaoshen Wang, Saiying Yan, Zhihong Qiao & Lijie Song
BGI Genomics, Shenzhen, 518083, China
Yan Sun, Shunyao Wang & Zhiyu Peng
Clin Lab, BGI Genomics, Wuhan, 430074, China
Yun Yang, Xueqin Guo, Jianfen Man & Lina Wang
Clin Lab, BGI Genomics, Shenzhen, 518083, China
Yun Li, Ting Jiang, Xiaoli Wang & Huanhuan Peng
Tianjin Women’s and Children’s Health Center, Tianjin, 300070, China
Xiaoyuan Xie
DTU Bioengineering, Technical University of Denmark, 2800, Kongens Lyngby, Denmark
Lijie Song

Contributions

Conceptualization: Lijie Song, Xiaoyuan Xie, Zhonghua Wang, Yan Sun. Data Curation: Linlin Fan, Yun Yang, Xueqin Guo, Zhihong Qiao, Yun Li, Ting Jiang, Xiaoli Wang. Formal Analysis: Fei Tang, Zhonghua Wang, Yan Sun, Yaoshen Wang, Saiying Yan, Jianfen Man, Lina Wang. Funding Acquisition: Yan Sun, Yun Yang. Investigation: Fei Tang, Zhonghua Wang, Yan Sun, Yaoshen Wang, Saiying Yan, Jianfen Man, Lina Wang, Shunyao Wang. Methodology: Fei Tang, Zhonghua Wang, Yan Sun, Yaoshen Wang, Saiying Yan, Jianfen Man, Lina Wang. Project Administration: Lijie Song, Xiaoyuan Xie. Resources: Lijie Song, Xiaoyuan Xie, Zhonghua Wang, Yan Sun, Zhiyu Peng, Huanhuan Peng. Software: Fei Tang, Zhonghua Wang. Supervision: Lijie Song, Xiaoyuan Xie. Validation: Fei Tang, Zhonghua Wang, Yan Sun, Yaoshen Wang, Saiying Yan, Jianfen Man, Lina Wang. Visualization: Fei Tang, Zhonghua Wang, Yan Sun. Writing – Original Draft Preparation: Yan Sun. Writing – Review & Editing: All authors.

Corresponding authors

Correspondence to Xiaoyuan Xie or Lijie Song.

Ethics declarations

Ethics approval and consent to participate

This study and all the protocols were approved by the Institutional Review Board of BGI (No. BGI-IRB 22062). Informed consent for the anonymous usage of remaining samples and data for scientific research and possible publication was obtained from all participants. This study was performed in accordance with the principles of the Helsinki Declaration.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary file 1.

Supplementary file 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Tang, F., Wang, Z., Sun, Y. et al. Recurrent neural network for predicting absence of heterozygosity from low pass WGS with ultra-low depth. BMC Genomics 25, 470 (2024). https://doi.org/10.1186/s12864-024-10400-4

Received: 17 November 2023
Accepted: 09 May 2024
Published: 14 May 2024
DOI: https://doi.org/10.1186/s12864-024-10400-4