Current Progress of Bioinformatics for Human Health

Methodologies of Multi-Omics Data Integration and Data Mining

Part of the book series: Translational Bioinformatics ((TRBIO,volume 19))


Abstract

Massive biological data offer a broad view of the dynamics of human health and disease from multiple aspects. Over the past decade, a tremendous volume of biological data has been produced in different ways. How to analyze these high-volume data precisely and efficiently, and how to take advantage of them, has become one of the main bottlenecks for precision medicine. Newly developed bioinformatics tools are bringing opportunities to meet these challenges, from sequence-based algorithms such as genome assembly and genome comparison to disease classifiers based on conventional machine learning and neural networks. In this chapter, we summarize widely used, state-of-the-art computational approaches for applying multi-omics data to the study of human health and disease, including bioinformatics methods and tools for genomics, transcriptomics, metagenomics, and single-cell data, as well as machine learning algorithms and strategies.


References

  • A, S, et al. Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes. Oncogene. 2016;35(19)

    Google Scholar 

  • A, W.E, et al. Iterative rank-order normalization of gene expression microarray data. BMC Bioinformatics. 2013b;14:1.

    Google Scholar 

  • Aanes H, et al. Normalization of RNA-sequencing data from samples with varying mRNA levels. PLoS One. 2014;9(2):e89158.

    Article  Google Scholar 

  • Abyzov A, Gerstein M. AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision. Bioinformatics. 2011;27(5):595–603.

    Article  CAS  Google Scholar 

  • Abyzov A, et al. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 2011;21(6):974–84.

    Article  CAS  Google Scholar 

  • Amir A, et al. Deblur rapidly resolves single-nucleotide community sequence patterns. mSystems. 2017;2:2.

    Article  Google Scholar 

  • André FN, C.T. A. Pre-mRNA splicing and human disease. Genes Dev. 2003;17(4)

    Google Scholar 

  • Armour CR, et al. A metagenomic meta-analysis reveals functional signatures of health and disease in the human gut microbiome. mSystems. 2019;4:4.

    Article  Google Scholar 

  • Asshauer KP, et al. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics. 2015;31(17):2882–4.

    Article  CAS  Google Scholar 

  • Bajaj JS, et al. Linkage of gut microbiome with cognition in hepatic encephalopathy. Am J Physiol Gastrointest Liver Physiol. 2012;302(1):G168-75.

    Article  Google Scholar 

  • Bankevich A, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–77.

    Article  CAS  Google Scholar 

  • Baralle FE, Jimena G. Alternative splicing as a regulator of development and tissue identity. Nat Rev Mol Cell Biol. 2017;18:7.

    Article  Google Scholar 

  • Batzoglou S, et al. ARACHNE: a whole-genome shotgun assembler. Genome Res. 2002;12(1):177–89.

    CAS  Google Scholar 

  • Bisanz JE, et al. Meta-analysis reveals reproducible gut microbiome alterations in response to a high-fat diet. Cell Host Microbe. 2019;26(2):265–72. e4

    Article  CAS  Google Scholar 

  • Blaser MJ, et al. Toward a predictive understanding of Earth's microbiomes to address 21st century challenges. MBio. 2016;7:3.

    Article  Google Scholar 

  • Bolyen E, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2 (vol 37, pg 852, 2019). Nat Biotechnol. 2019;37(9):1091.

    Article  CAS  Google Scholar 

  • Brandler WM, et al. Frequency and complexity of de novo structural mutation in autism. Am J Hum Genet. 2016;98(4):667–79.

    Article  CAS  Google Scholar 

  • Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

    Article  Google Scholar 

  • Buttigieg PL, et al. The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation. J Biomed Semantics. 2016;7(1):57.

    Article  Google Scholar 

  • Callahan BJ, McMurdie PJ, Holmes SP. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 2017;11(12):2639–43.

    Article  Google Scholar 

  • Callahan BJ, et al. DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13(7):581–3.

    Article  CAS  Google Scholar 

  • Cammarota G, et al. Gut microbiome, big data and machine learning to promote precision medicine for cancer. Nat Rev Gastroenterol Hepatol. 2020;

    Google Scholar 

  • Caporaso JG, et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7(5):335–6.

    Article  CAS  Google Scholar 

  • Carl P, et al. Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data. BMC Bioinformatics. 2008;9:1.

    Google Scholar 

  • Carvalho CM, Lupski JR. Mechanisms underlying structural variant formation in genomic disorders. Nat Rev Genet. 2016;17(4):224–38.

    Article  CAS  Google Scholar 

  • Chen IA, et al. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res. 2019;47(D1):D666–77.

    Article  CAS  Google Scholar 

  • Chen W, et al. Map** translocation breakpoints by next-generation sequencing. Genome Res. 2008;18(7):1143–9.

    Article  CAS  Google Scholar 

  • Chen Y, et al. Parallel-meta suite: interactive and rapid microbiome data analysis on multiple platforms. iMeta. 2022;1(1):e1.

    Article  Google Scholar 

  • Climente-González H, et al. The functional impact of alternative splicing in cancer. Cell Rep. 2017;20:9.

    Article  Google Scholar 

  • Cole T, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and cufflinks. Nat Protoc. 2012;7:3.

    Google Scholar 

  • Comin M, et al. Comparison of microbiome samples: methods and computational challenges. Brief Bioinform. 2020;

    Google Scholar 

  • Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.

    Article  Google Scholar 

  • Costea PI, et al. Towards standards for human fecal sample processing in metagenomic studies. Nat Biotechnol. 2017;35(11):1069–76.

    Article  CAS  Google Scholar 

  • D, R.M., M.D. J, and S.G. K, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics Oxford, England, 2010. 26(1).

    Google Scholar 

  • Deng Y, et al. A hierarchical fused fuzzy deep neural network for data classification. IEEE Trans Fuzzy Syst. 2016;25(4):1006–12.

    Article  Google Scholar 

  • Di W, et al. The use of miRNA microarrays for the analysis of cancer samples with global miRNA decrease, vol. 19. New York, N.Y: RNA; 2013. p. 7.

    Google Scholar 

  • Douglas GM, et al. PICRUSt2 for prediction of metagenome functions. Nat Biotechnol. 2020;38(6):685–8.

    Article  CAS  Google Scholar 

  • Duvallet C, et al. Meta-analysis of gut microbiome studies identifies disease-specific and shared responses. Nat Commun. 2017a;8(1):1784.

    Article  Google Scholar 

  • Duvallet C, et al. Meta-analysis of gut microbiome studies identifies disease-specific and shared responses. Nat Commun. 2017b;8(1):1–10.

    Article  CAS  Google Scholar 

  • Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–1.

    Article  CAS  Google Scholar 

  • Edgar RC. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat Methods. 2013;10(10):996–8.

    Article  CAS  Google Scholar 

  • Edgar RC. UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing. bioRxiv. 2016:081257.

    Google Scholar 

  • Edgar RC. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences. PeerJ. 2018;6:e4652.

    Article  Google Scholar 

  • Eglė J, Arvydas K. Alternative splicing and hypoxia puzzle in Alzheimer’s and Parkinson’s diseases. Genes. 2021;12:8.

    Google Scholar 

  • Elena B, et al. rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. GigaScience. 2019;8:9.

    Google Scholar 

  • English AC, et al. Assessing structural variation in a personal genome—towards a human reference diploid genome. BMC Genomics. 2015;16(1):1–15.

    Article  CAS  Google Scholar 

  • Fatih O, M.P. M. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet. 2011;12:2.

    Google Scholar 

  • Ferlaino M, et al. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome. BMC Bioinformatics. 2017;18(1):1–8.

    Article  Google Scholar 

  • Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet. 2006;7(2):85–97.

    Article  CAS  Google Scholar 

  • Forslund K, et al. Disentangling type 2 diabetes and metformin treatment signatures in the human gut microbiota. Nature. 2015;528(7581):262–6.

    Article  CAS  Google Scholar 

  • Franzosa EA, et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods. 2018;15(11):962–8.

    Article  CAS  Google Scholar 

  • Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001:1189–232.

    Google Scholar 

  • Friedman JH. Stochastic gradient boosting. Comput Stat Data Anal. 2002;38(4):367–78.

    Article  Google Scholar 

  • Gevers D, et al. The treatment-naive microbiome in new-onset Crohn's disease. Cell Host Microbe. 2014;15(3):382–92.

    Article  CAS  Google Scholar 

  • Giuseppe B, et al. Alternative splicing in Alzheimer's disease. Aging Clin Exp Res. 2019;33(4)

    Google Scholar 

  • Glasmachers T. Limits of end-to-end learning. In: Min-Ling Z, Yung-Kyun N, editors. Proceedings of the ninth Asian conference on machine learning; 2017., PMLR: Proceedings of Machine Learning Research. p. 17–32.

    Google Scholar 

  • Gonzalez A, et al. Qiita: rapid, web-enabled microbiome meta-analysis. Nat Methods. 2018;15(10):796–8.

    Article  CAS  Google Scholar 

  • Gonzalez-Garay ML. The road from next-generation sequencing to personalized medicine. Pers Med. 2014;11(5):523–44.

    Article  CAS  Google Scholar 

  • Gu J, et al. Recent advances in convolutional neural networks. Pattern Recogn. 2018;77:354–77.

    Article  Google Scholar 

  • Gu W, et al. SVLR: genome structural variant detection using long-read sequencing data. J Comput Biol. 2021;

    Google Scholar 

  • H, S.M. et al., Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics Oxford, England, 2012. 28(8).

    Google Scholar 

  • Hacquard S, et al. Microbiota and host nutrition across plant and animal kingdoms. Cell Host Microbe. 2015;17(5):603–16.

    Article  CAS  Google Scholar 

  • Halfvarson J, et al. Dynamics of the human gut microbiome in inflammatory bowel disease. Nat Microbiol. 2017;2:17004.

    Article  CAS  Google Scholar 

  • Harrison PW, et al. The European nucleotide archive in 2018. Nucleic Acids Res. 2019;47(D1):D84–8.

    Article  CAS  Google Scholar 

  • Hedges DJ, et al. Evidence of novel fine-scale structural variation at autism spectrum disorder candidate loci. Mol Autism. 2012;3(1):1–11.

    Article  Google Scholar 

  • Heller D, Vingron M. SVIM: structural variant identification using mapped long reads. Bioinformatics. 2019;35(17):2907–15.

    Article  CAS  Google Scholar 

  • Hillmann B, et al. Evaluating the information content of shallow shotgun metagenomics. Msystems. 2018;3:6.

    Article  Google Scholar 

  • Hood L, Rowen L. The human genome project: big science transforms biology and medicine. Genome Med. 2013;5(9):1–8.

    Article  Google Scholar 

  • Huang S, et al. Predictive modeling of gingivitis severity and susceptibility via oral microbiota. ISME J. 2014;8(9):1768–80.

    Article  Google Scholar 

  • Huang S, et al. Longitudinal multi-omics and microbiome meta-analysis identify an asymptomatic gingival state that links gingivitis, Periodontitis, and Aging. mBio. 2021;12:2.

    Article  Google Scholar 

  • Huang X, Madan A. CAP3: a DNA sequence assembly program. Genome Res. 1999;9(9):868–77.

    Article  CAS  Google Scholar 

  • Huiling X, et al. Using generalized procrustes analysis (GPA) for normalization of cDNA microarray data. BMC Bioinformatics. 2008;9:1.

    Google Scholar 

  • Huson DH, Reinert K, Myers EW. The greedy path-merging algorithm for contig scaffolding. J ACM (JACM). 2002;49(5):603–15.

    Article  Google Scholar 

  • J, H.T, K.K. A. baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics. 2010;11:1.

    Google Scholar 

  • Jiang H, Zhong F, and Zhu B. Filling scaffolds with gene repetitions: maximizing the number of adjacencies. in Annual Symposium on Combinatorial Pattern Matching. 2011. Springer.

    Google Scholar 

  • Jiang H, et al. Scaffold filling under the breakpoint distance. in RECOMB International Workshop on Comparative Genomics. Springer. 2010.

    Google Scholar 

  • Jiang H, et al. Scaffold filling under the breakpoint and related distances. IEEE/ACM Trans Comput Biol Bioinform. 2012;9(4):1220–9.

    Article  Google Scholar 

  • ** Z, et al. MultiTrans: an algorithm for path extraction through mixed integer linear programming for transcriptome assembly. IEEE/ACM transactions on computational biology and bioinformatics, 2021. PP.

    Google Scholar 

  • **g G, et al. Parallel-META 3: comprehensive taxonomical and functional analysis platform for efficient comparison of microbial communities. Sci Rep. 2017;7:40371.

    Article  CAS  Google Scholar 

  • **g G, et al. Dynamic meta-storms enables comprehensive taxonomic and phylogenetic comparison of shotgun metagenomes at the species level. Bioinformatics. 2019;

    Google Scholar 

  • **g G, et al. Microbiome search engine 2: a platform for taxonomic and functional search of global microbiomes on the whole-microbiome level. mSystems. 2021a;6:1.

    Article  Google Scholar 

  • **g G, et al. Meta-apo improves accuracy of 16S-amplicon-based prediction of microbiome function. BMC Genomics. 2021b;22(1):9.

    Article  CAS  Google Scholar 

  • Johnson JS, et al. Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis. Nat Commun. 2019;10(1):5029.

    Article  Google Scholar 

  • Jones MB, et al. Library preparation methodology can influence genomic and functional predictions in human microbiome research. Proc Natl Acad Sci U S A. 2015;112(45):14024–9.

    Article  CAS  Google Scholar 

  • Juntao L, et al. TransComb: genome-guided transcriptome assembly via combing junctions in splicing graphs. Genome Biol. 2016a;17:1.

    Google Scholar 

  • Juntao L, et al. BinPacker: packing-based De novo transcriptome assembly from RNA-seq data. PLoS Comput Biol. 2016b;12:2.

    Google Scholar 

  • Juntao L, et al. TransLiG: a de novo transcriptome assembler that uses line graph iteration. Genome Biol. 2019;20:1.

    Google Scholar 

  • Kelemen O, et al. Function of alternative splicing. Gene. 2013;514:1.

    Article  CAS  Google Scholar 

  • Kleftogiannis D, et al. Identification of single nucleotide variants using position-specific error estimation in deep sequencing data. BMC Med Genet. 2019;12(1):1–12.

    CAS  Google Scholar 

  • Kleinbaum DG, et al. Logistic regression. Springer; 2002.

    Google Scholar 

  • Knight R, et al. Best practices for analysing microbiomes. Nat Rev Microbiol. 2018;16(7):410–22.

    Article  CAS  Google Scholar 

  • Kodama Y, et al. The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res. 2012;40(Database issue):D54–6.

    Article  CAS  Google Scholar 

  • Koren S, et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27(5):722–36.

    Article  CAS  Google Scholar 

  • Lander ES. Initial impact of the sequencing of the human genome. Nature. 2011;470(7333):187–97.

    Article  CAS  Google Scholar 

  • Langille MG, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013;31(9):814–21.

    Article  CAS  Google Scholar 

  • LaPierre N, et al. MetaPheno: a critical evaluation of deep learning and machine learning in metagenome-based disease prediction. Methods. 2019;166:74–82.

    Article  CAS  Google Scholar 

  • Lasse M, Andreas SJ, Anders K. Bayesian transcriptome assembly. Genome Biol. 2014;15:10.

    Google Scholar 

  • Li H. Minimap and miniasm: fast map** and de novo assembly for noisy long sequences. Bioinformatics. 2015;32(14)

    Google Scholar 

  • Li J, Tibshirani R. Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-Seq data. Stat Methods Med Res. 2013;22:5.

    Article  Google Scholar 

  • Lin T, et al. Label-free, rapid and quantitative phenoty** of stress response in E. coli via ramanome. Sci Rep. 2016;6:267.

    Google Scholar 

  • Liu N, et al. An improved approximation algorithm for scaffold filling to maximize the common adjacencies. IEEE/ACM Trans Comput Biol Bioinform. 2013;10(4):905–13.

    Article  Google Scholar 

  • Lixin C, et al. CrossNorm: a novel normalization strategy for microarray data in cancers. Sci Rep. 2016a;6:1.

    Google Scholar 

  • Lixin C, et al. ICN: a normalization method for gene expression data considering the over-expression of informative genes. Mol BioSyst. 2016b;12:10.

    Google Scholar 

  • Lo C, Marculescu R. MetaNN: accurate classification of host phenotypes from metagenomic data using neural networks. Bmc Bioinformatics. 2019;20(12):314.

    Article  Google Scholar 

  • Lovén J, et al. Revisiting global gene expression analysis. Cell. 2012;151:3.

    Article  Google Scholar 

  • Lozupone CA, et al. Meta-analyses of studies of the human microbiota. Genome Res. 2013;23(10):1704–14.

    Article  CAS  Google Scholar 

  • Lu J and Salzberg SL, Ultrafast and accurate 16S microbial community analysis using Kraken 2. bioRxiv, 2020: p. 2020.03.27.012047.

    Google Scholar 

  • Luo R, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1(1):2047-217X-1-18.

    Article  Google Scholar 

  • M. M.A, et al. iReckon: simultaneous isoform discovery and abundance estimation from RNA-seq data. Genome Res. 2013a;23:3.

    Google Scholar 

  • Ma J and Jiang H. Notes on the 6/5-Approximation Algorithm for One-Sided Scaffold Filling. in International Workshop on Frontiers in Algorithmics. Springer; 2016.

    Google Scholar 

  • Ma J, et al. On the solution bound of two-sided scaffold filling. Theor Comput Sci. 2021;873:47–63.

    Article  Google Scholar 

  • Macintyre G, Ylstra B, Brenton JD. Sequencing structural variants in cancer for precision therapeutics. Trends Genet. 2016;32(9):530–42.

    Article  CAS  Google Scholar 

  • Mackeh R, et al. Single-nucleotide variations of the human nuclear hormone receptor genes in 60,000 individuals. J Endocr Soc. 2018;2(1):77–90.

    Article  CAS  Google Scholar 

  • McDonald D, et al. American gut: an open platform for citizen science microbiome research. mSystems. 2018a;3:3.

    Article  Google Scholar 

  • McDonald D, et al. Striped UniFrac: enabling microbiome analysis at unprecedented scale. Nat Methods. 2018b;15(11):847–8.

    Article  CAS  Google Scholar 

  • McDonald D, et al. Redbiom: a rapid sample discovery and feature characterization system. mSystems. 2019;4(4)

    Google Scholar 

  • Meng Z, et al. Analysis of long noncoding RNAs highlights region-specific altered expression patterns and diagnostic roles in Alzheimer's disease. Brief Bioinform. 2019;20:2.

    Google Scholar 

  • Meyer F, et al. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics. 2008;9:386.

    Article  CAS  Google Scholar 

  • Mihaela P, et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33:3.

    Google Scholar 

  • Mingfu S, Carl K. Accurate assembly of transcripts through phase-preserving graph decomposition. Nat Biotechnol. 2017;35:12.

    Google Scholar 

  • Mitchell G, et al. Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat Biotechnol. 2010;28:5.

    Google Scholar 

  • Mo C, M.J. L. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat Rev Mol Cell Biol. 2009;10:11.

    Google Scholar 

  • Moritz A, et al. SplicingCompass: differential splicing detection using RNA-seq data, vol. 29. Oxford, England: Bioinformatics; 2013. p. 9.

    Google Scholar 

  • Mou L, Ghamisi P, Zhu XX. Deep recurrent neural networks for hyperspectral image classification. IEEE Trans Geosci Remote Sens. 2017;55(7):3639–55.

    Article  Google Scholar 

  • Muoz A, et al. Scaffold filling contig fusion and gene order comparison. BMC Bioinformatics. 2010;11:304.

    Article  Google Scholar 

  • Nakagawa H, Fujita M. Whole genome sequencing analysis for cancer genomics and precision medicine. Cancer Sci. 2018;109(3):513–22.

    Article  CAS  Google Scholar 

  • Nalbantoglu U, et al. Large direct repeats flank genomic rearrangements between a new clinical isolate of Francisella tularensis subsp. tularensis A1 and Schu S4. PLoS One. 2010;5(2):e9007.

    Article  Google Scholar 

  • Namkung J. Machine learning methods for microbiome studies. J Microbiol. 2020;58(3):206–16.

    Article  Google Scholar 

  • Norris AL, et al. Nanopore sequencing detects structural variants in cancer. Cancer Biol Ther. 2016;17(3):246–53.

    Article  CAS  Google Scholar 

  • Ozery-Flato M, Shamir R. Sorting cancer karyotypes by elementary operations. J Comput Biol. 2009;16(10):1445–60.

    Article  CAS  Google Scholar 

  • Pasolli E, et al. Machine learning meta-analysis of large metagenomic datasets: tools and biological insights. PLoS Comput Biol. 2016;12(7):e1004977.

    Article  Google Scholar 

  • Peng L, et al. Integrative analysis with ChIP-seq advances the limits of transcript quantification from RNA-seq. Genome Res. 2016;26:8.

    Google Scholar 

  • Peng Y, et al. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28(11):1420–8.

    Article  CAS  Google Scholar 

  • Peterson LE. K-nearest neighbor. Scholarpedia. 2009;4(2):1883.

    Article  Google Scholar 

  • Philipp D, et al. Accurate detection of differential RNA processing. Nucleic Acids Res. 2013;41:10.

    Google Scholar 

  • Piazza A, Heyer W-D. Homologous recombination and the formation of complex genomic rearrangements. Trends Cell Biol. 2019;29(2):135–49.

    Article  CAS  Google Scholar 

  • Poirion O, et al. Using single nucleotide variations in single-cell RNA-seq to identify subpopulations and genotype-phenotype linkage. Nat Commun. 2018;9(1):1–13.

    Article  CAS  Google Scholar 

  • Polikar R. Ensemble learning, in Ensemble machine learning. 2012, Springer. p. 1–34.

    Google Scholar 

  • Poore GD, et al. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature. 2020;579(7800):567–74.

    Article  CAS  Google Scholar 

  • Pop M. Genome assembly reborn: recent computational challenges. Brief Bioinform. 2009;10(4):354–66.

    Article  CAS  Google Scholar 

  • Pouyanfar S, et al. A survey on deep learning: algorithms, techniques, and applications. ACM Computing Surveys (CSUR). 2018;51(5):1–36.

    Article  Google Scholar 

  • Proctor LM, et al. The integrative human microbiome project. Nature. 2019;569(7758):641–8.

    Article  Google Scholar 

  • Qi F, et al. Improved probe selection for DNA arrays using nonparametric kernel density estimation. Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2005. 2006.

    Google Scholar 

  • Qiang et al. Structural variation in amyloid-beta fibrils from Alzheimer's disease clinical subtypes. Nature, 2017.

    Google Scholar 

  • Qin J, et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464(7285):59–65.

    Article  CAS  Google Scholar 

  • Randal J. The human genome project. Lancet. 1991;334(8678):1535–6.

    Google Scholar 

  • Rasko DA, et al. Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation. Proc Natl Acad Sci. 2011;108(12):5027–32.

    Article  CAS  Google Scholar 

  • Ratan A, et al. Identification of indels in next-generation sequencing data. BMC Bioinformatics. 2015;16(1):1–8.

    Article  Google Scholar 

  • Ricotta C, Podani J. On some properties of the Bray-Curtis dissimilarity and their ecological meaning. Ecol Complex. 2017;31:201–5.

    Article  Google Scholar 

  • Rognes T, et al. VSEARCH: a versatile open source tool for metagenomics. PeerJ. 2016;4:e2584.

    Article  Google Scholar 

  • Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 2020;17(2):155–8.

    Article  CAS  Google Scholar 

  • Ruder S., An overview of gradient descent optimization algorithms. ar**v preprint ar**v:1609.04747, 2016.

    Google Scholar 

  • Ruolin L, L.A. E, D.J. A. Comparisons of computational methods for differential alternative splicing detection using RNA-seq in plant systems. BMC Bioinformatics. 2014;15:1.

    Google Scholar 

  • Sam K, et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:1.

    CAS  Google Scholar 

  • Sanchis-Juan A, et al. Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short-and long-read genome sequencing. Genome Med. 2018;10(1):1–10.

    Article  Google Scholar 

  • Sankoff D, et al. Gene order comparisons for phylogenetic inference: evolution of the mitochondrial genome. Proc Natl Acad Sci. 1992;89(14):6575–9.

    Article  CAS  Google Scholar 

  • Schloss PD, et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 2009;75(23):7537–41.

    Article  CAS  Google Scholar 

  • Scholz M, et al. Strain-level microbial epidemiology and population genomics from shotgun metagenomics. Nat Methods. 2016;13(5):435–8.

    Article  CAS  Google Scholar 

  • Sedlazeck FJ, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15(6):461–8.

    Article  CAS  Google Scholar 

  • Segata N, et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12(6):R60.

    Article  Google Scholar 

  • Segata N, et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods. 2012;9(8):811–4.

    Article  CAS  Google Scholar 

  • Sharma D, Paterson AD, Xu W. TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction. Bioinformatics. 2020;

    Google Scholar 

  • Shi W, et al. gcMeta: a global catalogue of metagenomics platform to support the archiving, standardization and analysis of microbiome data. Nucleic Acids Res. 2019;47(D1):D637–48.

    Article  CAS  Google Scholar 

  • Shihao, S., et al., rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc Natl Acad Sci U S A, 2014. 11151.

    Google Scholar 

  • Simon A, Alejandro R, Wolfgang H. Detecting differential usage of exons from RNA-seq data. Genome Res. 2012;22:10.

    Google Scholar 

  • Sindi S, et al. A geometric approach for classification and comparison of structural variants. Bioinformatics. 2009;25(12):i222–30.

    Article  CAS  Google Scholar 

  • Song B, et al. MetaSee: an interactive and extendable visualization toolbox for metagenomic sample analysis and comparison. PLoS One. 2017:7, 11.

    Google Scholar 

  • Song K, Wright F, Zhou Y-H. Systematic comparisons for composition profiles, taxonomic levels, and machine learning methods for microbiome-based disease prediction. Front Mol Biosci. 2020;7:423.

    Article  Google Scholar 

  • Sonia T, et al. NOIseq: a RNA-seq differential expression method robust for sequencing depth biases. EMBnetjournal. 2012;17(B)

    Google Scholar 

  • Stefan C, et al. CIDANE: comprehensive isoform discovery and abundance estimation. Genome Biol. 2016;17:1.

    Google Scholar 

  • Su X, et al. GPU-meta-storms: computing the structure similarities among massive amount of microbial community samples using GPU. Bioinformatics. 2014;30(7):1031–3.

    Article  CAS  Google Scholar 

  • Su X, et al. Identifying and predicting novelty in microbiome studies. MBio. 2018;9:6.

    Article  Google Scholar 

  • Su X, et al. Method development for cross-study microbiome data mining: challenges and opportunities. Comput Struct Biotechnol J. 2020a;

    Google Scholar 

  • Su X, et al. Multiple-disease detection and classification across cohorts via microbiome search. Msystems. 2020b;5:2.

    Article  Google Scholar 

  • Sudmant PH, et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526(7571):75–81.

    Article  CAS  Google Scholar 

  • Sunagawa S, et al. Metagenomic species profiling using universal phylogenetic marker genes. Nat Methods. 2013;10(12):1196.

    Article  CAS  Google Scholar 

  • Ten Hoopen P, et al. The metagenomic data life-cycle: standards and best practices. Gigascience. 2017;6(8):1–11.

    Google Scholar 

  • Thompson LR, et al. A communal catalogue reveals Earth's multiscale microbial diversity. Nature. 2017;551(7681):457–63.

    Article  CAS  Google Scholar 

  • Ting Y, et al., TransRef enables accurate transcriptome assembly by redefining accurate neo-splicing graphs. Briefings in bioinformatics, 2021.

    Google Scholar 

  • Topçuoğlu BD, et al. A framework for effective application of machine learning to microbiome-based classification problems. MBio. 2020;11:3.

    Article  Google Scholar 

  • Truong DT, et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat Methods. 2015;12(10):902–3.

    Article  CAS  Google Scholar 

  • Uritskiy GV, DiRuggiero J, Taylor J. MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome. 2018;6(1):158.

  • Vangay P, Hillmann BM, Knights D. Microbiome learning repo (ML Repo): a public repository of microbiome regression and classification tasks. Gigascience. 2019;8:5.

  • Venter JC, et al. The sequence of the human genome. Science. 2001;291(5507):1304–51.

  • Vezzi F, Cattonaro F, Policriti A. E-RGA: enhanced reference guided assembly of complex genomes. EMBnet J. 2011;17(1):46–54.

  • Voigt AY, et al. Temporal and technical variability of human gut metagenomes. Genome Biol. 2015;16:73.

  • Wang W, et al. Identifying differentially spliced genes from two groups of RNA-seq samples. Gene. 2013;518:1.

  • Wenger AM, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37(10):1155–62.

  • Wen-Ping H, et al. Kernel density weighted loess normalization improves the performance of detection within asymmetrical data. BMC Bioinformatics. 2011;12:1.

  • Wirbel J, et al. Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer. Nat Med. 2019;25(4):679.

  • Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15(3):R46.

  • Wu S, et al. GMrepo: a database of curated and consistently annotated human gut metagenomes. Nucleic Acids Res. 2020;48(D1):D545–53.

  • Xi W, C.M. J. SeqGSEA: a Bioconductor package for gene set enrichment analysis of RNA-Seq data integrating differential expression and splicing. Bioinformatics. 2014;30:12.

  • Xiao L, Zhang F, Zhao F. Large-scale microbiome data integration enables robust biomarker identification. Nat Comput Sci. 2022;2(5):307–16.

  • Yarza P, et al. Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences. Nat Rev Microbiol. 2014;12(9):635–45.

  • Ye SH, et al. Benchmarking metagenomics tools for taxonomic classification. Cell. 2019;178(4):779–94.

  • Yilmaz P, et al. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat Biotechnol. 2011;29(5):415–20.

  • Yinlong X, et al. SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics. 2014;30(12).

  • Yu P, et al. IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels. Bioinformatics. 2013;29(13).

  • Zhang T, et al. MPD: a pathogen genome and metagenome database. Database (Oxford). 2018;2018.

  • Zhou Q, Su X, Ning K. Assessment of quality control approaches for metagenomic data analysis. Sci Rep. 2014;4:6957.

  • Zhou Q, et al. RNA-QC-chain: comprehensive and fast quality control for RNA-Seq data. BMC Genomics. 2018;19(1):144.

  • Zhou Z-H. Ensemble learning. Encyclopedia of Biometrics. 2009;1:270–3.

Author information

Corresponding author

Correspondence to Xiaoquan Su.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Zhao, J., Zhang, S., Wu, S., Zhang, W., Su, X. (2023). Current Progress of Bioinformatics for Human Health. In: Ning, K. (eds) Methodologies of Multi-Omics Data Integration and Data Mining. Translational Bioinformatics, vol 19. Springer, Singapore. https://doi.org/10.1007/978-981-19-8210-1_8
