Abstract
Massive biological data provide a broad view of the dynamics of human health and disease from multiple aspects. Over the past decade, a tremendous volume of biological data has been produced in different ways. How to analyze these high-volume data precisely and efficiently, and how to take advantage of them, has become one of the key bottlenecks for precision medicine. Newly developed bioinformatics tools bring opportunities to address these challenges, from sequence-based algorithms such as genome assembly and genome comparison to disease classifiers built on conventional machine learning and neural networks. In this chapter, we summarize widely used, state-of-the-art computational approaches for multi-omics data in the study of human health and disease, including bioinformatics methods and tools for genomics, transcriptomics, metagenomics, and single-cell data, as well as machine learning algorithms and strategies.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Zhao, J., Zhang, S., Wu, S., Zhang, W., Su, X. (2023). Current Progress of Bioinformatics for Human Health. In: Ning, K. (eds) Methodologies of Multi-Omics Data Integration and Data Mining. Translational Bioinformatics, vol 19. Springer, Singapore. https://doi.org/10.1007/978-981-19-8210-1_8
DOI: https://doi.org/10.1007/978-981-19-8210-1_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8209-5
Online ISBN: 978-981-19-8210-1
eBook Packages: Biomedical and Life Sciences (R0)