  1. Article

    Open Access

    Unsupervised representation learning on high-dimensional clinical data improves genomic discovery and prediction

    Although high-dimensional clinical data (HDCD) are increasingly available in biobank-scale datasets, their use for genetic discovery remains challenging. Here we introduce an unsupervised deep learning model, ...

    Taedong Yun, Justin Cosentino, Babak Behsaz, Zachary R. McCaw in Nature Genetics (2024)

  2. No Access

    Chapter and Conference Paper

    Multimodal LLMs for Health Grounded in Individual-Specific Data

    Foundation large language models (LLMs) have shown an impressive ability to solve tasks across a wide range of fields including health. To effectively solve personalized health tasks, LLMs need the ability to ...

    Anastasiya Belyaeva, Justin Cosentino in Machine Learning for Multimodal Healthcare… (2024)

  3. No Access

    Article

    Inference of chronic obstructive pulmonary disease with deep learning on raw spirograms identifies new genetic loci and improves risk models

    Chronic obstructive pulmonary disease (COPD), the third leading cause of death worldwide, is highly heritable. While COPD is clinically defined by applying thresholds to summary measures of lung function, a qu...

    Justin Cosentino, Babak Behsaz, Babak Alipanahi, Zachary R. McCaw in Nature Genetics (2023)

  4. No Access

    Article

    DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer

    Circular consensus sequencing with Pacific Biosciences (PacBio) technology generates long (10–25 kilobases), accurate ‘HiFi’ reads by combining serial observations of a DNA molecule into a consensus sequence. ...

    Gunjan Baid, Daniel E. Cook, Kishwar Shafin, Taedong Yun in Nature Biotechnology (2023)

  5. Article

    Open Access

    DeepNull models non-linear covariate effects to improve phenotypic prediction and association power

    Genome-wide association studies (GWASs) examine the association between genotype and phenotype while adjusting for a set of covariates. Although the covariates may have non-linear or interactive effects, due t...

    Zachary R. McCaw, Thomas Colthurst, Taedong Yun in Nature Communications (2022)

  6. Article

    Open Access

    A population-specific reference panel for improved genotype imputation in African Americans

    There is currently a dearth of accessible whole genome sequencing (WGS) data for individuals residing in the Americas with Sub-Saharan African ancestry. We generated whole genome sequencing data at intermediat...

    Jared O’Connell, Taedong Yun, Meghan Moreno, Helen Li in Communications Biology (2021)

  7. No Access

    Article

    An open resource for accurately benchmarking small variant and reference calls

    Benchmark small variant calls are required for developing, optimizing and assessing the performance of sequencing and bioinformatics methods. Here, as part of the Genome in a Bottle (GIAB) Consortium, we apply...

    Justin M. Zook, Jennifer McDaniel, Nathan D. Olson, Justin Wagner in Nature Biotechnology (2019)

  8. No Access

    Article

    A universal SNP and small-indel variant caller using deep neural networks

    DeepVariant uses convolutional neural networks to improve the accuracy of variant calling.

    Ryan Poplin, Pi-Chuan Chang, David Alexander, Scott Schwartz in Nature Biotechnology (2018)

  9. No Access

    Article

    Human-specific loss of regulatory DNA and the evolution of human-specific traits

    A computational survey of the human genome has identified more than 500 human-specific genomic deletions that remove sequences that are highly conserved between chimpanzees and other animals. These are genomic...

    Cory Y. McLean, Philip L. Reno, Alex A. Pollen, Abraham I. Bassan in Nature (2011)

  10. No Access

    Article

    GREAT improves functional interpretation of cis-regulatory regions

    ChIP-Seq data are usually analyzed with approaches developed for microarrays, which only consider binding events within a few kilobases of a gene. McLean et al. present an algorithm that takes into account more d...

    Cory Y McLean, Dave Bristor, Michael Hiller, Shoa L Clarke in Nature Biotechnology (2010)