Identification and Validation of Candidate Genes from Genome-Wide Association Studies

Genome-Wide Association Studies

Part of the book series: Methods in Molecular Biology (MIMB, volume 2481)


Abstract

Turning the statistical associations produced by a GWAS experiment into identified and validated candidate genes can be difficult and time consuming. Bridging the gap between identifying candidate genes and functionally validating their effect on trait performance requires prioritizing the variants that underlie the GWAS-associated regions. In parallel, recent developments in genomics and statistical methods, achieved notably in human genetics, are being adopted in plant breeding to study the genetic architecture of traits and sustain genetic gains. In this chapter, we present the theoretical and practical aspects of three main options for identifying and validating candidate genes from a GWAS experiment: (1) MetaGWAS analysis, (2) statistical fine mapping, and (3) the integration of functional data.
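As an illustration of the first option, the sketch below combines per-study summary statistics for a single variant with a fixed-effects, inverse-variance-weighted meta-analysis. This is one common way to run a GWAS meta-analysis, not necessarily the procedure the chapter follows; the function name `ivw_meta` and the input values are hypothetical, and the sketch assumes the effect sizes are already aligned to the same allele and measured on the same scale across studies.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted meta-analysis for one variant.

    betas: per-study effect estimates (same effect allele, same scale).
    ses:   per-study standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical summary statistics for one variant in two studies.
beta, se, z, p = ivw_meta([0.2, 0.2], [0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.4g}")
```

A fixed-effects model assumes the variant has the same true effect in every study; when populations or environments differ markedly, a random-effects model or a heterogeneity test may be more appropriate.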




Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol


Cite this protocol

Albert, E., Sauvage, C. (2022). Identification and Validation of Candidate Genes from Genome-Wide Association Studies. In: Torkamaneh, D., Belzile, F. (eds) Genome-Wide Association Studies. Methods in Molecular Biology, vol 2481. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2237-7_15


  • DOI: https://doi.org/10.1007/978-1-0716-2237-7_15


  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2236-0

  • Online ISBN: 978-1-0716-2237-7

  • eBook Packages: Springer Protocols
