Abstract
Genome-wide association studies (GWASs) and other high-throughput initiatives have led to an information explosion in human genetics and genetic epidemiology. Converting this wealth of new information about genomic variation into knowledge about public health and human biology will depend critically on the complexity of the genotype-to-phenotype mapping relationship. We review here computational approaches to genetic analysis that embrace, rather than ignore, the complexity of human health. We focus on multifactor dimensionality reduction (MDR) as an approach for modeling one of these complexities: epistasis, or gene–gene interaction.
Copyright information
© 2013 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Pan, Q., Hu, T., Moore, J.H. (2013). Epistasis, Complexity, and Multifactor Dimensionality Reduction. In: Gondro, C., van der Werf, J., Hayes, B. (eds) Genome-Wide Association Studies and Genomic Prediction. Methods in Molecular Biology, vol 1019. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-447-0_22
DOI: https://doi.org/10.1007/978-1-62703-447-0_22
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-446-3
Online ISBN: 978-1-62703-447-0
eBook Packages: Springer Protocols