Abstract
Some methods aim to correct or test for relationships, or to reconstruct the pedigree (family tree). We show that these methods cannot resolve ties between candidate relationships because the pedigree likelihood, which is the probability of inheriting the data under the pedigree model, is non-identifiable. This means that no likelihood-based method can produce a correct pedigree inference with high probability. This lack of reliability is critical for both health and forensics applications.
Pedigree inference methods use a structured machine learning approach where the objective is to find the pedigree graph that maximizes the likelihood. Known pedigrees are useful for both association and linkage analysis, which aim to find the regions of the genome that are associated with the presence or absence of a particular disease. Consequently, errors in pedigree prediction have dramatic effects on downstream analysis.
In this paper we present the first discussion of multiple typed individuals in non-isomorphic pedigrees, \(\mathcal{P}\) and \(\mathcal{Q}\), where the likelihoods are non-identifiable, \(Pr[G~|~\mathcal{P},\theta] = Pr[G~|~\mathcal{Q},\theta]\), for all input data \(G\) and all recombination rate parameters \(\theta\). While there were previously known non-identifiable pairs, we give an example having data for multiple individuals.
Additionally, we provide a deeper understanding of the general discrete structures driving these non-identifiability examples, as well as results to guide algorithms that wish to examine only identifiable pedigrees. This paper introduces a general criterion for establishing whether a pair of pedigrees is non-identifiable, and two easy-to-compute criteria guaranteeing identifiability. Finally, we suggest a method for dealing with non-identifiable likelihoods: use Bayes' rule to obtain the posterior from the likelihood and prior. We propose a prior guaranteeing that the posterior distinguishes all pairs of pedigrees.
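As a brief worked step (a sketch, not the paper's construction), the following shows why a prior can break a likelihood tie. It assumes a prior \(Pr[\mathcal{P}]\) over pedigrees that does not depend on \(\theta\), with \(Pr[\mathcal{Q}] > 0\) and \(Pr[G~|~\mathcal{Q},\theta] > 0\); the specific prior proposed in the paper is not stated in this abstract. By Bayes' rule, the normalizing constant cancels in the posterior ratio:
\[
\frac{Pr[\mathcal{P}~|~G,\theta]}{Pr[\mathcal{Q}~|~G,\theta]}
= \frac{Pr[G~|~\mathcal{P},\theta]}{Pr[G~|~\mathcal{Q},\theta]} \cdot \frac{Pr[\mathcal{P}]}{Pr[\mathcal{Q}]}
= \frac{Pr[\mathcal{P}]}{Pr[\mathcal{Q}]}
\quad \text{whenever} \quad Pr[G~|~\mathcal{P},\theta] = Pr[G~|~\mathcal{Q},\theta].
\]
Under these assumptions, the data contribute nothing to separating a non-identifiable pair, so the posterior distinguishes \(\mathcal{P}\) from \(\mathcal{Q}\) exactly when the prior assigns them different probabilities, \(Pr[\mathcal{P}] \neq Pr[\mathcal{Q}]\).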
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Kirkpatrick, B. (2012). Non-identifiable Pedigrees and a Bayesian Solution. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science(), vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_14
DOI: https://doi.org/10.1007/978-3-642-30191-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30190-2
Online ISBN: 978-3-642-30191-9
eBook Packages: Computer Science (R0)