Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC

Kennedy, David A.; Dukic, Vanja; Dwyer, Greg

doi:10.1007/s10651-014-0297-0


Published: 27 September 2014

Volume 22, pages 247–274, (2015)

Environmental and Ecological Statistics

David A. Kennedy^1,2,3,
Vanja Dukic⁴ &
Greg Dwyer¹


Abstract

When Markov chain Monte Carlo (MCMC) algorithms are used with complex mechanistic models, convergence times are often severely compromised by poor mixing rates and a lack of computational power. Methods such as adaptive algorithms have been developed to improve mixing, but these algorithms are typically highly sophisticated, both mathematically and computationally. Here we present a nonadaptive MCMC algorithm, which we term line-search MCMC, that can be used for efficient tuning of proposal distributions in a highly parallel computing environment, but that nevertheless requires minimal skill in parallel computing to implement. We apply this algorithm to make inferences about dynamical models of the growth of a pathogen (baculovirus) population inside a host (gypsy moth, Lymantria dispar). The line-search MCMC appeal rests on its ease of implementation, and its potential for efficiency improvements over classical MCMC in a highly parallel setting, which makes it especially useful for ecological models.


References

Alizon S, van Baalen M (2008) Acute or chronic? Within-host models with immune dynamics, infection outcome, and parasite evolution. Am Nat 172:E244–E256
Antia R, Levin B, May R (1994) Within-host population-dynamics and the evolution and maintenance of microparasite virulence. Am Nat 144:457–472
Armenian H, Lilienfeld A (1983) Incubation period of disease. Epidemiol Rev 5:1–15
Ashida M, Brey P (1998) Molecular mechanisms of immune responses in insects. Chapman & Hall, London
Baldwin K, Hakim R (1991) Growth and differentiation of the larval midgut epithelium during molting in the moth, Manduca sexta. Tissue Cell 23:411–422
Beaumont M, Zhang W, Balding D (2002) Approximate Bayesian computation in population genetics. Genetics 162:2025–2035
Bogich T, Shea K (2008) A state-dependent model for the optimal management of an invasive metapopulation. Ecol Appl 18:748–761
Bolker B (2008) Ecological models and data in R. Princeton University Press, New Jersey
Braun M (1983) Differential equations and their applications, an introduction to applied mathematics, 3rd edn. Springer, New York
Brigham C, Power A, Hunter A (2002) Evaluating the internal consistency of recovery plans for federally endangered species. Ecol Appl 12:648–654
Brockwell A (2006) Parallel Markov chain Monte Carlo simulation by pre-fetching. J Comput Graph Stat 15:246–261
Chakerian J, Holmes S (2012) Computational tools for evaluating phylogenetic and hierarchical clustering trees. J Comput Graph Stat 21:581–599
Comon P (1994) Independent component analysis, a new concept. Signal Process 36:287–314
Cory J, Myers J (2003) The ecology and evolution of insect baculoviruses. Annu Rev Ecol Evol Syst 34:239–272
Cowles M, Carlin B (1996) Markov chain Monte Carlo convergence diagnostics: a comparative review. J Am Stat Assoc 91:883–904
Craiu R, Rosenthal J, Yang C (2009) Learn from thy neighbor: parallel-chain and regional adaptive MCMC. J Am Stat Assoc 104:1454–1466
Csillery K, Blum M, Gaggiotti O, Francois O (2010) Approximate Bayesian computation (ABC) in practice. Trends Ecol Evol 25:410–418
Doak DF, Morris WF (2010) Demographic compensation and tipping points in climate-induced range shifts. Nature 467:959–962
Doob J (1945) Markoff chains: denumerable case. Trans Am Math Soc 58:455–473
Dukic V, Lopes H, Polson N (2012) Tracking epidemics with Google Flu trends data and a state-space SEIR model. J Am Stat Assoc 107:1410–1426
Feng H, Gould F, Huang Y, Jiang Y, Wu K (2010) Modeling the population dynamics of cotton bollworm Helicoverpa armigera (Hubner) (Lepidoptera: Noctuidae) over a wide area in northern China. Ecol Model 221:1819–1830
Fuller E, Elderd B, Dwyer G (2012) Pathogen persistence in the environment and insect-baculovirus interactions: disease-density thresholds, epidemic burnout and insect outbreaks. Am Nat 179:E70–E96
Fuller S, Millet L (2011) Computing performance: Game over or next level? IEEE Comput 44:31–38
Geer D (2005) Chip makers turn to multicore processors. IEEE Comput 38:11–13
Gelman A, Rubin D (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–472
Gilchrist M, Sasaki A (2002) Modeling host-parasite coevolution: a nested approach based on mechanistic models. J Theor Biol 218:289–308
Gilks W, Roberts G (1996) Markov chain Monte Carlo in practice, chapter Introducing Markov chain Monte Carlo. Chapman & Hall, London
Gillespie D (1977) Exact stochastic simulation of coupled chemical-reactions. J Phys Chem 81:2340–2361
Girolami M, Calderhead B (2011) Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J R Stat Soc Ser B 73:123–214
Grant A, Restif O, McKinley T, Sheppard M, Maskell D, Mastroeni P (2008) Modelling within-host spatiotemporal dynamics of invasive bacterial disease. PLoS Biol 6:757–770
Haario H, Saksman E, Tamminen J (2001) An adaptive Metropolis algorithm. Bernoulli 7:223–242
Hartig F, Calabrese JM, Reineking B, Wiegand T, Huth A (2011) Statistical inference for stochastic simulation models—theory and application. Ecol Lett 14:816–827
Heidelberger P, Welch P (1983) Simulation run length control in the presence of an initial transient. Oper Res 31:1109–1144
Hoover K, Washburn J, Volkman L (2000) Midgut-based resistance of Heliothis virescens to baculovirus infection mediated by phytochemicals in cotton. J Insect Physiol 46:999–1007
Hunter-Fujita F, Entwistle P, Evans H, Crook N (1998) Insect viruses and pest management. Wiley, Chichester
Ionides E, Breto C, King A (2006) Inference for nonlinear dynamical systems. Proc Natl Acad Sci USA 103:18438–18443
Jacob P, Robert C, Smith M (2011) Using parallel computation to improve independent Metropolis-Hastings based estimation. J Comput Graph Stat 20:616–635
Jolliffe I (1986) Principal component analysis. Springer, New York
Karlin S, Taylor H (1975) A first course in stochastic processes. Academic, New York
Kennedy DA, Dukic V, Dwyer G (2014) The mechanisms determining the within-host population dynamics of an insect pathogen. Am Nat 184:407–423
Khorsheed E, Hurn M, Jennison C (2011) Mapping electron density in the ionosphere: a principal component MCMC algorithm. Comput Stat Data Anal 55:338–352
Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, New York
King A, Shrestha S, Harvill E, Bjørnstad O (2009) Evolution of acute infections and the invasion-persistence trade-off. Am Nat 173:446–455
Kot M (2001) Elements of mathematical ecology. Cambridge University Press, Cambridge
Lele S, Dennis B, Lutscher F (2007) Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecol Lett 10:551–563
Lele S, Nadeem K, Schmuland B (2010) Estimability and likelihood inference for generalized linear mixed models using data cloning. J Am Stat Assoc 105:1617–1625
Liu J (2001) Monte Carlo strategies in scientific computing. Springer, Berlin
Luenberger D, Ye Y (2008) Linear and nonlinear programming, 3rd edn. Springer Science and Business Media, New York
McNeil J, Cox-Foster D, Gardner M, Slavicek J, Thiem S, Hoover K (2010) Pathogenesis of Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV) in L. dispar and mechanisms of developmental resistance. J Gen Virol 91:1590–1600
Meynell G (1957) The applicability of the hypothesis of independent action to fatal infections in mice given Salmonella typhimurium by mouth. J Gen Microbiol 16:396–404
Miller G (2010) Markov chain Monte Carlo calculations allowing parallel processing using a variant of the Metropolis algorithm. Open Numer Methods J 2:12–17
Morgan B (1992) Analysis of quantal response data. Chapman & Hall, London
Mudholkar G, Srivastava D, Kollia G (1996) A generalization of the Weibull distribution with application to the analysis of survival data. J Am Stat Assoc 91:1575–1583
Plummer M, Best N, Cowles K, Vines K (2009) coda: Output analysis and diagnostics for MCMC. R package version 0.13-4
Ponciano J, Burleigh J, Braun E, Taper M (2012) Assessing parameter identifiability in phylogenetic models using data cloning. Syst Biol 61:955–972
Press W, Teukolsky S, Vetterling W, Flannery B (1992) Numerical recipes in C. Cambridge University Press, Cambridge
R Development Core Team (2009) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0
Robert C, Cornuet J, Marin J, Pillai N (2011) Lack of confidence in approximate Bayesian computation model choice. Proc Natl Acad Sci USA 108:15112–15117
Roberts G, Gelman A, Gilks W (1997) Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Probab 7:110–120
Rosenthal J (2000) Parallel computing and Monte Carlo algorithms. Far East J Theor Stat 4:207–236
Saaty T (1961) Some stochastic-processes with absorbing barriers. J R Stat Soc Ser B Stat Methodol 23:319–334
Schmid-Hempel P (2005) Evolutionary ecology of insect immune defenses. Annu Rev Entomol 50:529–551
Schölkopf B, Smola A, Müller K (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10:1299–1319
Shapiro M, Farrar R Jr, Domek J, Javaid I (2002) Effects of virus concentration and ultraviolet irradiation on the activity of corn earworm and beet armyworm (Lepidoptera:Noctuidae) nucleopolyhedroviruses. J Econ Entomol 95:243–249
Shapiro M, Robertson J, Bell R (1986) Quantitative and qualitative differences in gypsy moth (Lepidoptera: Lymantriidae) nucleopolyhedrosis virus produced in different-aged larvae. J Econ Entomol 79:1174–1177
Shortley G (1965) A stochastic model for distributions of biological response times. Biometrics 21:562–582
Solonen A, Ollinaho P, Laine M, Haario H, Tamminen J, Jarvinen H (2012) Efficient MCMC for climate model parameter estimation: parallel adaptive chains and early rejection. Bayesian Anal 7:715–736
Strid I (2010) Efficient parallelisation of Metropolis-Hastings algorithms using a prefetching approach. Comput Stat Data Anal 54:2814–2835
Trudeau D, Washburn J, Volkman L (2001) Central role of hemocytes in Autographa californica M nucleopolyhedrovirus pathogenesis in Heliothis virescens and Helicoverpa zea. J Virol 75:996–1003
Turchin P (2003) Complex population dynamics: a theoretical/empirical synthesis. Princeton University Press, Princeton
van Beek N, Flore P, Wood H, Hughes P (1990) Rate of increase of Autographa californica nuclear polyhedrosis virus in Trichoplusia ni larvae determined by DNA-DNA hybridization. J Invertebr Pathol 55:85–92
van Beek N, Hughes P, Wood H (2000) Effects of incubation temperature on the dose-survival time relationship of Trichoplusia ni larvae infected with Autographa californica nucleopolyhedrovirus. J Invertebr Pathol 76:185–190
van Beek N, Wood H, Hughes P (1988) Quantitative aspects of nuclear polyhedrosis virus infections in Lepidopterous larvae: the dose-survival time relationship. J Invertebr Pathol 51:58–63
van den Berg S, Beem L, Boomsma D (2006) Fitting genetic models using Markov chain Monte Carlo algorithms with BUGS. Twin Res Hum Genet 9:334–342
Vaughan T, Drummond P, Drummond A (2012) Within-host demographic fluctuations and correlations in early retroviral infection. J Theor Biol 295:86–99
Wilkinson D (2005) Handbook of Parallel computing and statistics, chapter parallel Bayesian computation. Dekker/CRC Press, New York
Yan J, Cowles M, Wang S, Armstrong M (2007) Parallelizing MCMC for Bayesian spatiotemporal geostatistical models. Stat Comput 17:323–335
Zwart M, Hemerik L, Cory J, de Visser J, Bianchi F, Van Oers M, Vlak J, Hoekstra R, Van der Werf W (2009) An experimental test of the independent action hypothesis in virus-insect pathosystems. Proc R Soc Lond Ser B-Biol Sci 276:2233–2242


Acknowledgments

DAK was supported by an ARCS fellowship, a GAANN training grant while at the University of Chicago, and the RAPIDD program of the Science and Technology Directorate, Department of Homeland Security and Fogarty International Center, National Institutes of Health (NIH). GD and VD were supported by NIH Grant R01GM096655. VD was also supported by Grants NSF-DEB 1316334 and NSF-GEO 1211668. We thank two anonymous reviewers for comments that substantially improved the manuscript.

Author information

Authors and Affiliations

Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
David A. Kennedy & Greg Dwyer
Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, PA, USA
David A. Kennedy
Fogarty International Center, National Institutes of Health, Bethesda, MD, USA
David A. Kennedy
Department of Applied Mathematics, University of Colorado - Boulder, Boulder, CO, USA
Vanja Dukic


Corresponding author

Correspondence to Greg Dwyer.

Additional information

Handling Editor: Pierre Dutilleul.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (txt 137 Bytes)

Supplementary material 2 (txt 2.86 KB)

Supplementary material 3 (txt 14.9 KB)

Supplementary material 4 (txt 18.1 KB)

Supplementary material 5 (txt 15.7 KB)

Supplementary material 6 (txt 16.1 KB)

Supplementary material 7 (txt 1008 Bytes)

Supplementary material 8 (txt 127 Bytes)

Supplementary material 9 (txt 18.8 KB)

Supplementary material 10 (txt 19.5 KB)

Supplementary material 11 (txt 793 Bytes)

Appendices

Appendix 1: Sampling–importance–resampling

Directly simulating many realizations of a birth–death process is computationally expensive. To avoid this cost for the linear birth–death model, we instead sample directly from the distribution of first passage times, using a sampling-importance-resampling algorithm. This method is possible because the function that describes the first passage time for a linear birth–death model can be evaluated point-wise (Shortley 1965).

We began our algorithm, for a given parameter set, by first numerically integrating the first-passage-time function in the C programming language, using the ‘gsl_integration_qag’ function from the GNU Scientific Library, over the range $[0, 612]$, matching the range of observation times in our experiment. Because the linear birth–death model has an absorbing boundary if the population size hits zero, not every trajectory will cross the upper threshold that leads to host death, and so the integral of this function will be in the range $[0,1]$, with the integral value, $p_d$, corresponding to the probability of host death occurring by hour 612. For a given number of model trajectories $\nu $, the number of host deaths is then a number drawn from a binomial distribution with parameters $\nu $ and $p_d$.
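The two steps in this paragraph (integrate the first-passage-time function to get $p_d$, then draw the number of host deaths from a binomial distribution) can be sketched as follows. This is a minimal illustration, not the authors' C code: `toy_density` is a hypothetical gamma-shaped stand-in for the Shortley (1965) function, which is not reproduced here, and composite Simpson's rule is swapped in for the adaptive `gsl_integration_qag` routine. The values `p_death`, `shape`, `scale` and the trajectory count of 3000 are assumptions chosen only for illustration.

```python
import math
import random

T_MAX = 612.0  # upper end of the observation window in hours (from the text)


def toy_density(t, p_death=0.8, shape=4.0, scale=40.0):
    """Hypothetical stand-in for the first-passage-time function.

    A gamma density scaled by p_death, so that it integrates to less than
    one, mimicking trajectories that are absorbed at zero and never cross
    the upper threshold.
    """
    if t <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return p_death * math.exp(log_pdf)


def integrate_simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def simulate_host_deaths(nu, p_d, rng):
    """Draw the number of host deaths from Binomial(nu, p_d)."""
    return sum(1 for _ in range(nu) if rng.random() < p_d)


rng = random.Random(1)
p_d = integrate_simpson(toy_density, 0.0, T_MAX)   # probability of death by hour 612
deaths = simulate_host_deaths(3000, p_d, rng)      # nu = 3000 is an assumed value
```

With a real first-passage-time function in place of `toy_density`, the same two calls would give $p_d$ and one binomial draw for a chosen $\nu $.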

To generate these first passage times from our target distribution we used a sampling-importance-resampling algorithm. First, we generated $10^4$ potential first passage times, $u_i$, from a uniform distribution on the interval $[0, 612]$. This interval was chosen so that these points would span the range of our data. Second, we calculated weights $W(u_i)$ for each of these time points, using the density function for first passage time $Q_d(\cdot )$ proposed by Shortley (1965). Weights were thus calculated as

$$\begin{aligned} W(u_i) = Q_d(u_i)/ \underset{u_j}{\text {max}}\,(Q_d(u_j)). \end{aligned}$$

(17)

Third, we generated the first passage times from our target distribution by resampling $u_i$ according to the respective weights $W(u_i)$.
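The three steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: `toy_density` is a hypothetical, unnormalised stand-in for $Q_d(\cdot )$, the function name `sir_first_passage_times` is invented here, and resampling is assumed to be done with replacement, which the text does not specify. Weights are scaled by the largest density value among the candidates.

```python
import math
import random


def toy_density(t):
    """Hypothetical, unnormalised stand-in for Q_d (gamma-shaped curve)."""
    return t ** 3 * math.exp(-t / 40.0) if t > 0.0 else 0.0


def sir_first_passage_times(density, n_draws, n_candidates=10_000,
                            t_max=612.0, rng=None):
    """Sampling-importance-resampling from a density known point-wise."""
    rng = rng or random.Random()
    # Step 1: candidate times u_i, uniform over the observation window.
    candidates = [rng.uniform(0.0, t_max) for _ in range(n_candidates)]
    # Step 2: weights W(u_i) = Q_d(u_i) / max_j Q_d(u_j).
    raw = [density(u) for u in candidates]
    top = max(raw)
    if top <= 0.0:
        raise ValueError("density is zero at every candidate time")
    weights = [q / top for q in raw]
    # Step 3: resample candidates in proportion to their weights
    # (with replacement; an assumption of this sketch).
    return rng.choices(candidates, weights=weights, k=n_draws)


times = sir_first_passage_times(toy_density, 2000, rng=random.Random(7))
mean_time = sum(times) / len(times)
```

Because the weights only enter through their ratios, the density does not need to be normalised; dividing by the maximum simply keeps every weight in $[0, 1]$.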

Appendix 2: Sensitivity to priors

In the main text of the paper, we ran our MCMC routine using improper priors, which can sometimes lead to improper posterior distributions. We believe that an improper posterior is unlikely in our case, because each of our multiple MCMC chains seems to have converged on the same stationary distribution. As an additional test, however, we further examined the model behavior under different sets of priors. To do this, we re-ran our analysis using half-normal priors that are vague but proper. Thus for each parameter

$$\begin{aligned} \pi (\theta ) = \left\{ \begin{array}{ll} \frac{2}{\sqrt{2\pi \times 10^{14}}} e^{-\frac{\theta ^2}{2 \times 10^{14}}} &{}\quad \text{ if } \ \theta \ge 0;\\ 0 &{}\quad \text{ otherwise }.\end{array} \right. \end{aligned}$$

(18)

where $\theta $ is defined as $\beta , \phi , c_1, c_2, \text {or } m$.

We also re-ran this analysis using a set of half-normal priors similar to the above, but with parameter-specific variances for the priors of $\beta $, $\phi $, $c_1$, $c_2$, and $m$ of $10^0$, $10^0$, $10^4$, $10^6$, and $10^{14}$ respectively.
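For readers who want to reproduce the prior in Eq. (18), a minimal sketch of the log-density is given below. The function and dictionary names are hypothetical; only the variances and the parameter names come from the text. Working on the log scale is a choice made here for numerical convenience and is not stated in the source.

```python
import math

# First prior set: the same variance (1e14) for every parameter, as in Eq. (18).
VAGUE_VARIANCES = {"beta": 1e14, "phi": 1e14, "c1": 1e14, "c2": 1e14, "m": 1e14}
# Second prior set: parameter-specific variances listed in the text.
SPECIFIC_VARIANCES = {"beta": 1e0, "phi": 1e0, "c1": 1e4, "c2": 1e6, "m": 1e14}


def log_half_normal(theta, variance):
    """Log of the half-normal density: 2 / sqrt(2*pi*v) * exp(-theta^2 / (2*v))."""
    if theta < 0.0:
        return -math.inf  # zero density outside the support
    return (math.log(2.0) - 0.5 * math.log(2.0 * math.pi * variance)
            - theta ** 2 / (2.0 * variance))


def log_prior(params, variances):
    """Sum of independent half-normal log-densities, one per parameter."""
    return sum(log_half_normal(params[name], variances[name])
               for name in variances)


# Illustrative parameter values only; these are not estimates from the paper.
example = {"beta": 0.5, "phi": 0.2, "c1": 35.0, "c2": 1e3, "m": 7e4}
lp_vague = log_prior(example, VAGUE_VARIANCES)
lp_specific = log_prior(example, SPECIFIC_VARIANCES)
```

Treating the five priors as independent, so that their log-densities add, is an assumption of this sketch.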

From this analysis, we observe that the resulting marginal posteriors are very similar to those achieved in our earlier analysis using improper priors (Fig. 10), providing additional evidence that the data are informative about the model parameters, and that the results are robust to the choice of priors.

Appendix 3: Bias in posterior estimates

Although the MCMC routine used in this paper appears to converge to a stationary distribution, the distribution is not exactly equal to the posterior distribution. Proposed parameter sets can be accepted at an inflated rate, because of uncertainty in our estimate of the likelihood, and our MCMC chains tend to over-accept proposed jumps. The result of this over-acceptance is a stationary distribution that is biased towards the proposal distribution.

The uncertainty in our estimate of the likelihood depends on the number of realizations used to parameterize Eq. (6), and so an obvious way to eliminate this bias would be to increase the number of realizations. Our precision is thus directly related to computing time, and so in the face of limited computing resources, we are forced to allow for at least some bias. At the number of realizations we used ($3\times 10^3$), however, the bias in our realized posterior distribution is minimal. To show this, we re-ran our analyses for a range of numbers of realizations. As Fig. 11 demonstrates, increasing the number of realizations at first leads to dramatic changes in the posterior, but as we approach $3\times 10^3$ realizations, further increases have essentially no effect. This suggests that our stationary distribution is probably close to the true posterior distribution.
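The dependence of likelihood noise on the number of realizations can be illustrated with a toy calculation. The sketch below does not reproduce Eq. (6) or the bias itself; it only shows, under an assumed Bernoulli stand-in for a simulated outcome, how the run-to-run spread of a simulation-based log-probability estimate shrinks as the number of realizations grows. All names and the value `true_prob=0.1` are hypothetical.

```python
import math
import random
import statistics


def estimate_log_prob(n_realizations, true_prob, rng):
    """Estimate log(probability) from simulated Bernoulli outcomes."""
    hits = sum(1 for _ in range(n_realizations) if rng.random() < true_prob)
    # Add-one smoothing keeps the logarithm finite when no realization hits.
    return math.log((hits + 1.0) / (n_realizations + 2.0))


def spread_of_estimate(n_realizations, true_prob=0.1, repeats=200, seed=0):
    """Standard deviation of the log-probability estimate across repeats."""
    rng = random.Random(seed)
    estimates = [estimate_log_prob(n_realizations, true_prob, rng)
                 for _ in range(repeats)]
    return statistics.stdev(estimates)


spreads = {n: spread_of_estimate(n) for n in (30, 300, 3000)}
for n, s in spreads.items():
    print(n, round(s, 3))
```

In this toy setting the spread falls roughly in proportion to the inverse square root of the number of realizations, which is the usual Monte Carlo rate and is consistent with the trade-off between precision and computing time described above.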

Appendix 4: Implications of the results for the nonlinear dynamical model

Our estimates of the parameters of the nonlinear model are listed in Table 2, but here we place these estimates in the context of baculovirus biology. First, our doubling-time estimate of 3.04 h is similar to a doubling-time estimate of 2.53 h for the cabbage looper Trichoplusia ni calculated using DNA-DNA hybridization (van Beek et al. 1990). We did not necessarily expect close agreement between these estimates because the two insects and their associated baculoviruses are not closely related, but the rough similarity suggests that our estimate is biologically reasonable.

Table 2 Parameter estimates


Second, our estimate of the half-saturation constant of the virus-dose function is $c_{2} \approx 10^{3}$, which is much lower than the $2 \times 10^{9}$ virus particles that are produced by a virus-killed, fourth-instar gypsy-moth cadaver (Shapiro et al. 1986). It thus appears that virus doses in nature are nearly saturated, so that small changes in dose have little effect on host times of death. This is surprising because virus strains could presumably kill faster if they produced fewer virus particles. We would therefore expect that natural selection would favor virus strains with shorter speeds of kill, because the cost of producing fewer virus particles appears to be very low. In nature, however, the virus is rapidly rendered inactive by ultraviolet light (Fuller et al. 2012), and so consumed doses of infectious virus may often be quite small. The slow speed of kill of this virus may therefore be an adaptation to high virus-inactivation rates, because slow-killing virus strains produce large numbers of particles that help to reduce the risk that all particles will be inactivated (Shapiro et al. 2002). Our estimate of $c_{2}$ therefore suggests that selective forces acting within hosts may oppose selective forces acting between hosts, as has often been suggested by mathematical theories of pathogen evolution (Antia et al. 1994; Gilchrist and Sasaki 2002).

Our best estimate of the largest average number of virus particles that could initiate an infection is $c_{1} \approx 35$. Given that the highest virus dose used was $1.35 \times 10^4$ particles, our estimate of $c_{1}$ suggests that the vast majority of consumed virus particles play no role in infection, even though larvae almost certainly have many more than $35$ midgut epithelial cells (Baldwin and Hakim 1991). This observation can be explained by cell sloughing, in which cells of the larval midgut are removed and subsequently replaced by new cells (Baldwin and Hakim 1991). Our estimate of $c_{1}$ thus supports previous research suggesting that cell sloughing is an important line of defense against baculovirus infection (McNeil et al. 2010; Hoover et al. 2000). Our estimate of $c_1$ also implies that severe population bottlenecks occur at the beginning of each new infection, in turn suggesting that genetic drift may be an important evolutionary force shaping the virus population.
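As a rough check on the scale involved, and using only the two numbers quoted above, the ratio of the estimated $c_1$ to the highest dose is

$$\begin{aligned} \frac{c_1}{\text {highest dose}} \approx \frac{35}{1.35 \times 10^{4}} \approx 2.6 \times 10^{-3}, \end{aligned}$$

so at the highest dose roughly 0.26 % of consumed particles, at most, would be expected to initiate infection. This back-of-envelope figure treats $c_1$ as an upper bound on the average number of founding particles and is intended only as an illustration.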

Our estimate of the number of immune cells in a healthy larva is $m = 7 \times 10^4$. Examination of the posterior distribution revealed that this estimate is actually highly uncertain, because of a strong, negative log-linear correlation with the immune-cell attack rate $\beta $ (Fig. 8). The strong correlation between these two parameters might be expected given that in the deterministic version of the model, these two parameters are individually non-identifiable.

Mechanistic models of within-host pathogen growth have a long history (Antia et al. 1994; Alizon and van Baalen 2008; Shortley 1965), but few of these models have been challenged with data, because of the computational difficulties associated with fitting nonlinear, dynamic models. Although fitting static or deterministic models to response data has provided useful insights into the infection process of some pathogens (Meynell 1957), including baculoviruses (van Beek et al. 2000; Zwart et al. 2009), a growing literature strongly suggests that within-host pathogen population growth is stochastic (Grant et al. 2008; Kennedy et al. 2014; Vaughan et al. 2012). Incorporating this stochasticity and using the entire distribution of speeds of kill to make inference is superior to basing the inference simply on the mean quantities. Our work therefore demonstrates the usefulness of nonlinear stochastic models in understanding within-host pathogen growth. Moreover, nonlinear dynamic models are becoming increasingly popular in ecology, highlighting the need for easy-to-implement statistical algorithms suitable for use with such models.

For baculoviruses in particular, survival-time data are widely available, but are usually used only to calibrate parametric phenomenological models such as those based on the Weibull distribution (Mudholkar et al. 1996; Morgan 1992). By instead using speed-of-kill data to fit a more mechanistic model, we have gained useful insights into the underlying biological processes, which in turn has allowed us to make inferences about virus evolution. In particular, our results suggest that genetic drift likely plays an important role in the evolution of the virus, which is important partly because drift may oppose the effects of natural selection (Kimura 1983). The occurrence of drift also has implications for the use of baculoviruses in pest control, because control programs often use only a single strain of virus (Hunter-Fujita et al. 1998). This has led to concerns that virus sprays will reduce natural diversity, and our results suggest that such reductions may be exacerbated by the drift inherent in the infection process.


About this article

Cite this article

Kennedy, D.A., Dukic, V. & Dwyer, G. Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC. Environ Ecol Stat 22, 247–274 (2015). https://doi.org/10.1007/s10651-014-0297-0


Received: 13 May 2013
Revised: 23 April 2014
Published: 27 September 2014
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10651-014-0297-0
