Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC

Environmental and Ecological Statistics

Abstract

When Markov chain Monte Carlo (MCMC) algorithms are used with complex mechanistic models, convergence times are often severely compromised by poor mixing rates and a lack of computational power. Methods such as adaptive algorithms have been developed to improve mixing, but these algorithms are typically highly sophisticated, both mathematically and computationally. Here we present a nonadaptive MCMC algorithm, which we term line-search MCMC, that can be used for efficient tuning of proposal distributions in a highly parallel computing environment, but that nevertheless requires minimal skill in parallel computing to implement. We apply this algorithm to make inferences about dynamical models of the growth of a pathogen (baculovirus) population inside a host (gypsy moth, Lymantria dispar). The appeal of line-search MCMC rests on its ease of implementation and its potential for efficiency gains over classical MCMC in a highly parallel setting, which makes it especially useful for ecological models.

References

  • Alizon S, van Baalen M (2008) Acute or chronic? Within-host models with immune dynamics, infection outcome, and parasite evolution. Am Nat 172:E244–E256

  • Antia R, Levin B, May R (1994) Within-host population-dynamics and the evolution and maintenance of microparasite virulence. Am Nat 144:457–472

  • Armenian H, Lilienfeld A (1983) Incubation period of disease. Epidemiol Rev 5:1–15

  • Ashida M, Brey P (1998) Molecular mechanisms of immune responses in insects. Chapman & Hall, London

  • Baldwin K, Hakim R (1991) Growth and differentiation of the larval midgut epithelium during molting in the moth, Manduca sexta. Tissue Cell 23:411–422

  • Beaumont M, Zhang W, Balding D (2002) Approximate Bayesian computation in population genetics. Genetics 162:2025–2035

  • Bogich T, Shea K (2008) A state-dependent model for the optimal management of an invasive metapopulation. Ecol Appl 18:748–761

  • Bolker B (2008) Ecological models and data in R. Princeton University Press, New Jersey

  • Braun M (1983) Differential equations and their applications, an introduction to applied mathematics, 3rd edn. Springer, New York

  • Brigham C, Power A, Hunter A (2002) Evaluating the internal consistency of recovery plans for federally endangered species. Ecol Appl 12:648–654

  • Brockwell A (2006) Parallel Markov chain Monte Carlo simulation by pre-fetching. J Comput Graph Stat 15:246–261

  • Chakerian J, Holmes S (2012) Computational tools for evaluating phylogenetic and hierarchical clustering trees. J Comput Graph Stat 21:581–599

  • Comon P (1994) Independent component analysis, a new concept. Signal Proces 36:287–314

  • Cory J, Myers J (2003) The ecology and evolution of insect baculoviruses. Annu Rev Ecol Evol Syst 34:239–272

  • Cowles M, Carlin B (1996) Markov chain Monte Carlo convergence diagnostics: a comparative review. J Am Stat Assoc 91:883–904

  • Craiu R, Rosenthal J, Yang C (2009) Learn from thy neighbor: parallel-chain and regional adaptive MCMC. J Am Stat Assoc 104:1454–1466

  • Csillery K, Blum M, Gaggiotti O, Francois O (2010) Approximate Bayesian computation (ABC) in practice. Trends Ecol Evol 25:410–418

  • Doak DF, Morris WF (2010) Demographic compensation and tipping points in climate-induced range shifts. Nature 467:959–962

  • Doob J (1945) Markoff chains: denumerable case. Trans Am Math Soc 58:455–473

  • Dukic V, Lopes H, Polson N (2012) Tracking epidemics with Google Flu trends data and a state-space SEIR model. J Am Stat Assoc 107:1410–1426

  • Feng H, Gould F, Huang Y, Jiang Y, Wu K (2010) Modeling the population dynamics of cotton bollworm Helicoverpa armigera (Hubner) (Lepidoptera: Noctuidae) over a wide area in northern China. Ecol Model 221:1819–1830

  • Fuller E, Elderd B, Dwyer G (2012) Pathogen persistence in the environment and insect-baculovirus interactions: disease-density thresholds, epidemic burnout and insect outbreaks. Am Nat 179:E70–E96

  • Fuller S, Millet L (2011) Computing performance: Game over or next level? IEEE Comput 44:31–38

  • Geer D (2005) Chip makers turn to multicore processors. IEEE Comput 38:11–13

  • Gelman A, Rubin D (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–472

  • Gilchrist M, Sasaki A (2002) Modeling host-parasite coevolution: a nested approach based on mechanistic models. J Theor Biol 218:289–308

  • Gilks W, Roberts G (1996) Markov chain Monte Carlo in practice, chapter Introducing Markov chain Monte Carlo. Chapman & Hall, London

  • Gillespie D (1977) Exact stochastic simulation of coupled chemical-reactions. J Phys Chem 81:2340–2361

  • Girolami M, Calderhead B (2011) Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J R Stat Soc Ser B 73:123–214

  • Grant A, Restif O, McKinley T, Sheppard M, Maskell D, Mastroeni P (2008) Modelling within-host spatiotemporal dynamics of invasive bacterial disease. PLoS Biol 6:757–770

  • Haario H, Saksman E, Tamminen J (2001) An adaptive Metropolis algorithm. Bernoulli 7:223–242

  • Hartig F, Calabrese JM, Reineking B, Wiegand T, Huth A (2011) Statistical inference for stochastic simulation models—theory and application. Ecol Lett 14:816–827

  • Heidelberger P, Welch P (1983) Simulation run length control in the presence of an initial transient. Oper Res 31:1109–1144

  • Hoover K, Washburn J, Volkman L (2000) Midgut-based resistance of Heliothis virescens to baculovirus infection mediated by phytochemicals in cotton. J Insect Physiol 46:999–1007

  • Hunter-Fujita F, Entwistle P, Evans H, Crook N (1998) Insect viruses and pest management. Wiley, Chichester

  • Ionides E, Breto C, King A (2006) Inference for nonlinear dynamical systems. Proc Natl Acad Sci USA 103:18438–18443

  • Jacob P, Robert C, Smith M (2011) Using parallel computation to improve independent Metropolis-Hastings based estimation. J Comput Graph Stat 20:616–635

  • Jolliffe I (1986) Principal component analysis. Springer, New York

  • Karlin S, Taylor H (1975) A first course in stochastic processes. Academic, New York

  • Kennedy DA, Dukic V, Dwyer G (2014) The mechanisms determining the within-host population dynamics of an insect pathogen. Am Nat 184:407–423

  • Khorsheed E, Hurn M, Jennison C (2011) Mapping electron density in the ionosphere: a principal component MCMC algorithm. Comput Stat Data Anal 55:338–352

  • Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, New York

  • King A, Shrestha S, Harvill E, Bjørnstad O (2009) Evolution of acute infections and the invasion-persistence trade-off. Am Nat 173:446–455

  • Kot M (2001) Elements of mathematical ecology. Cambridge University Press, Cambridge

  • Lele S, Dennis B, Lutscher F (2007) Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecol Lett 10:551–563

  • Lele S, Nadeem K, Schmuland B (2010) Estimability and likelihood inference for generalized linear mixed models using data cloning. J Am Stat Assoc 105:1617–1625

  • Liu J (2001) Monte Carlo strategies in scientific computing. Springer, Berlin

  • Luenberger D, Ye Y (2008) Linear and nonlinear programming, 3rd edn. Springer Science and Business Media, New York

  • McNeil J, Cox-Foster D, Gardner M, Slavicek J, Thiem S, Hoover K (2010) Pathogenesis of Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV) in L. dispar and mechanisms of developmental resistance. J Gen Virol 91:1590–1600

  • Meynell G (1957) The applicability of the hypothesis of independent action to fatal infections in mice given Salmonella typhimurium by mouth. J Gen Microbiol 16:396–404

  • Miller G (2010) Markov chain Monte Carlo calculations allowing parallel processing using a variant of the Metropolis algorithm. Open Numer Methods J 2:12–17

  • Morgan B (1992) Analysis of quantal response data. Chapman & Hall, London

  • Mudholkar G, Srivastava D, Kollia G (1996) A generalization of the Weibull distribution with application to the analysis of survival data. J Am Stat Assoc 91:1575–1583

  • Plummer M, Best N, Cowles K, Vines K (2009) coda: Output analysis and diagnostics for MCMC. R package version 0.13-4

  • Ponciano J, Burleigh J, Braun E, Taper M (2012) Assessing parameter identifiability in phylogenetic models using data cloning. Syst Biol 61:955–972

  • Press W, Teukolsky S, Vetterling W, Flannery B (1992) Numerical recipes in C. Cambridge University Press, Cambridge

  • R Development Core Team (2009) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0

  • Robert C, Cornuet J, Marin J, Pillai N (2011) Lack of confidence in approximate Bayesian computation model choice. Proc Natl Acad Sci USA 108:15112–15117

  • Roberts G, Gelman A, Gilks W (1997) Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Probab 7:110–120

  • Rosenthal J (2000) Parallel computing and Monte Carlo algorithms. Far East J Theor Stat 4:207–236

  • Saaty T (1961) Some stochastic-processes with absorbing barriers. J R Stat Soc Ser B Stat Methodol 23:319–334

  • Schmid-Hempel P (2005) Evolutionary ecology of insect immune defenses. Annu Rev Entomol 50:529–551

  • Schölkopf B, Smola A, Müller K (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10:1299–1319

  • Shapiro M, Farrar R Jr, Domek J, Javaid I (2002) Effects of virus concentration and ultraviolet irradiation on the activity of corn earworm and beet armyworm (Lepidoptera:Noctuidae) nucleopolyhedroviruses. J Econ Entomol 95:243–249

  • Shapiro M, Robertson J, Bell R (1986) Quantitative and qualitative differences in gypsy moth (Lepidoptera: Lymantriidae) nucleopolyhedrosis virus produced in different-aged larvae. J Econ Entomol 79:1174–1177

  • Shortley G (1965) A stochastic model for distributions of biological response times. Biometrics 21:562–582

  • Solonen A, Ollinaho P, Laine M, Haario H, Tamminen J, Jarvinen H (2012) Efficient MCMC for climate model parameter estimation: parallel adaptive chains and early rejection. Bayesian Anal 7:715–736

  • Strid I (2010) Efficient parallelisation of Metropolis-Hastings algorithms using a prefetching approach. Comput Stat Data Anal 54:2814–2835

  • Trudeau D, Washburn J, Volkman L (2001) Central role of hemocytes in Autographa californica M nucleopolyhedrovirus pathogenesis in Heliothis virescens and Helicoverpa zea. J Virol 75:996–1003

  • Turchin P (2003) Complex population dynamics: a theoretical/empirical synthesis. Princeton University Press, Princeton

  • van Beek N, Flore P, Wood H, Hughes P (1990) Rate of increase of Autographa californica nuclear polyhedrosis virus in Trichoplusia ni larvae determined by DNA-DNA hybridization. J Invertebr Pathol 55:85–92

  • van Beek N, Hughes P, Wood H (2000) Effects of incubation temperature on the dose-survival time relationship of Trichoplusia ni larvae infected with Autographa californica nucleopolyhedrovirus. J Invertebr Pathol 76:185–190

  • van Beek N, Wood H, Hughes P (1988) Quantitative aspects of nuclear polyhedrosis virus infections in Lepidopterous larvae: the dose-survival time relationship. J Invertebr Pathol 51:58–63

  • van den Berg S, Beem L, Boomsma D (2006) Fitting genetic models using Markov chain Monte Carlo algorithms with BUGS. Twin Res Hum Genet 9:334–342

  • Vaughan T, Drummond P, Drummond A (2012) Within-host demographic fluctuations and correlations in early retroviral infection. J Theor Biol 295:86–99

  • Wilkinson D (2005) Handbook of Parallel computing and statistics, chapter parallel Bayesian computation. Dekker/CRC Press, New York

  • Yan J, Cowles M, Wang S, Armstrong M (2007) Parallelizing MCMC for Bayesian spatiotemporal geostatistical models. Stat Comput 17:323–335

  • Zwart M, Hemerik L, Cory J, de Visser J, Bianchi F, Van Oers M, Vlak J, Hoekstra R, Van der Werf W (2009) An experimental test of the independent action hypothesis in virus-insect pathosystems. Proc R Soc Lond Ser B-Biol Sci 276:2233–2242

Acknowledgments

DAK was supported by an ARCS fellowship, a GAANN training grant while at the University of Chicago, and the RAPIDD program of the Science and Technology Directorate, Department of Homeland Security and Fogarty International Center, National Institutes of Health (NIH). GD and VD were supported by NIH Grant R01GM096655. VD was also supported by Grants NSF-DEB 1316334 and NSF-GEO 1211668. We thank two anonymous reviewers for comments that substantially improved the manuscript.

Author information

Corresponding author

Correspondence to Greg Dwyer.

Additional information

Handling Editor: Pierre Dutilleul.

Appendices

Appendix 1: Sampling–importance–resampling

Directly simulating many realizations of a birth–death process is computationally expensive. To avoid this cost for the linear birth–death model, we instead sample directly from the distribution of first passage times, using a sampling-importance-resampling algorithm. This method is possible because the function that describes the first passage time for a linear birth–death model can be evaluated point-wise (Shortley 1965).

We began our algorithm, for a given parameter set, by first numerically integrating the first-passage-time function in the C programming language, using the ‘gsl_integration_qag’ function from the GNU Scientific Library, over the range \([0, 612]\), matching the range of observation times in our experiment. Because the linear birth–death model has an absorbing boundary if the population size hits zero, not every trajectory will cross the upper threshold that leads to host death, and so the integral of this function will be in the range \([0,1]\), with the integral value, \(p_d\), corresponding to the probability of host death occurring by hour 612. For a given number of model trajectories \(\nu \), the number of host deaths is then a number drawn from a binomial distribution with parameters \(\nu \) and \(p_d\).
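The two steps in this paragraph (integrating the first-passage-time function to obtain \(p_d\), then drawing the number of host deaths from a binomial distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function `q_d` is a hypothetical stand-in rather than Shortley's formula, composite Simpson's rule replaces the GSL routine used in the paper, and the value of \(\nu\) is chosen only for the example.

```python
import math
import random

T_MAX = 612.0  # last observation time in hours, taken from the text


def q_d(t):
    """Hypothetical stand-in for the first-passage-time function Q_d(t).

    NOT Shortley's (1965) formula: a gamma-shaped curve scaled so that its
    total mass is 0.8, mimicking a process in which some trajectories are
    absorbed at zero and never reach the upper threshold.
    """
    scale = 60.0  # hours; illustrative value only
    return 0.8 * t * math.exp(-t / scale) / scale**2


def integrate_simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals.

    Swapped in for gsl_integration_qag so the sketch needs no C library.
    """
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def draw_host_deaths(nu, p_d, rng):
    """Draw from Binomial(nu, p_d) as a sum of nu Bernoulli trials."""
    return sum(rng.random() < p_d for _ in range(nu))


# Probability of host death by hour 612 under the stand-in function.
p_d = integrate_simpson(q_d, 0.0, T_MAX)

# Number of host deaths among nu model trajectories (nu is illustrative).
rng = random.Random(1)
deaths = draw_host_deaths(3000, p_d, rng)
```

Because the stand-in function integrates to less than one, `p_d` falls strictly inside \([0,1]\), which is the behaviour the absorbing boundary produces in the linear birth–death model.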

To generate these first passage times from our target distribution we used a sampling-importance-resampling algorithm. First, we generated \(10^4\) potential first passage times, \(u_i\), from a uniform distribution on the interval \([0, 612]\). This interval was chosen so that these points would span the range of our data. Second, we calculated weights \(W(u_i)\) for each of these time points, using the density function for first passage time \(Q_d(\cdot )\) proposed by Shortley (1965). Weights were thus calculated as

$$\begin{aligned} W(u_i) = Q_d(u_i)/ \underset{u_j}{\max }\, Q_d(u_j). \end{aligned}$$
(17)

Third, we generated the first passage times from our target distribution by resampling \(u_i\) according to the respective weights \(W(u_i)\).
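The three steps above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: `q_d` is again a hypothetical stand-in for Shortley's density, resampling is assumed to be with replacement (the text does not say), and the number of resampled times is arbitrary. Each weight is the density at a candidate divided by the largest density among the candidates.

```python
import math
import random

T_MAX = 612.0  # upper end of the observation window in hours, from the text


def q_d(t):
    """Hypothetical stand-in for Shortley's density Q_d(t); illustrative only."""
    scale = 60.0
    return t * math.exp(-t / scale) / scale**2


def sir_first_passage_times(n_candidates, n_draws, rng):
    """Sampling-importance-resampling with a uniform proposal on [0, T_MAX]."""
    # Step 1: candidate first passage times from the uniform proposal.
    candidates = [rng.uniform(0.0, T_MAX) for _ in range(n_candidates)]
    # Step 2: weights proportional to the target density, scaled by the
    # largest density value so that every weight lies in [0, 1].
    densities = [q_d(u) for u in candidates]
    peak = max(densities)
    weights = [d / peak for d in densities]
    # Step 3: resample candidates in proportion to their weights
    # (with replacement; an assumption of this sketch).
    return rng.choices(candidates, weights=weights, k=n_draws)


rng = random.Random(42)
draws = sir_first_passage_times(10_000, 2_000, rng)
mean_time = sum(draws) / len(draws)
```

Because the proposal is uniform, the importance weights reduce to the target density itself, so the scaling constant in the denominator affects only the range of the weights and not the resampling probabilities.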

Appendix 2: Sensitivity to priors

In the main text of the paper, we ran our MCMC routine using improper priors, which can sometimes lead to improper posterior distributions. We believe that an improper posterior is unlikely in our case, because each of our multiple MCMC chains seems to have converged on the same stationary distribution. As an additional test, however, we further examined the model behavior under different sets of priors. To do this, we re-ran our analysis using half-normal priors that are vague but proper. Thus, for each parameter,

$$\begin{aligned} \pi (\theta ) = \left\{ \begin{array}{ll} \frac{2}{\sqrt{2\pi \times 10^{14}}} e^{-\frac{\theta ^2}{2 \times 10^{14}}} &{}\quad \text{ if } \ \theta \ge 0;\\ 0 &{}\quad \text{ otherwise }.\end{array} \right. \end{aligned}$$
(18)

where \(\theta \) denotes any one of \(\beta \), \(\phi \), \(c_1\), \(c_2\), or \(m\).

We also re-ran this analysis using a set of half-normal priors similar to the above, but with parameter-specific variances for the priors of \(\beta \), \(\phi \), \(c_1\), \(c_2\), and \(m\) of \(10^0\), \(10^0\), \(10^4\), \(10^6\), and \(10^{14}\) respectively.
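For concreteness, the half-normal prior of Eq. (18) and its parameter-specific variant can be written as log densities. The sketch below is illustrative only: the function and dictionary names are hypothetical, and treating the joint prior as a product of independent per-parameter priors is an assumption of this example.

```python
import math


def half_normal_log_prior(theta, variance=1e14):
    """Log of the half-normal density in Eq. (18) with the given variance."""
    if theta < 0:
        return -math.inf  # zero prior mass below zero
    return (math.log(2.0)
            - 0.5 * math.log(2.0 * math.pi * variance)
            - theta**2 / (2.0 * variance))


# Parameter-specific variances quoted in the text (dictionary keys are
# hypothetical labels for beta, phi, c_1, c_2 and m).
PRIOR_VARIANCES = {"beta": 1e0, "phi": 1e0, "c1": 1e4, "c2": 1e6, "m": 1e14}


def joint_log_prior(params, variances=PRIOR_VARIANCES):
    """Sum of per-parameter log priors, assuming independent priors."""
    return sum(half_normal_log_prior(params[name], variances[name])
               for name in variances)


example = {"beta": 0.5, "phi": 0.2, "c1": 35.0, "c2": 1e3, "m": 7e4}
log_prior_value = joint_log_prior(example)
```

Working on the log scale avoids underflow when the prior is combined with a log-likelihood in a Metropolis–Hastings acceptance ratio.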

From this analysis, we observe that the resulting marginal posteriors are very similar to those achieved in our earlier analysis using improper priors (Fig. 10), providing additional evidence that the data are informative about the model parameters, and that the results are robust to the choice of priors.

Fig. 10 Sensitivity of posterior to priors. Posterior distributions for \(\log _{10}\) values of each parameter are shown. The black line shows the posterior distribution using the improper priors from the main text. The red line shows the posterior distribution when using the vague proper priors in Eq. (18). The blue line shows the posterior distribution when using the parameter-specific variances given in the text. We achieve very similar posterior distributions using each set of priors (Color figure online)

Appendix 3: Bias in posterior estimates

Although the MCMC routine used in this paper appears to converge to a stationary distribution, the distribution is not exactly equal to the posterior distribution. Because our estimate of the likelihood is uncertain, proposed parameter sets can be accepted at an inflated rate, so our MCMC chains tend to over-accept proposed jumps. The result of this over-acceptance is a stationary distribution that is biased towards the proposal distribution.

The uncertainty in our estimate of the likelihood depends on the number of realizations used to parameterize Eq. (6), and so an obvious way to eliminate this bias would be to increase the number of realizations. Our precision is thus directly related to computing time, and so in the face of limited computing resources, we are forced to allow for at least some bias. At the number of realizations we used (\(3\times 10^3\)), however, the bias in our realized posterior distribution is minimal. To show this, we re-ran our analyses for a range of numbers of realizations. As Fig. 11 demonstrates, increasing the number of realizations at first leads to dramatic changes in the posterior, but as we approach \(3\times 10^3\) realizations, further increases have essentially no effect. This suggests that our stationary distribution is probably close to the true posterior distribution.
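To illustrate why precision is tied to the number of realizations, the sketch below assumes, purely for illustration, that a likelihood contribution is estimated as the fraction of simulated realizations in which some event occurs. This is not Eq. (6), and all names and values are hypothetical; the point is only that the Monte Carlo standard error of such an estimate shrinks in proportion to the inverse square root of the number of realizations.

```python
import math
import random


def estimate_probability(p_true, n_realizations, rng):
    """Monte Carlo estimate of an event probability from simulated trials."""
    hits = sum(rng.random() < p_true for _ in range(n_realizations))
    return hits / n_realizations


def binomial_standard_error(p, n):
    """Standard error of a proportion estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


rng = random.Random(0)
p_true = 0.2  # illustrative event probability, not a value from the paper
for n in (30, 300, 3000):
    estimate = estimate_probability(p_true, n, rng)
    print(n, round(estimate, 3), round(binomial_standard_error(p_true, n), 4))
```

Under this assumption, a hundredfold increase in realizations reduces the standard error only tenfold, which is why additional precision becomes expensive in computing time.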

Fig. 11 Bias in the realized posterior distribution. Posterior distributions for \(\log _{10}\) values of each parameter are shown. Each line shows the results when using a particular number of realizations to estimate the likelihood, with the solid black line showing the posterior distribution at the number of realizations used in the paper. By comparing the change in the posterior distribution at different numbers of realizations, we get a sense of how bias is affecting our results. While bias seems to be important at low numbers of realizations, at higher numbers the distributions seem to stabilize, suggesting that little bias remains at \(3\times 10^3\) realizations

Appendix 4: Implications of the results for the nonlinear dynamical model

Our estimates of the parameters of the nonlinear model are listed in Table 2, but here we place these estimates in the context of baculovirus biology. First, our doubling-time estimate of 3.04 h is similar to a doubling-time estimate of 2.53 h for the cabbage looper Trichoplusia ni calculated using DNA-DNA hybridization (van Beek et al. 1990). We did not necessarily expect close agreement between these estimates because the two insects and their associated baculoviruses are not closely related, but the rough similarity suggests that our estimate is biologically reasonable.
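As a rough cross-check, and assuming purely for illustration that the virus population grows exponentially over the period in question, a doubling time \(t_d\) converts to a per-capita growth rate \(r = \ln 2 / t_d\). The symbols \(r\) and \(t_d\) are introduced here for the example and are not the paper's notation.

$$\begin{aligned} r = \frac{\ln 2}{3.04\ \text {h}} \approx 0.228\ \text {h}^{-1}, \qquad r = \frac{\ln 2}{2.53\ \text {h}} \approx 0.274\ \text {h}^{-1}. \end{aligned}$$

Under that assumption the two growth rates differ by a factor of about 1.2, consistent with the description of the estimates as roughly similar.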

Table 2 Parameter estimates

Second, our estimate of the half-saturation constant of the virus-dose function is \(c_{2} \approx 10^{3}\), which is much lower than the \(2 \times 10^{9}\) virus particles that are produced by a virus-killed, fourth-instar gypsy-moth cadaver (Shapiro et al. 1986). It thus appears that virus doses in nature are nearly saturated, so that small changes in dose have little effect on host times of death. This is surprising because virus strains could presumably kill faster if they produced fewer virus particles. We would therefore expect that natural selection would favor virus strains with faster speeds of kill, because the cost of producing fewer virus particles appears to be very low. In nature, however, the virus is rapidly rendered inactive by ultraviolet light (Fuller et al. 2012), and so consumed doses of infectious virus may often be quite small. The slow speed of kill of this virus may therefore be an adaptation to high virus-inactivation rates, because slow-killing virus strains produce large numbers of particles that help to reduce the risk that all particles will be inactivated (Shapiro et al. 2002). Our estimate of \(c_{2}\) therefore suggests that selective forces acting within hosts may oppose selective forces acting between hosts, as has often been suggested by mathematical theories of pathogen evolution (Antia et al. 1994; Gilchrist and Sasaki 2002).
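To make the sense of "nearly saturated" concrete, suppose, purely for illustration, that the virus-dose function has the saturating form \(f(d) = d/(c_2 + d)\) for a dose \(d\). This form is an assumption of the example; the function actually used is defined in the main text and is not reproduced here. Evaluated at a dose equal to the output of a single cadaver,

$$\begin{aligned} f(2 \times 10^{9}) = \frac{2 \times 10^{9}}{10^{3} + 2 \times 10^{9}} \approx 1 - 5 \times 10^{-7}. \end{aligned}$$

Under this assumed form, even a thousandfold reduction in dose, to \(2 \times 10^{6}\) particles, would leave \(f\) within about \(5 \times 10^{-4}\) of its maximum.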

Our best estimate of the largest average number of virus particles that could initiate an infection is \(c_{1} \approx 35\). Given that the highest virus dose used was \(1.35 \times 10^4\) particles, our estimate of \(c_{1}\) suggests that the vast majority of consumed virus particles play no role in infection, even though larvae almost certainly have many more than \(35\) midgut epithelial cells (Baldwin and Hakim 1991). This observation can be explained by cell sloughing, in which cells of the larval midgut are removed and subsequently replaced by new cells (Baldwin and Hakim 1991). Our estimate of \(c_{1}\) thus supports previous research suggesting that cell sloughing is an important line of defense against baculovirus infection (McNeil et al. 2010; Hoover et al. 2000). Our estimate of \(c_1\) also implies that severe population bottlenecks occur at the beginning of each new infection, in turn suggesting that genetic drift may be an important evolutionary force shaping the virus population.
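The size of this bottleneck can be gauged with a line of arithmetic, taking the point estimates at face value and ignoring posterior uncertainty:

$$\begin{aligned} \frac{c_1}{\text {highest dose}} \approx \frac{35}{1.35 \times 10^{4}} \approx 2.6 \times 10^{-3}. \end{aligned}$$

That is, at the highest dose used, on average no more than roughly 0.26 % of the consumed particles could initiate an infection.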

Our estimate of the number of immune cells in a healthy larva is \(m = 7 \times 10^4\). Examination of the posterior distribution revealed that this estimate is actually highly uncertain, because of a strong, negative log-linear correlation with the immune-cell attack rate \(\beta \) (Fig. 8). The strong correlation between these two parameters might be expected given that in the deterministic version of the model, these two parameters are individually non-identifiable.

Mechanistic models of within-host pathogen growth have a long history (Antia et al. 1994; Alizon and van Baalen 2008; Shortley 1965), but few of these models have been challenged with data, because of the computational difficulties associated with fitting nonlinear, dynamic models. Although fitting static or deterministic models to response data has provided useful insights into the infection process of some pathogens (Meynell 1957), including baculoviruses (van Beek et al. 2000; Zwart et al. 2009), a growing literature strongly suggests that within-host pathogen population growth is stochastic (Grant et al. 2008; Kennedy et al. 2014; Vaughan et al. 2012). Incorporating this stochasticity and using the entire distribution of speeds of kill to make inferences is superior to basing the inference simply on mean quantities. Our work therefore demonstrates the usefulness of nonlinear stochastic models in understanding within-host pathogen growth. Moreover, nonlinear dynamic models are becoming increasingly popular in ecology, highlighting the need for easy-to-implement statistical algorithms suitable for use with such models.

For baculoviruses in particular, survival-time data are widely available, but are usually used only to calibrate parametric phenomenological models such as those based on the Weibull distribution (Mudholkar et al. 1996; Morgan 1992). By instead using speed-of-kill data to fit a more mechanistic model, we have gained useful insights into the underlying biological processes, which in turn has allowed us to make inferences about virus evolution. In particular, our results suggest that genetic drift likely plays an important role in the evolution of the virus, which is important partly because drift may oppose the effects of natural selection (Kimura 1983). The occurrence of drift also has implications for the use of baculoviruses in pest control, because control programs often use only a single strain of virus (Hunter-Fujita et al. 1998). This has led to concerns that virus sprays will reduce natural diversity, and our results suggest that such reductions may be exacerbated by the drift inherent in the infection process.

Cite this article

Kennedy, D.A., Dukic, V. & Dwyer, G. Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC. Environ Ecol Stat 22, 247–274 (2015). https://doi.org/10.1007/s10651-014-0297-0

  • DOI: https://doi.org/10.1007/s10651-014-0297-0
