Reproduction of secondary data in projection pursuit transformation

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

A challenge when working with multivariate data in a geostatistical context is that the data are rarely Gaussian. Multivariate distributions may include nonlinear features, clustering, long tails, functional boundaries, spikes, and heteroskedasticity. Multivariate transformations account for such features so that they are reproduced in geostatistical models. Projection pursuit, as developed for high-dimensional data exploration, can also be used to transform a multivariate distribution into a multivariate Gaussian distribution with an identity covariance matrix. Its application within a geostatistical modeling context is called the projection pursuit multivariate transform (PPMT). An approach to incorporate exhaustive secondary variables in the PPMT is introduced. With this approach, the PPMT can incorporate any number of secondary variables with any number of primary variables. To make the approach numerically practical, it was necessary to implement a continuous probability estimator, based on Bernstein polynomials, for the transformation that takes place in the projections. Stopping criteria were updated to incorporate a bootstrap t test that compares data sampled from a multivariate Gaussian distribution with the data undergoing transformation.
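
The abstract does not give the form of the continuous probability estimator, so the following is only a minimal sketch of one common Bernstein polynomial smoothing of the empirical distribution function, not the authors' implementation. The function name `bernstein_cdf`, the polynomial degree `m`, and the rescaling of the data to [0, 1] by its minimum and maximum are all assumptions made for illustration.

```python
import math
import numpy as np

def bernstein_cdf(sample, m=30):
    """Return a smooth CDF estimate built from Bernstein polynomials.

    Hypothetical helper: the degree m and the min/max rescaling are
    illustrative choices, not taken from the paper.
    """
    sample = np.asarray(sample, dtype=float)
    lo, hi = sample.min(), sample.max()
    # Rescale to [0, 1], the natural domain of Bernstein polynomials.
    u = np.sort((sample - lo) / (hi - lo))
    k = np.arange(m + 1)
    # Empirical CDF evaluated at the knots k/m.
    ecdf = np.searchsorted(u, k / m, side="right") / u.size
    # Binomial coefficients C(m, k) of the Bernstein basis.
    coef = np.array([math.comb(m, j) for j in k], dtype=float)

    def cdf(x):
        t = np.clip((np.atleast_1d(x).astype(float) - lo) / (hi - lo), 0.0, 1.0)
        # Basis b_k(t) = C(m, k) t^k (1 - t)^(m - k), shape (m + 1, len(t)).
        basis = coef[:, None] * t[None, :] ** k[:, None] \
            * (1.0 - t[None, :]) ** (m - k[:, None])
        # Weighted sum of basis functions with the ECDF values as weights.
        return ecdf @ basis

    return cdf
```

Because the weights are nondecreasing in k, the resulting estimate is a continuous, nondecreasing polynomial on the data range, unlike the step-shaped empirical distribution function. A continuous estimate of this kind could then feed a normal-score mapping of each projection, though how the paper does so is not stated here.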

Author information

Corresponding author

Correspondence to John G. Manchuk.

About this article

Cite this article

Manchuk, J.G., Barnett, R.M. & Deutsch, C.V. Reproduction of secondary data in projection pursuit transformation. Stoch Environ Res Risk Assess 31, 2585–2605 (2017). https://doi.org/10.1007/s00477-016-1363-y

  • DOI: https://doi.org/10.1007/s00477-016-1363-y
