A Riemannian Tool for Clustering of Geo-Spatial Multivariate Data

  • Special Issue
Mathematical Geosciences

Abstract

Geological modeling is essential for the characterization of natural phenomena and can be done in two steps: (1) clustering the data into consistent groups and (2) modeling the extent of these groups in space to define domains, honoring the labels defined in the previous step. The clustering step can be based on continuous multivariate data measured in space instead of relying on the geological logging provided. However, extracting coherent spatial multivariate information is challenging when the variables show complex relationships, such as nonlinear correlation, heteroscedastic behavior, or spatial trends. In this work, we propose a clustering method that is valid for domaining when multiple continuous variables are available and robust enough to handle such complex relationships. The method infers, at each sample location, the local correlation matrix between variables from a local neighborhood. Changes in this local correlation across space can then be used to characterize the domains. The space of correlation matrices is endowed with a manifold structure, and the matrices are clustered by adapting the K-means algorithm to this manifold using tools from Riemannian geometry. A real case study illustrates the methodology and demonstrates that the proposed clustering honors the spatial configuration of the data, delivering spatially connected clusters even when the attributes show complex nonlinear relationships.
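
The two ingredients named in the abstract, local correlation matrices and a K-means adapted to a matrix manifold, can be illustrated with a minimal sketch. It is not the authors' implementation. It swaps in the log-Euclidean metric on symmetric positive-definite matrices as a simpler stand-in for the Riemannian structure on correlation matrices used in the paper, so cluster centers are averages of matrix logarithms and need not be correlation matrices themselves. The function names, the k-nearest-neighbor neighborhood, and the shrinkage toward the identity are all assumptions made for illustration.

```python
import numpy as np


def local_correlations(coords, values, k=30, shrink=0.05):
    """Correlation matrix of the k nearest samples around each location.

    Assumption: the neighborhood is the k nearest samples (brute force), and
    each matrix is shrunk toward the identity so it stays positive definite.
    """
    n, p = values.shape
    mats = np.empty((n, p, p))
    for i in range(n):
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        idx = np.argsort(d2)[:k]
        c = np.corrcoef(values[idx], rowvar=False)
        mats[i] = (1.0 - shrink) * c + shrink * np.eye(p)  # keeps unit diagonal
    return mats


def spd_log(mats):
    """Matrix logarithm of a stack of SPD matrices via eigendecomposition."""
    w, v = np.linalg.eigh(mats)
    return (v * np.log(w)[..., None, :]) @ np.swapaxes(v, -1, -2)


def log_euclidean_kmeans(mats, n_clusters, n_iter=100):
    """Lloyd's K-means in the log domain (log-Euclidean metric, a stand-in)."""
    n, p, _ = mats.shape
    logs = spd_log(mats).reshape(n, -1)

    # Deterministic farthest-point initialization of the centers.
    centers = [logs[0]]
    for _ in range(1, n_clusters):
        d = np.min([((logs - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(logs[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assignment step: nearest center in log-Euclidean distance.
        dist = ((logs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        # Update step: the mean of the logs is the log-Euclidean Frechet mean.
        new_centers = np.array([
            logs[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers.reshape(n_clusters, p, p)
```

Calling `local_correlations(coords, values)` followed by `log_euclidean_kmeans(mats, n_clusters=2)` yields one label per sample. Because each matrix is computed from a spatial neighborhood, nearby samples tend to receive similar matrices and therefore the same label, which is the intuition behind the spatially connected clusters described above.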


Fig. 1. Taken from Riquelme and Ortiz (2023)

References

  • Ayadi MA, Ben-Ameur H, Channouf N, Tran QK (2019) NORTA for portfolio credit risk. Ann Oper Res 281(1):99–119

  • Bourgault G (2014) Revisiting multi-Gaussian kriging with the Nataf transformation or the Bayes’ rule for the estimation of spatial distributions. Math Geosci 46(7):841–868

  • Bourgault G, Marcotte D (1991) Multivariable variogram and its application to the linear model of coregionalization. Math Geol 23(7):899–928

  • Bourgault G, Marcotte D, Legendre P (1992) The multivariate (co)variogram as a spatial weighting function in classification methods. Math Geol 24(5):463–478

  • Cario MC, Nelson BL (1997) Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical report, Citeseer

  • Charu CA, Chandan KR (2013) Data clustering: algorithms and applications

  • Chilès JP, Delfiner P (2012) Geostatistics: modeling spatial uncertainty. Wiley series in probability and statistics

  • Cowan EJ, Beatson RK, Ross HJ, Fright WR, McLennan TJ, Evans TR, Carr JC, Lane RG, Bright DV, Gillman AJ, Oshust PA, Titley M (2003) Practical implicit geological modelling. In: Fifth international mining geology conference, Australian Institute of Mining and Metallurgy Bendigo, Victoria, pp 17–19

  • David P (2019) A Riemannian quotient structure for correlation matrices with applications to data science. Ph.D. thesis, The Claremont Graduate University

  • David P, Gu W (2019) A Riemannian structure for correlation matrices. Oper Matrices 13:607–627

  • David P, Gu W (2022) Anomaly detection of time series correlations via a novel Lie group structure. Stat 11:e494

  • Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user’s guide. Oxford University Press, Oxford

  • Dryden IL, Koloydenko A, Zhou D (2009) Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Ann Appl Stat 3(3):1102–1123

  • Faraj F, Ortiz JM (2021) A simple unsupervised classification workflow for defining geological domains using multivariate data. Min Metall Explor 38(3):1609–1623

  • Fouedjio F (2016) A hierarchical clustering method for multivariate geostatistical data. Spat Stat 18:333–351

  • Fouedjio F (2018) A fully non-stationary linear coregionalization model for multivariate random fields. Stoch Env Res Risk Assess 32(6):1699–1721

  • Gelfand AE, Kim HJ, Sirmans C, Banerjee S (2003) Spatial modeling with spatially varying coefficient processes. J Am Stat Assoc 98(462):387–396

  • Goh A, Vidal R (2008) Clustering and dimensionality reduction on Riemannian manifolds. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE, pp 1–7

  • Grubišić I, Pietersz R (2007) Efficient rank reduction of correlation matrices. Linear Algebra Appl 422(2–3):629–653

  • Hiriart-Urruty JB, Malick J (2012) A fresh variational-analysis look at the positive semidefinite matrices world. J Optim Theory Appl 153(3):551–577

  • Janas M, Cuffaro ME, Janssen M (2022) Understanding quantum Raffles. Springer, Cham

  • Jayasumana S, Hartley R, Salzmann M, Li H, Harandi M (2015) Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Trans Pattern Anal Mach Intell 37(12):2464–2477

  • Lajaunie C, Courrioux G, Manuel L (1997) Foliation fields and 3D cartography in geology: principles of a method based on potential interpolation. Math Geol 29(4):571–584

  • Lee JM (2013) Smooth manifolds. In: Introduction to smooth manifolds. Springer, pp 1–31

  • Li ST, Hammond JL (1975) Generation of pseudorandom numbers with specified univariate distributions and correlation coefficients. IEEE Trans Syst Man Cybern SMC-5(5):557–561

  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137

  • MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Oakland, CA, USA, vol 1, pp 281–297

  • Matheron G (1971) Theory of regionalized variables and its applications. Ecole Natl Super des Mines 5:211

  • Moakher M (2005) A differential geometric approach to the geometric mean of symmetric positive-definite matrices. SIAM J Matrix Anal Appl 26(3):735–747

  • Moakher M (2006) On the averaging of symmetric positive-definite tensors. J Elast 82(3):273–296

  • Moakher M, Zéraï M (2011) The Riemannian geometry of the space of positive-definite matrices and its application to the regularization of positive-definite matrix-valued data. J Math Imaging Vis 40(2):171–187

  • Oliver M, Webster R (1989) A geostatistical basis for spatial weighting in multivariate classification. Math Geol 21(1):15–35

  • Pennec X, Fillard P, Ayache N (2006) A Riemannian framework for tensor computing. Int J Comput Vis 66(1):41–66

  • Pinto FC, Manchuk JG, Deutsch CV (2021) Decomposition of multivariate spatial data into latent factors. Comput Geosci 153:104773

  • Riquelme ÁI, Ortiz JM (2023) Multivariate simulation using a locally varying coregionalization model. Math Geosci (submitted)

  • Sepúlveda E, Dowd P, Xu C (2018) Fuzzy clustering with spatial correction and its application to geometallurgical domaining. Math Geosci 50(8):895–928

  • Solow AR (1986) Mapping by simple indicator kriging. Math Geol 18(3):335–352

  • Thanwerdas Y, Pennec X (2021) Geodesics and curvature of the quotient-affine metrics on full-rank correlation matrices. In: International conference on geometric science of information. Springer, pp 93–102

  • Thanwerdas Y, Pennec X (2022) Theoretically and computationally convenient geometries on full-rank correlation matrices. arXiv preprint arXiv:2201.06282

  • Wackernagel H (2013) Multivariate geostatistics: an introduction with applications. Springer, New York

  • Xiao Q (2014) Evaluating correlation coefficient for Nataf transformation. Probab Eng Mech 37:1–6

  • Xie W, Sun H, Li C (2015) Quantifying statistical uncertainty for dependent input models with factor structure. In: 2015 winter simulation conference (WSC). IEEE, pp 667–678

  • You K, Park HJ (2021) Re-visiting Riemannian geometry of symmetric positive definite matrices for the analysis of functional connectivity. Neuroimage 225:117464

Acknowledgements

The authors acknowledge the funding provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), funding reference numbers RGPIN-2017-04200 and RGPAS-2017-507956, and by the International Association for Mathematical Geosciences (IAMG) student grant, funding reference number MG-2020-14. The authors are grateful to two anonymous reviewers for their valuable comments on an earlier version of this paper.

Author information

Corresponding author

Correspondence to Álvaro I. Riquelme.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest that could influence the work reported in this paper.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Riquelme, Á.I., Ortiz, J.M. A Riemannian Tool for Clustering of Geo-Spatial Multivariate Data. Math Geosci 56, 121–141 (2024). https://doi.org/10.1007/s11004-023-10085-7

  • DOI: https://doi.org/10.1007/s11004-023-10085-7
