Abstract
This paper considers the problem of clustering n observed time series \(\mathbf{x}_{k} =\{\ x_{k}(t)\ \vert \ t \in \mathcal{T}\}\), k = 1, …, n, with time points t in a suitable time range \(\mathcal{T}\), into a suitable number m of clusters \(C_{1},\ldots,C_{m} \subset \{ 1,\ldots,n\}\), each comprising time series with a ‘similar’ structure. Classical approaches typically proceed by first computing a dissimilarity matrix and then applying a traditional, possibly hierarchical, clustering method. In contrast, we present a brief survey of approaches that start by defining probabilistic clustering models for the time series, i.e., models with class-specific distributions, and then determine a suitable (ideally optimal) clustering by statistical tools such as maximum likelihood and optimization algorithms. In particular, we consider models with class-specific Gaussian processes and Markov chains.
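As an illustration of the model-based idea for the Markov chain case, the following is a minimal sketch and not the chapter's own algorithm. It assumes discrete-valued series, first-order chains, and a k-means-type alternation that maximizes a classification likelihood: estimate one transition matrix per class, then reassign each series to the class under which its transitions are most likely. The function name `cluster_markov_chains`, the caller-supplied `init_labels`, and the smoothing constant `alpha` are hypothetical choices made for this example; the distribution of the initial state is ignored.

```python
import numpy as np

def cluster_markov_chains(seqs, n_states, m, init_labels, n_iter=50, alpha=1.0):
    """Cluster discrete sequences using class-specific first-order Markov chains.

    Alternates between estimating one transition matrix per class and
    reassigning every sequence to its maximum-likelihood class.
    All names and defaults here are illustrative assumptions.
    """
    n = len(seqs)
    # Transition counts for each series: counts[k, a, b] = number of a -> b moves.
    counts = np.zeros((n, n_states, n_states))
    for k, x in enumerate(seqs):
        for a, b in zip(x[:-1], x[1:]):
            counts[k, a, b] += 1

    def estimate(labels):
        # Smoothed maximum-likelihood transition matrices, one per class.
        P = np.full((m, n_states, n_states), alpha, dtype=float)
        for i in range(m):
            P[i] += counts[labels == i].sum(axis=0)
        return P / P.sum(axis=2, keepdims=True)

    labels = np.asarray(init_labels)
    for _ in range(n_iter):
        P = estimate(labels)
        # Log-likelihood of series k under class i: sum_ab counts[k,a,b] * log P[i,a,b].
        loglik = np.einsum("kab,iab->ki", counts, np.log(P))
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable, so the criterion can no longer improve
        labels = new_labels
    return labels, estimate(labels)

# Toy data: three "sticky" series and three "alternating" series on states {0, 1}.
sticky = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
alternating = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
seqs = [sticky] * 3 + [alternating] * 3
labels, P = cluster_markov_chains(seqs, n_states=2, m=2, init_labels=[0, 0, 1, 0, 1, 1])
print(labels)
```

Starting from a deliberately imperfect partition, the alternation separates the two kinds of series on this toy data. In practice the result depends on the initial partition, so several starts would usually be compared by their attained likelihood.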
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bock, HH. (2014). Model-Based Clustering Methods for Time Series. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_1
DOI: https://doi.org/10.1007/978-3-319-01264-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01263-6
Online ISBN: 978-3-319-01264-3
eBook Packages: Mathematics and Statistics (R0)