Fast estimation of multivariate spatiotemporal Hawkes processes and network reconstruction

Yuan, Baichuan; Schoenberg, Frederic P.; Bertozzi, Andrea L.

doi:10.1007/s10463-020-00780-1

Published: 01 January 2021

Volume 73, pages 1127–1152, (2021)

Annals of the Institute of Statistical Mathematics

Baichuan Yuan¹,
Frederic P. Schoenberg² &
Andrea L. Bertozzi¹


Abstract

We present a fast, accurate estimation method for multivariate Hawkes self-exciting point processes widely used in seismology, criminology, finance and other areas. There are two major ingredients. The first is an analytic derivation of exact maximum likelihood estimates of the nonparametric triggering density. We develop this for the multivariate case and add regularization to improve stability and robustness. The second is a moment-based method for the background rate and triggering matrix estimation, which is extended here for the spatiotemporal case. Our method combines them together in an efficient way, and we prove the consistency of this new approach. Extensive numerical experiments, with synthetic data and real-world social network data, show that our method improves the accuracy, scalability and computational efficiency of prevailing estimation approaches. Moreover, it greatly boosts the performance of Hawkes process-based models on social network reconstruction and helps to understand the spatiotemporal triggering dynamics over social media.


Notes

We obtain latitude and longitude coordinates from https://www.flickr.com/places/info.

References

Achab, M., Bacry, E., Gaïffas, S., Mastromatteo, I., Muzy, J.-F. (2017). Uncovering causality from multivariate Hawkes integrated cumulants. The Journal of Machine Learning Research, 18(1), 6998–7025.
Bacry, E., Bompaire, M., Gaïffas, S., Poulsen, S. (2017). Tick: A python library for statistical learning, with a particular emphasis on time-dependent modelling. arXiv preprint arXiv:1707.03003.
Bacry, E., Mastromatteo, I., Muzy, J.-F. (2015). Hawkes processes in finance. Market Microstructure and Liquidity, 1(01), 1550005.
Bacry, E., Muzy, J.-F. (2016). First-and second-order statistics characterization of Hawkes processes and non-parametric estimation. IEEE Transactions on Information Theory, 62(4), 2184–2202.
Balderama, E., Schoenberg, F. P., Murray, E., Rundel, P. W. (2012). Application of branching models in the study of invasive species. Journal of the American Statistical Association, 107(498), 467–476.
Bao, J., Zheng, Y., Mokbel, M. F. (2012). Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th international conference on advances in geographic information systems (pp. 199–208).
Brantingham, P. J., Yuan, B., Herz, D. (2020a). Is gang violent crime more contagious than non-gang violent crime? Journal of Quantitative Criminology, https://doi.org/10.1007/s10940-020-09479-1.
Brantingham, P. J., Yuan, B., Sundback, N., Schoenberg, F. P., Bertozzi, A. L., Gordon, J., et al. (2020b). Does violence interruption work? UCLA preprint, www.stat.ucla.edu/~frederic/papers/brantingham2.pdf.
Brillinger, D. R., Guttorp, P. M., Schoenberg, F. P., El-Shaarawi, A. H., Piegorsch, W. W. (2002). Point processes, temporal. Encyclopedia of Environmetrics, 3, 1577–1581.
Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P. (2017). Geometric deep learning: Going beyond Euclidean data. IEEE Signal Processing Magazine, 34(4), 18–42.
Chen, S., Shojaie, A., Shea-Brown, E., Witten, D. (2017). The multivariate Hawkes process in high dimensions: Beyond mutual excitation. arXiv preprint arXiv:1707.04928.
Chiang, W.-H., Yuan, B., Li, H., Wang, B., Bertozzi, A., Carter, J., Ray, B., Mohler, G. (2019). Sos-EW: System for overdose spike early warning using drug mover’s distance-based Hawkes processes. In Joint European conference on machine learning and knowledge discovery in databases (pp. 538–554). Berlin: Springer.
Cho, E., Myers, S. A., Leskovec, J. (2011). Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1082–1090). ACM.
Daley, D. J., Vere-Jones, D. (2003). An introduction to the theory of point processes: Volume I: Probability and its Applications. New York: Springer.
Daley, D. J., Vere-Jones, D. (2007). An introduction to the theory of point processes: Volume II: General theory and structure. New York: Springer.
Du, N., Farajtabar, M., Ahmed, A., Smola, A. J., Song, L. (2015). Dirichlet–Hawkes processes with applications to clustering continuous-time document streams. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 219–228). ACM.
Duchi, J., Hazan, E., Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.
Eichler, M., Dahlhaus, R., Dueck, J. (2017). Graphical modeling for multivariate Hawkes processes with nonparametric link functions. Journal of Time Series Analysis, 38(2), 225–242.
Farajtabar, M., Wang, Y., Rodriguez, M. G., Li, S., Zha, H., Song, L. (2015). Coevolve: A joint point process model for information diffusion and network co-evolution. Advances in Neural Information Processing Systems, 1954–1962.
Fox, E. W., Short, M. B., Schoenberg, F. P., Coronges, K. D., Bertozzi, A. L. (2016). Modeling e-mail networks and inferring leadership using self-exciting point processes. Journal of the American Statistical Association, 111(514), 564–584.
Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, 37, 424–438.
Hall, E. C., Willett, R. M. (2016). Tracking dynamic point processes on networks. IEEE Transactions on Information Theory, 62(7), 4327–4346.
Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1), 83–90.
Kaipio, J., Somersalo, E. (2006). Statistical and computational inverse problems, Vol. 160. New York: Springer.
Kingma, D. P., Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations.
Lai, E. L., Moyer, D., Yuan, B., Fox, E., Hunter, B., Bertozzi, A. L., Brantingham, P. J. (2016). Topic time series analysis of microblogs. IMA Journal of Applied Mathematics, 81(3), 409–431.
Lee, D. D., Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), p. 788.
Lewis, E., Mohler, G. (2011). A nonparametric EM algorithm for multiscale Hawkes processes. Journal of Nonparametric Statistics, 1(1), 1–20.
Linderman, S., Adams, R. (2014). Discovering latent network structure in point process data. In International conference on machine learning (pp. 1413–1421). Beijing, China: JMLR: W&C.
Malinverno, A. (2002). Parsimonious Bayesian Markov chain Monte Carlo inversion in a nonlinear geophysical problem. Geophysical Journal International, 151(3), 675–688.
Mark, B., Raskutti, G., Willett, R. (2018). Network estimation from point process data. IEEE Transactions on Information Theory, 65, 2953–2975.
Marsan, D., Lengline, O. (2008). Extending earthquakes’ reach through cascading. Science, 319(5866), 1076–1079.
Mohler, G. O. (2014). Marked point process hotspot maps for homicide and gun crime prediction in Chicago. International Journal of Forecasting, 30(3), 491–497.
Mohler, G. O., Short, M. B., Brantingham, P. J., Schoenberg, F. P., Tita, G. E. (2011). Self-exciting point process modeling of crime. Journal of the American Statistical Association, 106(493), 100–108.
Neumaier, A. (1998). Solving ill-conditioned and singular linear systems: A tutorial on regularization. SIAM Review, 40(3), 636–666.
Ogata, Y. (1978). The asymptotic behaviour of maximum likelihood estimators for stationary point processes. Annals of the Institute of Statistical Mathematics, 30(1), 243–261.
Ogata, Y. (1998). Space-time point-process models for earthquake occurrences. Annals of the Institute of Statistical Mathematics, 50(2), 379–402.
Porter, M. D., White, G., et al. (2012). Self-exciting hurdle models for terrorist activity. The Annals of Applied Statistics, 6(1), 106–124.
Reinhart, A. (2018). A review of self-exciting spatio-temporal point processes and their applications. Statistical Science, 33(3), 299–318.
Schoenberg, F. P. (2006). On non-simple marked point processes. Annals of the Institute of Statistical Mathematics, 58(2), 223–233.
Schoenberg, F. P. (2013). Facilitated estimation of ETAS. Bulletin of the Seismological Society of America, 103(1), 601–605.
Schoenberg, F. P., Brillinger, D. R., Guttorp, P. (2013). Point processes, spatial-temporal. Encyclopedia of Environmetrics, 4, 1573–1578.
Schoenberg, F. P., et al. (2018a). Comment on “A review of self-exciting spatio-temporal point processes and their applications” by Alex Reinhart. Statistical Science, 33(3), 325–326.
Schoenberg, F. P., Gordon, J. S., Harrigan, R. J. (2018b). Analytic computation of nonparametric Marsan–Lengliné estimates for Hawkes point processes. Journal of Nonparametric Statistics, 30(3), 742–775.
Veen, A., Schoenberg, F. P. (2008). Estimation of space-time branching process models in seismology using an EM-type algorithm. Journal of the American Statistical Association, 103(482), 614–624.
Wang, B., Luo, X., Zhang, F., Yuan, B., Bertozzi, A. L., Brantingham, P. J. (2018). Graph-based deep modeling and real time forecasting of sparse spatio-temporal data. arXiv preprint arXiv:1804.00684.
Yuan, B., Li, H., Bertozzi, A. L., Brantingham, P. J., Porter, M. A. (2019). Multivariate spatiotemporal Hawkes processes and network reconstruction. SIAM Journal on Mathematics of Data Science, 1(2), 356–382.
Yuan, B., Wang, X., Ma, J., Zhou, C., Bertozzi, A. L., Yang, H. (2020). Variational autoencoders for highly multivariate spatial point processes intensities. In International conference on learning representations.
Zhu, S., Xie, Y. (2019). Spatial–temporal–textual point processes with applications in crime linkage detection. arXiv preprint arXiv:1902.00440.
Zhuang, J., Ogata, Y., Vere-Jones, D. (2002). Stochastic declustering of space-time earthquake occurrences. Journal of the American Statistical Association, 97(458), 369–380.

Acknowledgements

This work was supported by the City of Los Angeles Gang Reduction Youth Development Project, by NSF grant DMS-2027277 and by NSF grant DMS-1737770. Baichuan Yuan gratefully acknowledges the fellowship support of the National Institute of Justice (NIJ) under Award Number 2018-R2-CX-0013.

Author information

Authors and Affiliations

Department of Mathematics, University of California, Los Angeles, 7619D Math-Science Building, Los Angeles, CA, 90095-1555, USA
Baichuan Yuan & Andrea L. Bertozzi
Department of Statistics, University of California, 8142, Math-Science Building, Los Angeles, CA, 90095-1554, USA
Frederic P. Schoenberg

Corresponding author

Correspondence to Frederic P. Schoenberg.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Fast estimation of Hawkes processes.

Appendices

Appendix 1: Simulation data

1.1 \(U=1\) data

We simulate a univariate ST-Hawkes process with \(K=1/6\), \(\mu =0.01\), \(T=2.1\times 10^5\), \(X,Y \in (0,10)\), \(f(r)=\frac{1}{2\pi \sigma ^2}\exp (-r^2/2\sigma ^2)\) (\(\sigma ^2=0.2\)) and \(h(t)=\omega \exp (-\omega t)\) (\(\omega =10\)). The regularization parameter \(\alpha =0.5\).
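The parameters above are enough to sketch a simulator based on the branching (cluster) representation of a Hawkes process. The following is a minimal illustration, not the authors' code. It assumes that \(\mu\) is the total background rate per unit time with background locations uniform on \((0,10)^2\), that each event produces a Poisson(\(K\)) number of offspring, and that offspring falling outside the observation window are discarded. The function name `simulate_st_hawkes` is hypothetical.

```python
import numpy as np

def simulate_st_hawkes(mu=0.01, K=1/6, omega=10.0, sigma2=0.2,
                       T=2.1e5, box=10.0, seed=0):
    """Simulate a univariate spatiotemporal Hawkes process by branching.

    Returns an array of shape (n, 3) with columns (t, x, y), sorted by t.
    Assumption: mu is the background rate per unit time over the whole box.
    """
    rng = np.random.default_rng(seed)
    # Generation 0: homogeneous Poisson background events.
    n_bg = rng.poisson(mu * T)
    parents = np.column_stack([rng.uniform(0, T, n_bg),
                               rng.uniform(0, box, (n_bg, 2))])
    generations = [parents]
    while len(parents) > 0:
        # Each parent triggers a Poisson(K) number of children.
        n_off = rng.poisson(K, len(parents))
        rep = np.repeat(parents, n_off, axis=0)
        # Time lag ~ h(t) = omega * exp(-omega t); offset ~ isotropic Gaussian f.
        dt = rng.exponential(1.0 / omega, len(rep))
        dxy = rng.normal(0.0, np.sqrt(sigma2), (len(rep), 2))
        child = rep + np.column_stack([dt, dxy])
        # Keep only children inside the space-time window (an assumption).
        inside = (child[:, 0] < T) & np.all(
            (child[:, 1:] > 0) & (child[:, 1:] < box), axis=1)
        parents = child[inside]
        generations.append(parents)
    events = np.vstack(generations)
    return events[np.argsort(events[:, 0])]

events = simulate_st_hawkes()
print(len(events), "events simulated")
```

Under these assumptions the expected number of events is roughly \(\mu T/(1-K)\), slightly reduced by the offspring lost at the boundary.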

1.2 \(U=100\) data

Using the same triggering densities, this data set has the following parameters: \(U=100\), background rate \(\varvec{\mu }=(0.01,\ldots ,0.01)\), \(T=10^5\), \(X,Y \in (0,10)\), \(\sigma ^2=0.2\) and \(\omega =10\). It contains 172,943 events. For the triggering matrix in Fig. 2, each yellow pixel is 1/20, each cyan pixel is 1/40 and each dark pixel is 0.

1.3 \(U=10\) data

With the same densities, the parameters are \(U=10\), \(\varvec{\mu }=(0.01,\ldots ,0.01)\), \(T=10^6\), \(X,Y \in (0,10)\), \(\sigma ^2=0.2\) and \(\omega =10\), with \(\varvec{K}\) shown in Fig. 3. Here, each yellow pixel is 1/6 and each dark pixel is 0. The regularization parameter \(\alpha =0.55\).

1.4 \(U=10\) data with a Pareto triggering density in time

We keep the same parameters as for the \(U=10\) data above. The only changes are to the triggering densities: the temporal density becomes the Pareto density \(h(t)=(p-1)c^{p-1}/(t+c)^p\) with \(c=2\) and \(p=2.5\), and the spatial triggering density keeps the same Gaussian form with \(\sigma ^2=0.1\). The regularization parameter \(\alpha =0.38\).
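Time lags with this Pareto density can be drawn by inverse-transform sampling, since the survival function is \((c/(t+c))^{p-1}\). The sketch below illustrates that idea; the function name `sample_pareto_delay` is hypothetical and the source does not say how the lags were actually generated.

```python
import numpy as np

def sample_pareto_delay(n, c=2.0, p=2.5, seed=0):
    """Draw n lags from h(t) = (p-1) c^(p-1) / (t+c)^p on t >= 0.

    Solves S(t) = (c / (t + c))^(p-1) = u for t, with u uniform on (0, 1].
    """
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.uniform(size=n)  # lies in (0, 1], so u is never zero
    return c * (u ** (-1.0 / (p - 1.0)) - 1.0)

lags = sample_pareto_delay(200_000)
# The theoretical median solves S(t) = 1/2: t = c * (2^(1/(p-1)) - 1).
print(np.median(lags), 2.0 * (2.0 ** (1.0 / 1.5) - 1.0))
```

With \(p=2.5\) the tail is heavy: the mean lag \(c/(p-2)=4\) is finite but the variance is not, so sample means converge slowly compared with the exponential case.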

1.5 \(U=10\) data with a uniform triggering density in time

Similar to the section above, here we change the temporal density to the uniform density \(h(t)=0.1\) and keep the spatial triggering density with \(\sigma ^2=0.1\). The regularization parameter \(\alpha =0.4\). We threshold the estimated \(\varvec{{\tilde{K}}}\) with \(\epsilon = 0.01\) to remove noise.
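The thresholding step can be read as setting small entries of the estimated triggering matrix to zero. The sketch below assumes that entries strictly below \(\epsilon\) are zeroed; the source does not spell out the exact rule, and `threshold_matrix` is a hypothetical name.

```python
import numpy as np

def threshold_matrix(K_est, eps=0.01):
    """Return a copy of K_est with entries below eps set to zero.

    Assumption: the comparison is on the raw value, so negative
    estimates are also removed.
    """
    K = np.asarray(K_est, dtype=float).copy()
    K[K < eps] = 0.0
    return K

K_tilde = np.array([[0.170, 0.004],
                    [-0.002, 0.160]])
print(threshold_matrix(K_tilde, eps=0.01))
```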

1.6 \(U=10\) data with a power-law triggering density in space

Similarly, we use the power-law density \(f(r)=\frac{1}{(r^2+1)^2}\) in space and the exponential triggering density in time with \(\omega =10\). The regularization parameter \(\alpha =0.28\). We threshold the estimated \(\varvec{{\tilde{K}}}\) with \(\epsilon = 0.02\) to remove noise.
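As a quick check on this power-law form, assume \(f\) is treated as a function on the plane with \(r^2=x^2+y^2\). Its integral over the plane is

\[
\int_{{\mathbb {R}}^2} \frac{\mathrm {d}x\,\mathrm {d}y}{(x^2+y^2+1)^2} = 2\pi \int _0^\infty \frac{r\,\mathrm {d}r}{(r^2+1)^2} = 2\pi \cdot \frac{1}{2} = \pi ,
\]

so under this reading \(f/\pi \) integrates to one, and the distance \(R\) of an offspring from its parent has distribution function

\[
P(R\le r) = \int _0^r \frac{2s\,\mathrm {d}s}{(s^2+1)^2} = 1-\frac{1}{r^2+1}.
\]

The source does not state whether a normalizing constant is applied, so the factor \(1/\pi \) here is an assumption made for the check.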

1.7 \(U=10\) data with a uniform triggering density in space

Given the same parameters as above, we change the spatial density to \(f(r)=0.25\) and keep the exponential triggering density in time with \(\omega =10\). The regularization parameter \(\alpha =0.36\). We threshold the estimated \(\varvec{{\tilde{K}}}\) with \(\epsilon = 0.01\) to remove noise (Fig. 9).

Appendix 2: Gowalla and Brightkite data sets

In this section, we describe the preprocessing procedure for the Gowalla and Brightkite data sets. We focus on local friendship subnetworks within different US cities, including San Diego (SD), Chicago (CHI), Los Angeles (LA) and San Francisco (SF). They have diverse network sizes and ST patterns within the same time period.

1.1 Brightkite-SD

We study check-ins in SD for the Brightkite data set. We use a bounding box (with a north latitude of 33.1142, a south latitude of 32.5348, an east longitude of \(-\,116.9058\), and a west longitude of \(-\,117.2824\); see Notes for the source of the coordinates) to locate check-ins in SD. We consider only “active” users, who have more than 300 check-ins during the period. This gives us a small subnetwork with 25 “active” users and a total of 13,760 check-ins in SD.
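The two filtering steps described here, a bounding box followed by an activity threshold, can be sketched as follows. This is an illustration under stated assumptions: check-ins are given as `(user_id, time, lat, lon)` tuples, and activity is counted on check-ins inside the box, which the text does not specify. The names `in_box` and `active_user_checkins` are hypothetical.

```python
from collections import Counter

# Bounding box for San Diego, taken from the text above.
SD_BOX = {"north": 33.1142, "south": 32.5348,
          "east": -116.9058, "west": -117.2824}

def in_box(lat, lon, box):
    """True if the point lies inside the latitude/longitude box."""
    return (box["south"] <= lat <= box["north"]
            and box["west"] <= lon <= box["east"])

def active_user_checkins(checkins, box, min_checkins):
    """Keep check-ins inside `box` made by users with MORE than
    `min_checkins` check-ins there (assumption: counted inside the box)."""
    local = [c for c in checkins if in_box(c[2], c[3], box)]
    counts = Counter(c[0] for c in local)
    active = {user for user, n in counts.items() if n > min_checkins}
    return [c for c in local if c[0] in active]

# Toy records: (user_id, time, lat, lon).
toy = [("a", 0, 32.7, -117.1), ("a", 1, 32.8, -117.2), ("a", 2, 32.9, -117.0),
       ("b", 3, 32.7, -117.1), ("b", 4, 40.0, -100.0)]
kept = active_user_checkins(toy, SD_BOX, min_checkins=2)
print(sorted({c[0] for c in kept}), len(kept))
```

For the subnetwork in the text the threshold would be `min_checkins=300`.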

1.2 Gowalla-CHI

We apply the same procedure as for Brightkite-SD to the Gowalla check-in data for CHI. The bounding box for CHI has a north latitude of 42.0229, a south latitude of 41.6446, an east longitude of \(-\,87.5245\) and a west longitude of \(-\,87.9395\). After selecting only active users (with more than 100 check-ins), we have a medium-sized subnetwork with 96 users and 27,326 check-ins.

1.3 Brightkite-LA

We apply the same procedure as for Brightkite-SD to the Brightkite check-in data in LA. The bounding box for LA has a north latitude of 34.34, a south latitude of 33.70, an east longitude of \(-\,118.16\) and a west longitude of \(-\,118.67\). After selecting only active users (with more than 150 check-ins), we have a medium-sized subnetwork with 168 users and 89,127 check-ins.

1.4 Gowalla-SF

We apply the same procedure as for Brightkite-SD to the Gowalla check-in data in SF. The bounding box for SF has a north latitude of 37.93, a south latitude of 37.64, an east longitude of \(-\,122.28\) and a west longitude of \(-\,123.17\). After selecting only active users (with more than 65 check-ins), we have a large subnetwork with 515 users and 102,673 check-ins.

Appendix 3: Assumptions for Theorem 1

There are two separate sets of general assumptions for the consistency of GMM and MLE in Hawkes processes. We only list assumptions that are relevant to our proof.

The first set of assumptions is from Ogata (1978) about the point process and intensity functions.

Assumption 1

(Consistency of MLE estimation)

The multivariate Hawkes process \((\varvec{N}_{t,x,y})\) is stationary, ergodic and absolutely continuous with respect to the standard Poisson process.
The conditional intensity function \(\lambda _{\Theta }\) with parameters \(\Theta \) is predictable for all compact metric spaces and continuous in \(\Theta \).
When \(t=0\), \(\lambda _{\Theta }\) is positive almost surely and \(\lambda _{\Theta _1}= \lambda _{\Theta _2}\) almost surely if and only if \(\Theta _1=\Theta _2\); for any \(\Theta \) from a compact metric space, there exists a neighborhood \(U(\Theta )\) of \(\Theta \) such that for all \(\Theta ' \in U(\Theta )\), \(|\lambda _{\Theta '}|\) and \(|\log \lambda _{\Theta '}|\) are bounded by random variables with finite second moments.
For any \(\Theta \) from a compact metric space, there is a neighborhood \(U(\Theta )\) of \(\Theta \) such that \(\sup _{\Theta ' \in U(\Theta )}|\lambda (\Theta ')-{\mathbb {E}}(\lambda (\Theta '))| \rightarrow 0\) in probability as \(t \rightarrow \infty \) and (for some \(\alpha >0\)) \(\sup _{\Theta ' \in U(\Theta )}|\log {\mathbb {E}}(\lambda (\Theta '))|\) has a finite \((2+\alpha )\)th moment that is uniformly bounded with respect to \(t\).

On top of Assumption 1, we also need GMM-related assumptions from Achab et al. (2017).

Assumption 2

(Consistency of GMM estimation)

For (25), the GMM approximation error \(L(\varvec{R})=0\) if and only if \(\varvec{R} = (\varvec{I}-\varvec{K}^{\rm T})^{-1}\).
For (22–24), the supports \({\tilde{X}}\), \({\tilde{Y}}\), \({\tilde{H}}\) of the triggering densities satisfy \({\tilde{X}}^2/X \rightarrow 0\), \({\tilde{Y}}^2/Y \rightarrow 0\) and \({\tilde{H}}^2/T \rightarrow 0\) separately as \(X,Y,T \rightarrow \infty \).
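The identity in Assumption 2 links the matrix \(\varvec{R}\) to the triggering matrix \(\varvec{K}\), so that \(\varvec{K}\) can be read off once \(\varvec{R}\) is known. A minimal numerical sketch of that relation follows, assuming a small hypothetical \(\varvec{K}\) with spectral radius below one (the usual stationarity condition). The function names `r_from_k` and `k_from_r` are illustrative and not taken from the paper, and the GMM objective (25) itself is not reproduced here.

```python
import numpy as np

def r_from_k(K):
    """Compute R = (I - K^T)^{-1}, requiring spectral radius of K below 1."""
    K = np.asarray(K, dtype=float)
    if np.max(np.abs(np.linalg.eigvals(K))) >= 1.0:
        raise ValueError("spectral radius of K must be below 1")
    return np.linalg.inv(np.eye(len(K)) - K.T)

def k_from_r(R):
    """Invert the relation: K = (I - R^{-1})^T."""
    R = np.asarray(R, dtype=float)
    return (np.eye(len(R)) - np.linalg.inv(R)).T

# Hypothetical 3 x 3 triggering matrix (not from the paper).
K = np.array([[0.10, 0.05, 0.00],
              [0.00, 0.10, 0.05],
              [0.05, 0.00, 0.10]])
R = r_from_k(K)
print(np.allclose(k_from_r(R), K))
```

When the spectral radius is below one, \(\varvec{R}\) also equals the Neumann series \(\sum _{n\ge 0}(\varvec{K}^{\rm T})^n\), which can be read as summing the contributions of all generations of triggered events.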

About this article

Cite this article

Yuan, B., Schoenberg, F.P. & Bertozzi, A.L. Fast estimation of multivariate spatiotemporal Hawkes processes and network reconstruction. Ann Inst Stat Math 73, 1127–1152 (2021). https://doi.org/10.1007/s10463-020-00780-1

Received: 30 March 2020
Revised: 26 October 2020
Accepted: 06 November 2020
Published: 01 January 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s10463-020-00780-1
