Scan Statistics for Normal Data with Outliers

  • Original Article
  • Methodology and Computing in Applied Probability

A Correction to this article was published on 15 May 2021

This article has been updated

Abstract

In this article we investigate the performance of scan statistics based on moving medians as test statistics for detecting a local change in the population mean of one- and two-dimensional normal data, in the presence of outliers, when the population variance is unknown. For fixed window scan statistics, both the training sample and parametric bootstrap methods are employed, for one- and two-dimensional normal data containing one or two outliers. Multiple window scan statistics are implemented via the parametric bootstrap method in the same settings. Numerical results obtained by simulation are presented to evaluate the power of these scan statistics for detecting the local change in the population mean, for selected parameters of the models characterizing the local change and of the models characterizing the occurrence of one or two outliers in the data. When the size of the window in which the local change in the population mean has occurred is unknown, the multiple window scan statistics, implemented via the bootstrap method, performed quite well.
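To make the idea concrete, the following is a minimal one-dimensional sketch of a fixed window scan statistic built from moving medians, with a critical value obtained by a parametric bootstrap. It is an illustration only, not the authors' definition: the abstract does not give the exact form of the statistic, so the function names, the centering by the overall median, the MAD-based scale estimate (with the usual normal consistency constant 1.4826), and the defaults `alpha`, `reps`, and `seed` are all assumptions made here.

```python
import random
import statistics


def robust_location_scale(x):
    """Median and rescaled MAD, used here as outlier-resistant stand-ins
    for the unknown mean and standard deviation (an assumption)."""
    center = statistics.median(x)
    mad = statistics.median(abs(v - center) for v in x)
    return center, 1.4826 * mad  # 1.4826 makes the MAD estimate sigma for normal data


def moving_median_scan(x, m):
    """Largest median over all windows of m consecutive observations,
    centered and scaled by the robust estimates above."""
    center, scale = robust_location_scale(x)
    medians = [statistics.median(x[i:i + m]) for i in range(len(x) - m + 1)]
    return (max(medians) - center) / scale


def bootstrap_critical_value(x, m, alpha=0.05, reps=500, seed=0):
    """Parametric bootstrap: simulate normal samples with the fitted
    location and scale, and take the empirical (1 - alpha) quantile
    of the scan statistic under this no-change model."""
    rng = random.Random(seed)
    mu_hat, sigma_hat = robust_location_scale(x)
    null_stats = sorted(
        moving_median_scan([rng.gauss(mu_hat, sigma_hat) for _ in x], m)
        for _ in range(reps)
    )
    return null_stats[min(reps - 1, int((1 - alpha) * reps))]
```

Under these assumptions, a local change would be flagged when `moving_median_scan(x, m)` exceeds `bootstrap_critical_value(x, m)`. Because each window median ignores up to roughly half of the window's values, a single extreme observation inside a window leaves the statistic largely unaffected, which is the motivation for using medians rather than moving sums.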


Change history

  • 21 November 2020

    Springer Nature’s version of this paper was updated to represent the correct Tables citations.

  • 15 May 2021

    A Correction to this paper has been published: https://doi.org/10.1007/s11009-021-09868-4

Acknowledgments

The authors would like to thank the Guest Editor and the referees for their valuable comments that improved the presentation of the results in this article.

Author information

Corresponding author

Correspondence to Qianzhu Wu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, Q., Glaz, J. Scan Statistics for Normal Data with Outliers. Methodol Comput Appl Probab 23, 429–458 (2021). https://doi.org/10.1007/s11009-020-09837-3

  • DOI: https://doi.org/10.1007/s11009-020-09837-3
