Abstract
Beyond the challenge of keeping up to date with current best practices regarding the diagnosis and treatment of outliers, an additional difficulty arises concerning the mathematical implementation of the recommended methods. Here, we provide an overview of current recommendations and best practices and demonstrate how they can easily and conveniently be implemented in the R statistical computing software, using the {performance} package of the easystats ecosystem. We cover univariate, multivariate, and model-based statistical outlier detection methods, their recommended thresholds, standard output, and plotting methods. We conclude by reviewing the different theoretical types of outliers, whether to exclude or winsorize them, and the importance of transparency. A preprint of this paper is available at: https://doi.org/10.31234/osf.io/bu6nt.
Data availability
This paper first appeared as a preprint (https://doi.org/10.31234/osf.io/bu6nt) and is also available as an online vignette at: https://easystats.github.io/performance/articles/check_outliers. All data used in this paper are included with base R.
Code availability
The performance package is available on the package's official website (https://easystats.github.io/performance), on CRAN (https://cran.r-project.org/package=performance), and on the R-Universe (https://easystats.r-universe.dev/performance). The source code is available on GitHub (https://github.com/easystats/performance/), and the package can be installed from CRAN with install.packages("performance"). The code to reproduce the figures and all analyses in this paper is available at https://osf.io/eqja6/.
Notes
Note that check_outliers() only checks numeric variables.
3.29 is an approximation of the two-tailed critical value for p < .001, obtained through qnorm(p = 1 - 0.001 / 2). We chose this threshold for consistency with the thresholds of all our other methods.
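As a minimal sketch of how such a threshold could be applied, the base R code below computes the critical value and compares it with robust z-scores. The helper flag_robust_z() is hypothetical and is not part of {performance}; the median/MAD scaling and the artificial extreme value appended to mtcars$mpg are assumptions made for illustration only.

```r
# Two-tailed critical value of the standard normal distribution for p < .001
threshold <- stats::qnorm(p = 1 - 0.001 / 2)
round(threshold, 2)  # 3.29

# Hypothetical helper (not part of {performance}): flag values whose robust
# z-score, based on the median and the MAD, exceeds the threshold in
# absolute value
flag_robust_z <- function(x, threshold) {
  z <- (x - stats::median(x)) / stats::mad(x)
  abs(z) > threshold
}

# Base R data plus one artificial extreme value (assumption for illustration)
x <- c(mtcars$mpg, 100)

# Positions of the flagged observations
flagged <- which(flag_robust_z(x, threshold))
flagged
```

In this sketch, only the artificial value added at the end of the vector lies beyond the threshold.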
Our default threshold for the MCD method is defined by stats::qchisq(p = 1 - 0.001, df = ncol(x)), which again is an approximation of the critical value for p < .001 consistent with the thresholds of our other methods.
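The following sketch shows how a chi-square threshold of this form might be compared with squared distances. To stay within base R, it uses classical Mahalanobis distances rather than the robust MCD estimates, so it illustrates the thresholding step only; the choice of columns from mtcars is also an assumption.

```r
# Hypothetical example data: three numeric columns from a base R dataset
x <- mtcars[, c("mpg", "hp", "wt")]

# Chi-square critical value for p < .001, with one degree of freedom per column
threshold <- stats::qchisq(p = 1 - 0.001, df = ncol(x))
threshold

# Classical squared Mahalanobis distances (assumption: used here only to keep
# the sketch in base R; the MCD method relies on robust estimates of location
# and scatter instead of the sample mean and covariance)
d2 <- stats::mahalanobis(x, center = colMeans(x), cov = stats::cov(x))

# Rows whose squared distance exceeds the threshold
flagged <- which(d2 > threshold)
flagged
```

Because the degrees of freedom equal the number of columns, the threshold grows as more variables are included.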
Our default threshold for the Cook method is defined by stats::qf(0.5, ncol(x), nrow(x) - ncol(x)). Unlike the thresholds of our other methods, this is not a critical value for p < .001: the value 0.5 represents the median of the implied F distribution for D, which allows us to flag D values that are “above average”.
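A minimal base R sketch of the threshold in this note is given below. The regression model is a hypothetical example, and treating x as the model frame (outcome plus predictors) is an assumption about how the columns are counted, not a statement of the package's internals.

```r
# Hypothetical example model fitted on base R data
model <- stats::lm(mpg ~ wt + hp, data = mtcars)

# Assumption: x is the data frame of variables used to fit the model
x <- stats::model.frame(model)

# Quantile of the F distribution at probability 0.5
threshold <- stats::qf(0.5, ncol(x), nrow(x) - ncol(x))

# Half of the F distribution lies below the threshold, i.e., it is the median
stats::pf(threshold, ncol(x), nrow(x) - ncol(x))

# Cook's distance for each observation, compared with the threshold
d <- stats::cooks.distance(model)
flagged <- which(d > threshold)
flagged
```

The call to stats::pf() confirms that the probability 0.5 passed to stats::qf() yields the median of that F distribution.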
References
Aguinis, H., Gottfredson, R. K., & Joo, H. (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270–301. https://doi.org/10.1177/1094428112470848
Anders, R., Alario, F., Van Maanen, L., et al. (2016). The shifted Wald distribution for response time data analysis. Psychological Methods, 21(3), 309. https://doi.org/10.1037/met0000066
Aruguete, M. S., Huynh, H., Browne, B. L., Jurs, B., Flint, E., & McCutcheon, L. E. (2019). How serious is the ‘carelessness’ problem on Mechanical Turk? International Journal of Social Research Methodology, 22(5), 441–449. https://doi.org/10.1080/13645579.2018.1563966
Brown, S. D., & Heathcote, A. (2008). The simplest complete model of choice response time: Linear ballistic accumulation. Cognitive Psychology, 57(3), 153–178. https://doi.org/10.1016/j.cogpsych.2007.12.002
Cao, N., Lin, Y. R., Gotz, D., & Du, F. (2018). Z-Glyph: Visualizing outliers in multivariate data. Information Visualization, 17(1), 22–40. https://doi.org/10.1177/1473871616686635
Chaloner, K., & Brant, R. (1988). A Bayesian approach to outlier detection and residual analysis. Biometrika, 75(4), 651–659. https://doi.org/10.1093/biomet/75.4.651
Ciccione, L., Dehaene, G., & Dehaene, S. (2023). Outlier detection and rejection in scatterplots: Do outliers influence intuitive statistical judgments? Journal of Experimental Psychology: Human Perception and Performance, 49(1), 129–144. https://doi.org/10.1037/xhp0001065
Cook, R. D. (1977). Detection of influential observation in linear regression. Technometrics, 19(1), 15–18. https://doi.org/10.1080/00401706.1977.10489493
Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4–19. https://doi.org/10.1016/j.jesp.2015.07.006
Gnanadesikan, R., & Kettenring, J. R. (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics, 28(1), 81–124. https://doi.org/10.2307/2528963
Goldammer, P., Annen, H., Stöckli, P. L., & Jonas, K. (2020). Careless responding in questionnaire measures: Detection, impact, and remedies. The Leadership Quarterly, 31(4), 101384. https://doi.org/10.1016/j.leaqua.2020.101384
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013
Leys, C., Klein, O., Dominicy, Y., & Ley, C. (2018). Detecting multivariate outliers: Use a robust variant of the Mahalanobis distance. Journal of Experimental Social Psychology, 74, 150–156. https://doi.org/10.1016/j.jesp.2017.09.011
Leys, C., Delacre, M., Mora, Y. L., Lakens, D., & Ley, C. (2019). How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration. International Review of Social Psychology. https://doi.org/10.5334/irsp.289
Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
Lüdecke, D., Makowski, D., Ben-Shachar, M. S., Patil, I., Wiernik, B. M., Bacher, E., & Thériault, R. (2023). easystats: Streamline model interpretation, visualization, and reporting. R package version 0.7.0. Retrieved February 26, 2024, from https://easystats.github.io/easystats/
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and stan. CRC Press.
McNeil, D. R. (1977). Interactive Data Analysis: A Practical Primer. Wiley.
Miller, J. (2023). Outlier exclusion procedures for reaction time analysis: The cures are generally worse than the disease. Journal of Experimental Psychology: General. https://doi.org/10.1037/xge0001450
Patil, I., Makowski, D., Ben-Shachar, M. S., Wiernik, B. M., Bacher, E., & Lüdecke, D. (2022). datawizard: An R package for easy data preparation and statistical transformations. Journal of Open Source Software, 7(78), 4684. https://doi.org/10.21105/joss.04684
Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological Bulletin, 114(3), 510. https://doi.org/10.1037/0033-2909.114.3.510
Ratcliff, R., Smith, P. L., Brown, S. D., & McKoon, G. (2016). Diffusion decision model: Current issues and history. Trends in Cognitive Sciences, 20(4), 260–281. https://doi.org/10.1016/j.tics.2016.01.007
Rouder, J. N., Province, J. M., Morey, R. D., Gomez, P., & Heathcote, A. (2015). The lognormal race: A cognitive-process model of choice and latency with desirable psychometric properties. Psychometrika, 80, 491–513. https://doi.org/10.1007/s11336-013-9396-3
Schramm, P., & Rouder, J. N. (2019). Are reaction time transformations really beneficial? PsyArXiv. https://doi.org/10.31234/osf.io/9ksa6
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
Smiti, A. (2020). A critical overview of outlier detection methods. Computer Science Review, 38, 100306. https://doi.org/10.1016/j.cosrev.2020.100306
Tukey, J. W., & McLaughlin, D. H. (1963). Less vulnerable confidence and significance procedures for location based on a single sample: Trimming/winsorization 1. Sankhyā: The Indian Journal of Statistics, Series A, 331–352.
Van Zandt, T., & Ratcliff, R. (1995). Statistical mimicking of reaction time data: Single-process models, parameter variability, and mixtures. Psychonomic Bulletin & Review, 2(1), 20–54. https://doi.org/10.3758/BF03214411
Ward, M. K., & Meade, A. W. (2023). Dealing with careless responding in survey data: Prevention, identification, and recommended best practices. Annual Review of Psychology, 74(1), 577–596. https://doi.org/10.1146/annurev-psych-040422-045007
Yentes, R. D., & Wilhelm, F. (2023). careless: Procedures for computing indices of careless responding. R package version 1.2.2. Retrieved February 26, 2024, from https://cran.r-project.org/package=careless
Zijlstra, W. P., van der Ark, L. A., & Sijtsma, K. (2011). Outliers in questionnaire data: Can they be detected and should they be removed? Journal of Educational and Behavioral Statistics, 36(2), 186–212. https://doi.org/10.3102/1076998610366263
Acknowledgements
{performance} is part of the collaborative easystats ecosystem (Lüdecke et al., 2023). Thus, we thank all members of easystats, contributors, and users alike.
Funding
This research received no external funding.
Author information
Contributions
Writing - Original draft preparation: RT. Writing - Reviewing and Editing, Software: RT, MSB-S, IP, DL, BMW, and DM.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
About this article
Cite this article
Thériault, R., Ben-Shachar, M.S., Patil, I. et al. Check your outliers! An introduction to identifying statistical outliers in R with easystats. Behav Res 56, 4162–4172 (2024). https://doi.org/10.3758/s13428-024-02356-w
DOI: https://doi.org/10.3758/s13428-024-02356-w