Abstract
An agreement table with n ≥ 3 ordered categories can be collapsed into n − 1 distinct 2×2 tables by combining adjacent categories. Vanbelle and Albert (Stat. Methodol. 6:157–163, 2009c) showed that the components of Cohen's weighted kappa with linear weights can be obtained from these n − 1 collapsed 2×2 tables. In this paper we consider several consequences of this result. One is that the weighted kappa with linear weights can be interpreted as a weighted arithmetic mean of the kappas corresponding to the 2×2 tables, where the weights are the denominators of the 2×2 kappas. In addition, it is shown that similar results and interpretations hold for linearly weighted kappas for multiple raters.
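The identity stated in the abstract is easy to check numerically. The following is a minimal sketch, not code from the paper: it computes the linearly weighted kappa for an n×n table of relative frequencies, then Cohen's kappa and its denominator for each of the n − 1 collapsed 2×2 tables, and verifies that the denominator-weighted mean of the 2×2 kappas reproduces the weighted kappa. The 4×4 table used here is hypothetical illustrative data.

import numpy as np

def linear_weighted_kappa(p):
    """Cohen's weighted kappa with linear weights for an n x n table
    of relative frequencies p (rows: rater A, columns: rater B)."""
    n = p.shape[0]
    i, j = np.indices((n, n))
    w = 1.0 - np.abs(i - j) / (n - 1)            # linear agreement weights
    row, col = p.sum(axis=1), p.sum(axis=0)      # marginal proportions
    po = np.sum(w * p)                           # weighted observed agreement
    pe = np.sum(w * np.outer(row, col))          # weighted chance agreement
    return (po - pe) / (1.0 - pe)

def collapsed_2x2_kappas(p):
    """Cohen's kappa and its denominator for each of the n - 1 2x2 tables
    obtained by dichotomizing at the cutpoints between adjacent categories."""
    n = p.shape[0]
    kappas, denoms = [], []
    for k in range(1, n):                        # cut between categories k-1 and k
        a = p[:k, :k].sum()                      # both raters in the 'low' group
        b = p[:k, k:].sum()
        c = p[k:, :k].sum()
        d = p[k:, k:].sum()                      # both raters in the 'high' group
        o = a + d                                # observed agreement
        e = (a + b) * (a + c) + (c + d) * (b + d)  # chance agreement
        kappas.append((o - e) / (1.0 - e))
        denoms.append(1.0 - e)
    return np.array(kappas), np.array(denoms)

# Hypothetical 4x4 table of relative frequencies (sums to 1)
p = np.array([[0.15, 0.05, 0.02, 0.00],
              [0.04, 0.20, 0.05, 0.01],
              [0.01, 0.05, 0.18, 0.04],
              [0.00, 0.02, 0.03, 0.15]])

kw = linear_weighted_kappa(p)
kappas, denoms = collapsed_2x2_kappas(p)
print(kw)                                        # linearly weighted kappa
print(np.average(kappas, weights=denoms))        # identical weighted mean

The equality holds because a cell whose ratings differ by d categories is split by exactly d of the n − 1 cutpoints, so the numerator and denominator of the linearly weighted kappa are, up to the common factor 1/(n − 1), the sums of the numerators and denominators of the 2×2 kappas.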
References
Agresti, A. (1990). Categorical data analysis. New York: Wiley.
Artstein, R., & Poesio, M. (2005). NLE technical note: Vol. 05-1. Kappa³ = alpha (or beta). Colchester: University of Essex.
Berry, K.J., & Mielke, P.W. (1988). A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educational and Psychological Measurement, 48, 921–933.
Brennan, R.L., & Prediger, D.J. (1981). Coefficient kappa: some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
Brenner, H., & Kliebsch, U. (1996). Dependence of weighted kappa coefficients on the number of categories. Epidemiology, 7, 199–202.
Cicchetti, D., & Allison, T. (1971). A new procedure for assessing reliability of scoring EEG sleep recordings. The American Journal of EEG Technology, 11, 101–109.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220.
Conger, A.J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322–328.
Davies, M., & Fleiss, J.L. (1982). Measuring agreement for multinomial data. Biometrics, 38, 1047–1051.
Fleiss, J.L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.
Fleiss, J.L. (1981). Statistical methods for rates and proportions. New York: Wiley.
Fleiss, J.L., & Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33, 613–619.
Fleiss, J.L., Cohen, J., & Everitt, B.S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323–327.
Heuvelmans, A.P.J.M., & Sanders, P.F. (1993). Beoordelaarsovereenstemming [Interrater agreement]. In Eggen, T.J.H.M., & Sanders, P.F. (Eds.), Psychometrie in de Praktijk [Psychometrics in practice] (pp. 443–470). Arnhem: Cito Instituut voor Toetsontwikkeling.
Holmquist, N.S., McMahon, C.A., & Williams, E.O. (1968). Variability in classification of carcinoma in situ of the uterine cervix. Obstetrical & Gynecological Survey, 23, 580–585.
Hsu, L.M., & Field, R. (2003). Interrater agreement measures: comments on kappaₙ, Cohen's kappa, Scott's π and Aickin's α. Understanding Statistics, 2, 205–219.
Hubert, L. (1977). Kappa revisited. Psychological Bulletin, 84, 289–297.
Jakobsson, U., & Westergren, A. (2005). Statistical methods for assessing agreement for ordinal data. Scandinavian Journal of Caring Sciences, 19, 427–431.
Janson, H., & Olsson, U. (2001). A measure of agreement for interval or nominal multivariate observations. Educational and Psychological Measurement, 61, 277–289.
Kraemer, H.C. (1979). Ramifications of a population model for κ as a coefficient of reliability. Psychometrika, 44, 461–472.
Kraemer, H.C., Periyakoil, V.S., & Noda, A. (2002). Tutorial in biostatistics: kappa coefficients in medical research. Statistics in Medicine, 21, 2109–2129.
Krippendorff, K. (2004). Reliability in content analysis: some common misconceptions and recommendations. Human Communication Research, 30, 411–433.
Kundel, H.L., & Polansky, M. (2003). Measurement of observer agreement. Radiology, 228, 303–308.
Landis, J.R., & Koch, G.G. (1977). An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics, 33, 363–374.
Mielke, P.W., & Berry, K.J. (2009). A note on Cohen’s weighted kappa coefficient of agreement with linear weights. Statistical Methodology, 6, 439–446.
Mielke, P.W., Berry, K.J., & Johnston, J.E. (2007). The exact variance of weighted kappa with multiple raters. Psychological Reports, 101, 655–660.
Mielke, P.W., Berry, K.J., & Johnston, J.E. (2008). Resampling probability values for weighted kappa with multiple raters. Psychological Reports, 102, 606–613.
Nelson, J.C., & Pepe, M.S. (2000). Statistical description of interrater variability in ordinal ratings. Statistical Methods in Medical Research, 9, 475–496.
Popping, R. (1983). Overeenstemmingsmaten voor Nominale Data [Agreement measures for nominal data]. Unpublished doctoral dissertation, Rijksuniversiteit Groningen, Groningen.
Popping, R. (2010). Some views on agreement to be used in content analysis studies. Quality & Quantity, 44, 1067–1078.
Schouten, H.J.A. (1986). Nominal scale agreement among observers. Psychometrika, 51, 453–466.
Schuster, C. (2004). A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educational and Psychological Measurement, 64, 243–253.
Scott, W.A. (1955). Reliability of content analysis: the case of nominal scale coding. Public Opinion Quarterly, 19, 321–325.
Vanbelle, S., & Albert, A. (2009a). Agreement between two independent groups of raters. Psychometrika, 74, 477–491.
Vanbelle, S., & Albert, A. (2009b). Agreement between an isolated rater and a group of raters. Statistica Neerlandica, 63, 82–100.
Vanbelle, S., & Albert, A. (2009c). A note on the linearly weighted kappa coefficient for ordinal scales. Statistical Methodology, 6, 157–163.
Visser, H., & de Nijs, T. (2006). The map comparison kit. Environmental Modelling & Software, 21, 346–358.
Warrens, M.J. (2008a). On similarity coefficients for 2×2 tables and correction for chance. Psychometrika, 73, 487–502.
Warrens, M.J. (2008b). On the equivalence of Cohen’s kappa and the Hubert–Arabie adjusted Rand index. Journal of Classification, 25, 177–183.
Warrens, M.J. (2009). k-adic similarity coefficients for binary (presence/absence) data. Journal of Classification, 26, 227–245.
Warrens, M.J. (2010a). Inequalities between kappa and kappa-like statistics for k×k tables. Psychometrika, 75, 176–185.
Warrens, M.J. (2010b). Cohen’s kappa can always be increased and decreased by combining categories. Statistical Methodology, 7, 673–677.
Warrens, M.J. (2010c). A Kraemer-type rescaling that transforms the odds ratio into the weighted kappa coefficient. Psychometrika, 75, 328–330.
Warrens, M.J. (2010d). A formal proof of a paradox associated with Cohen’s kappa. Journal of Classification, 27, 322–332.
Warrens, M.J. (2010e). Inequalities between multi-rater kappas. Advances in Data Analysis and Classification, 4, 271–286.
Warrens, M.J. (2011). Weighted kappa is higher than Cohen's kappa for tridiagonal agreement tables. Statistical Methodology, 8, 268–272.
Zwick, R. (1988). Another look at interrater agreement. Psychological Bulletin, 103, 374–378.
Cite this article
Warrens, M.J. Cohen’s Linearly Weighted Kappa is a Weighted Average of 2×2 Kappas. Psychometrika 76, 471–486 (2011). https://doi.org/10.1007/s11336-011-9210-z