
Cohen’s kappa is a weighted average. (English) Zbl 1248.62227

Summary: The \(\kappa\) coefficient is a popular descriptive statistic for summarizing an agreement table. It is sometimes desirable to combine some of the categories, for example when categories are easily confused, and then calculate \(\kappa\) for the collapsed table. Since the categories of an agreement table are nominal and the order in which the categories are listed is irrelevant, combining categories of an agreement table is identical to partitioning the categories into subsets. We prove that, given a partition type of the categories, the overall \(\kappa\)-value of the original table is a weighted average of the \(\kappa\)-values of the collapsed tables corresponding to all partitions of that type, where the weights are the denominators of the kappas of the subtables. An immediate consequence is that Cohen’s \(\kappa\) can be interpreted as a weighted average of the \(\kappa\)-values of the agreement tables corresponding to all non-trivial partitions. The \(\kappa\)-value of the \(2\times 2\) table obtained by combining all categories other than the one of current interest into a single “all others” category reflects the reliability of that individual category. Since the overall \(\kappa\)-value is a weighted average of these \(2\times 2\) \(\kappa\)-values, the category reliability indicates how each category contributes to the overall \(\kappa\)-value. It would be good practice to report both the overall \(\kappa\)-value and the category reliabilities of an agreement table.
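
The singleton-versus-rest decomposition described in the summary can be checked numerically. The following Python sketch (the 3×3 table and all function names are illustrative, not taken from the paper) computes the overall Cohen’s \(\kappa\), the category reliabilities obtained from the “category \(i\) versus all others” \(2\times 2\) tables, and the weighted average of these \(2\times 2\) kappas with their denominators \(1 - p_e\) as weights:

```python
import numpy as np

def cohen_kappa(table):
    """Return Cohen's kappa and its denominator 1 - p_e for a square table of counts."""
    p = table / table.sum()             # relative frequencies
    po = np.trace(p)                    # observed agreement
    pe = p.sum(axis=1) @ p.sum(axis=0)  # chance-expected agreement
    return (po - pe) / (1 - pe), 1 - pe

def collapse_to_2x2(table, i):
    """Collapse every category other than i into a single 'all others' category."""
    others = [j for j in range(table.shape[0]) if j != i]
    return np.array([
        [table[i, i],             table[i, others].sum()],
        [table[others, i].sum(),  table[np.ix_(others, others)].sum()],
    ])

# Hypothetical 3x3 agreement table of counts (for illustration only).
T = np.array([[25.0, 3.0, 2.0],
              [4.0, 18.0, 5.0],
              [1.0, 6.0, 16.0]])

overall, _ = cohen_kappa(T)

# Category reliabilities: kappa of each "category i versus all others" 2x2 table,
# with the denominator 1 - p_e of that collapsed table as its weight.
kappas, weights = zip(*(cohen_kappa(collapse_to_2x2(T, i)) for i in range(T.shape[0])))

print("overall kappa    :", round(overall, 4))
print("category kappas  :", [round(k, 4) for k in kappas])
print("weighted average :", round(np.dot(weights, kappas) / np.sum(weights), 4))
# The weighted average of the 2x2 kappas reproduces the overall kappa exactly.
```

For this particular partition type the identity can also be verified directly: each collapsed table has numerator \(2(p_{ii} - p_{i+}p_{+i})\) and denominator \(p_{i+} + p_{+i} - 2p_{i+}p_{+i}\), and summing over the categories gives twice the numerator and twice the denominator of the overall \(\kappa\).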

MSC:

62P15 Applications of statistics to psychology
62H17 Contingency tables
