Comparing learning methods for classification. (English) Zbl 1096.62071

Summary: We address the consistency property of cross validation (CV) for classification. Sufficient conditions are obtained on the data splitting ratio to ensure that the better classifier between two candidates will be favored by CV with probability approaching 1. Interestingly, it turns out that for comparing two general learning methods, the ratio of the training sample size to the evaluation size does not have to approach 0 for consistency in selection, as is required for comparing parametric regression models [J. Shao, J. Am. Stat. Assoc. 88, No. 422, 486–494 (1993; Zbl 0773.62051)]. In fact, the ratio may be allowed to converge to infinity or any positive constant, depending on the situation. We also discuss confidence intervals and sequential instability in selection for comparing classifiers.
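The role of the splitting ratio can be illustrated with a minimal hold-out comparison. This is only a sketch under stated assumptions and not the procedure analyzed in the paper: the synthetic data, the two toy classifiers, and the names `holdout_errors`, `fit_majority`, and `fit_threshold` are all hypothetical choices made for illustration. The parameter `n_train` fixes how many points are used for training; the remaining points form the evaluation set, so `n_train / (len(data) - n_train)` is the ratio the summary refers to.

```python
import random


def fit_majority(train):
    """Toy classifier: always predict the most frequent training label."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def fit_threshold(train):
    """Toy classifier: cut a scalar feature midway between the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    if not xs0 or not xs1:
        # Only one class was seen, so fall back to the majority rule.
        return fit_majority(train)
    m0 = sum(xs0) / len(xs0)
    m1 = sum(xs1) / len(xs1)
    cut = (m0 + m1) / 2
    upper = 1 if m1 >= m0 else 0  # label predicted above the cut
    return lambda x: upper if x > cut else 1 - upper


def holdout_errors(data, fit_a, fit_b, n_train, rng):
    """Train both methods on n_train points, evaluate both on the rest."""
    if not 0 < n_train < len(data):
        raise ValueError("n_train must leave a non-empty evaluation set")
    idx = list(range(len(data)))
    rng.shuffle(idx)
    train = [data[i] for i in idx[:n_train]]
    test = [data[i] for i in idx[n_train:]]
    pred_a, pred_b = fit_a(train), fit_b(train)
    err_a = sum(pred_a(x) != y for x, y in test) / len(test)
    err_b = sum(pred_b(x) != y for x, y in test) / len(test)
    return err_a, err_b


# Assumed synthetic data: label y is 0 or 1, feature x ~ Normal(2 * y, 1).
rng = random.Random(0)
data = []
for _ in range(400):
    y = rng.randint(0, 1)
    data.append((rng.gauss(2.0 * y, 1.0), y))

# Equal split: training size / evaluation size = 1.
err_majority, err_threshold = holdout_errors(
    data, fit_majority, fit_threshold, n_train=200, rng=rng
)
selected = "threshold" if err_threshold < err_majority else "majority"
print(err_majority, err_threshold, selected)
```

Changing `n_train` while keeping `len(data)` fixed varies the splitting ratio, which makes it possible to see empirically how often the better of the two rules is selected at different ratios.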

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62J99 Linear inference, regression
68T05 Learning and adaptive systems in artificial intelligence
65C60 Computational problems in statistics (MSC2010)

Citations:

Zbl 0773.62051