A selective sampling strategy for label ranking. (English)
Fürnkranz, Johannes (ed.) et al., Machine learning: ECML 2006. 17th European conference on machine learning, Berlin, Germany, September 18‒22, 2006. Proceedings. Berlin: Springer (ISBN 978-3-540-45375-8/pbk). Lecture Notes in Computer Science 4212. Lecture Notes in Artificial Intelligence, 18-29 (2006).
Summary: We propose a novel active learning strategy, based on the compression framework of [9], for label ranking functions, which predict a total order over a predefined set of alternatives given an input instance. Our approach is theoretically motivated by an extension of Kääriäinen’s generalization bounds using unlabeled data [7], initially developed for classification, to ranking and active learning. The bounds we obtain suggest a selective sampling strategy, provided that a sufficiently, yet reasonably, large initial labeled dataset is available. Experiments on information retrieval corpora from automatic text summarization and question answering show that the proposed approach substantially reduces the labeling effort compared to random and heuristic-based sampling strategies.