id: 06071815 dt: a an: 06071815 au: Paukkeri, Mari-Sanna; Väyrynen, Jaakko; Arppe, Antti ti: Exploring extensive linguistic feature sets in near-synonym lexical choice. so: Gelbukh, Alexander (ed.), Computational linguistics and intelligent text processing. 13th international conference, CICLing 2012, New Delhi, India, March 11‒17, 2012. Proceedings, Part II. Berlin: Springer (ISBN 978-3-642-28600-1/pbk). Lecture Notes in Computer Science 7182, 1-12 (2012). py: 2012 pu: Berlin: Springer la: EN cc: ut: near-synonym lexical choice; linguistic features ci: li: doi:10.1007/978-3-642-28601-8_1 ab: Summary: In the near-synonym lexical choice task, the best alternative out of a set of near-synonyms is selected to fill a lexical gap in a text. We experiment with an extensive set of over 650 linguistic features to represent the context of a word, and with a range of machine learning approaches to the lexical choice task. We extend previous work by experimenting with unsupervised and semi-supervised methods, and use automatic feature selection to cope with the problems arising from the rich feature set. It is natural to think that linguistic analysis of the word context would yield almost perfect performance in the task, but we show that too many features, even linguistic ones, introduce noise and make the task difficult for unsupervised and semi-supervised methods. We also show that purely syntactic features play the biggest role in the performance, but that certain semantic and morphological features are also needed. rv: