<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<item>
  <id>05686735</id>
  <dt>a</dt>
  <an>05686735</an>
  <augroup>
    <au>Chan, Samuel W.K.</au>
    <au>Cheung, Lawrence Y.L.</au>
    <au>Chong, Mickey W.C.</au>
  </augroup>
  <ti>A machine learning parser using an unlexicalized distituent model.</ti>
  <so>Gelbukh, Alexander (ed.), Computational linguistics and intelligent text processing. 11th international conference, CICLing 2010, Ia{\c s}i, Romania, March 21--27, 2010. Proceedings. Berlin: Springer (ISBN 978-3-642-12115-9/pbk). Lecture Notes in Computer Science 6008, 121-136 (2010).</so>
  <py>2010</py>
  <pu>Berlin: Springer</pu>
  <lagroup>
    <la>EN</la>
  </lagroup>
  <ccgroup>
  </ccgroup>
  <utgroup>
    <ut>parsing</ut>
    <ut>distituency</ut>
    <ut>unlexicalized model</ut>
    <ut>machine learning</ut>
  </utgroup>
  <cigroup>
  </cigroup>
  <ligroup>
    <li>doi:10.1007/978-3-642-12116-6_11</li>
  </ligroup>
  <abgroup>
    <ab>Summary: Despite the popularity of lexicalized parsing models, practical concerns such as data sparseness and applicability to domains with different vocabularies mean that unlexicalized models, which do not refer to the word tokens themselves, deserve more attention. A classifier-based parser using an unlexicalized parsing model has been developed. A machine learning method integrates linguistic attributes and information-theoretic attributes in two tasks, namely sentence chunking and phrase recognition. Most importantly, to enhance the accuracy of these tasks, we investigated the notion of distituency (the possibility that two parts of speech cannot remain in the same constituent or phrase) and incorporated it as attributes using various statistical measures. The parser was applied to English and Chinese sentences in the Penn Treebank and the Tsinghua Chinese Treebank. It achieved an $F$-score of 80.3\% for English and 82.4\% for Chinese.</ab>
    <rv></rv>
  </abgroup>
</item>