id: 06080265 dt: a an: 06080265 au: Ferneda, Edilson; do Prado, Hércules Antonio; Batista, Augusto Herrmann; Pinheiro, Marcello Sandi ti: Extracting definitions from Brazilian legal texts. so: Murgante, Beniamino (ed.) et al., Computational science and its applications ‒ ICCSA 2012. 12th international conference, Salvador de Bahia, Brazil, June 18‒21, 2012. Proceedings, Part III. Berlin: Springer (ISBN 978-3-642-31136-9/pbk). Lecture Notes in Computer Science 7335, 631-646 (2012). py: 2012 pu: Berlin: Springer la: EN cc: ut: information extraction; definition extraction; natural language processing ci: li: doi:10.1007/978-3-642-31137-6_48 ab: Summary: In order to avoid ambiguity and to ensure, as far as possible, a strict interpretation of law, legal texts usually define the specific lexical terms used within their discourse by means of normative rules. With an often large number of rules in effect in a given domain, extracting these definitions manually would be a costly undertaking. This paper presents an approach to cope with this problem based on a variation of an automated technique of natural language processing of Brazilian Portuguese texts. For the sake of generality, the proposed solution was developed to address the more general problem of building a glossary from domain-specific texts that contain definitions amongst their content. This solution was applied to a corpus of texts on the telecommunications regulations domain and the results are reported. The usual pipeline of natural language processing has been followed: preprocessing, segmentation, and part-of-speech tagging. A set of feature extraction functions is specified and used, along with reference glossary information on whether or not a text fragment is a definition, to train an SVM classifier. Finally, the definitions are extracted from the texts and evaluated on a testing corpus, which also contains the reference glossary annotations on definitions. The results are then discussed in light of other definition extraction techniques. rv: