id: 06059547 dt: a an: 06059547 au: Wagner, Hubert; Dłotko, Paweł; Mrozek, Marian ti: Computational topology in text mining. so: Ferri, Massimo (ed.) et al., Computational topology in image context. 4th international workshop, CTIC 2012, Bertinoro, Italy, May 28‒30, 2012. Proceedings. Berlin: Springer (ISBN 978-3-642-30237-4/pbk). Lecture Notes in Computer Science 7309, 68-78 (2012). py: 2012 pu: Berlin: Springer la: EN cc: ut: computational topology; computational homology; flag complex; discrete Morse theory; text mining; vector space model ci: li: doi:10.1007/978-3-642-30238-1_8 ab: Summary: In this paper, we present our ongoing research on applying computational topology to the analysis of the structure of similarities within a collection of text documents. Our work lies at the boundary between text mining and computational topology, and we describe techniques from each of these disciplines. We transform text documents into the so-called vector space model, which is often used in text mining. This representation is suitable for topological computations. We compute homology, using discrete Morse theory, and persistent homology of the flag complex built from the point cloud representing the input data. Since the space is high-dimensional, many difficulties appear. We describe how we tackle these problems and point out which challenges remain to be solved. rv: