Title: Topological analysis of text data.
Lecturer: Hubert WAGNER
Abstract: This talk describes an ongoing effort to apply persistent homology to text data mining. Persistent homology is the main tool of topological data analysis. In essence, it allows one to robustly describe the shape of a data set and to compare the shapes of different data sets.
First, persistent homology will be explained, emphasizing its intuitive side.
Then, it will be demonstrated how persistent homology can be applied to the analysis of sets of text documents. Under the vector space model, each document becomes a point in a high-dimensional space, and it is natural to ask about the shape of the resulting point cloud. It will be discussed how this information can be used for knowledge discovery. Finally, the algorithmic aspect will be emphasized, which is crucial if industrial applications are to be tackled.
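As a rough illustration of the vector space model mentioned above, the following minimal sketch turns a few toy documents into term-frequency vectors and computes their pairwise cosine distances. The documents, the whitespace tokenization, and the helper names term_vector and cosine_distance are assumptions made for this example only; the talk does not specify its weighting scheme or distance. A distance matrix of this kind is a typical input for persistent homology software, but that step is not shown here.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a document to a sparse term-frequency vector (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # Treat an empty document as maximally distant from everything.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Toy corpus, invented for illustration only.
docs = [
    "persistent homology describes shape",
    "persistent homology compares shape",
    "cooking pasta tonight",
]

# Each document becomes a point; the matrix holds all pairwise distances.
vectors = [term_vector(d) for d in docs]
distances = [[cosine_distance(u, v) for v in vectors] for u in vectors]

for row in distances:
    print(" ".join(f"{d:.2f}" for d in row))
```

In this toy corpus the first two documents share most of their terms and end up close together, while the third shares none and sits at the maximum distance from both.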
Biography: Hubert Wagner is a computer scientist, currently working as a postdoc in the Edelsbrunner Group at the Institute of Science and Technology Austria (IST Austria). Having worked as a software engineer, he moved towards science and obtained a PhD degree in 2014 from the Jagiellonian University in Krakow, Poland. Hubert is interested in applications of computational geometry and topology and in the related algorithmic questions. He is convinced that tools such as persistent homology may offer novel and robust solutions to many problems he encountered as an engineer, for example in text mining. This line of his research was supported by a Google Research Grant from 2011 to 2012 (with Prof. Marian Mrozek and Dr. Paweł Dłotko) and now continues within the Topological Complex Systems (TOPOSYS) grant. Efficient algorithms and their implementations are an important part of his work.
More Information: https://publist.ist.ac.at/ist/people/180-Hubert_Wagner/works