Ugo Scaiella is an author.


Classification of short texts by deploying topical annotations Lecture Notes in Computer Science English 2012 We propose a novel approach to the classification of short texts based on two factors: the use of recently introduced Wikipedia-based annotators that detect the main topics present in an input text and represent them via Wikipedia pages, and the design of a novel classification algorithm that measures the similarity between the input text and each output category by deploying only their annotated topics and the Wikipedia link-structure. Our approach waives the common practice of expanding the feature-space with new dimensions derived either from explicit or from latent semantic analysis. As a consequence it is simple and maintains a compact, intelligible representation of the output categories. Our experiments show that it is efficient in construction and query time, as accurate as state-of-the-art classifiers (see e.g. Phan et al. WWW '08), and robust with respect to concept drift and input sources. 0 0
First steps beyond the bag-of-words representation of short texts CEUR Workshop Proceedings English 2011 We address the problem of enhancing the classical bag-of-words representation of texts by designing and engineering Tagme, the first system that performs an accurate and on-the-fly semantic annotation of short texts using Wikipedia as its knowledge base. Several experiments show that Tagme outperforms state-of-the-art algorithms when they are adapted to work on short texts, and that it is fast and competitive on long ones. This leads us to argue in favor of applying Tagme to clustering, classification and retrieval systems in challenging scenarios like web-snippets, tweets, news, ads, etc. 0 0
