Andreas Stafylopatis

Andreas Stafylopatis is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
Semantic question answering using Wikipedia categories clustering Question answering
Semantic clustering
Wikipedia category vector representation
International Journal on Artificial Intelligence Tools English 2014 We describe a system that performs semantic Question Answering based on the combination of classic Information Retrieval methods with semantic ones. First, we use a search engine to gather web pages and then apply a noun phrase extractor to extract all the candidate answer entities from them. Candidate entities are ranked using a linear combination of two IR measures to pick the most relevant ones. For each one of the top ranked candidate entities we find the corresponding Wikipedia page. We then propose a novel way to exploit Semantic Information contained in the structure of Wikipedia. A vector is built for every entity from Wikipedia category names by splitting and lemmatizing the words that form them. These vectors maintain Semantic Information in the sense that we are given the ability to measure semantic closeness between the entities. Based on this, we apply an intelligent clustering method to the candidate entities and show that candidate entities in the biggest cluster are the most semantically related to the ideal answers to the query. Results on the topics of the TREC 2009 Related Entity Finding task dataset show promising performance. 0 0
Related entity finding using semantic clustering based on wikipedia categories Related Entity Finding
Semantic clustering
Wikipedia category vector representation
Lecture Notes in Computer Science English 2013 We present a system that performs Related Entity Finding, that is, Question Answering that exploits Semantic Information from the WWW and returns URIs as answers. Our system uses a search engine to gather all candidate answer entities and then a linear combination of Information Retrieval measures to choose the most relevant. For each one we look up its Wikipedia page and construct a novel vector representation based on the tokenization of the Wikipedia category names. This novel representation gives our system the ability to compute a measure of semantic relatedness between entities, even if the entities do not share any common category. We use this property to perform a semantic clustering of the candidate entities and show that the biggest cluster contains entities that are closely related semantically and can be considered as answers to the query. Performance measured on 20 topics from the 2009 TREC Related Entity Finding task shows competitive results. 0 0
DoSO: A document self-organizer Document clustering
Document representation
SOM
Wikipedia
Journal of Intelligent Information Systems English 2012 In this paper, we propose a Document Self Organizer (DoSO), an extension of the classic Self Organizing Map (SOM) model, in order to deal more efficiently with a document clustering task. Starting from a document representation model, based on important "concepts" exploiting Wikipedia knowledge, that we have previously developed in order to overcome some of the shortcomings of the Bag-of-Words (BOW) model, we demonstrate how SOM's performance can be boosted by using the most important concepts of the document collection to explicitly initialize the neurons. We also show how a hierarchical approach can be utilized in the SOM model and how this can lead to a more comprehensive final clustering result with hierarchical descriptive labels attached to neurons and clusters. Experiments show that the proposed model (DoSO) yields promising results both in terms of extrinsic and SOM evaluation measures. 0 0
Exploiting Wikipedia Knowledge for Conceptual Hierarchical Clustering of Documents Comput. J. English 2012 0 0
Conceptual hierarchical clustering of documents using Wikipedia knowledge Lecture Notes in Electrical Engineering English 2010 In this paper, we propose a novel method for conceptual hierarchical clustering of documents using knowledge extracted from Wikipedia. A robust and compact document representation is built in real-time using the Wikipedia API. The clustering process is hierarchical and creates cluster labels which are descriptive and important for the examined corpus. Experiments show that the proposed technique greatly improves over the baseline approach. © 2011 Springer Science+Business Media B.V. 0 0
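
The 2013 and 2014 abstracts above describe building a vector for each entity from the words of its Wikipedia category names, so that two entities can be semantically close even when they share no category. The following is a minimal sketch of that idea only, not the authors' implementation. The function names, the example category names, the plural-stripping rule (a crude stand-in for real lemmatization), and the choice of cosine similarity as the closeness measure are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def category_vector(categories):
    """Build a bag-of-words vector from the words of category names."""
    tokens = []
    for name in categories:
        for word in re.findall(r"[a-z]+", name.lower()):
            # Assumption: strip a trailing plural "s" instead of true lemmatization.
            if len(word) > 3 and word.endswith("s"):
                word = word[:-1]
            tokens.append(word)
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical category lists for three candidate entities.
pianist = category_vector(["American jazz pianists", "Grammy Award winners"])
trumpeter = category_vector(["American jazz trumpeters"])
river = category_vector(["Rivers of France"])

# The first two entities share no category, yet share the words "american" and "jazz".
sim_related = cosine(pianist, trumpeter)
sim_unrelated = cosine(pianist, river)
```

In this sketch `sim_related` is positive while `sim_unrelated` is zero, which is the property the abstracts rely on when clustering candidate entities. A clustering step would then group entities by these pairwise similarities.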