Eneko Agirre

From WikiPapers
Jump to: navigation, search

Eneko Agirre is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language DateThis property is a special property in this wiki. Abstract R C
PATHSenrich: A web service prototype for automatic cultural heritage item enrichment Lecture Notes in Computer Science English 2013 Large amounts of cultural heritage material are nowadays available through online digital library portals. Most of these cultural items have short descriptions and lack rich contextual information. The PATHS project has developed experimental enrichment services. As a proof of concept, this paper presents a web service prototype which allows independent content providers to enrich cultural heritage items with a subset of the full functionality: links to related items in the collection and links to related Wikipedia articles. In the future we plan to provide more advanced functionality, as available offline for PATHS. 0 0
Comparing taxonomies for organising collections of documents Hierarchy
Ontology
Semantic network
Taxonomy
Wikipedia
Wordnet
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English 2012 There is a demand for taxonomies to organise large collections of documents into categories for browsing and exploration. This paper examines four existing taxonomies that have been manually created, along with two methods for deriving taxonomies automatically from data items. We use these taxonomies to organise items from a large online cultural heritage collection. We then present two human evaluations of the taxonomies. The first measures the cohesion of the taxonomies to determine how well they group together similar items under the same concept node. The second analyses the concept relations in the taxonomies. The results show that the manual taxonomies have high quality well defined relations. However the novel automatic method is found to generate very high cohesion. 0 0
Two Birds with One Stone: Learning Semantic Models for Text Categorization and Word Sense Disambiguation Word Sense Disambiguation
Text Classification
Text Categorization
Wikipedia
Proceedings of the 20th ACM International Conference on Information and Knowledge Management 2011 In this paper we present a novel approach to learning semantic models for multiple domains, which we use to categorize Wikipedia pages and to perform domain Word Sense Disambiguation (WSD). In order to learn a semantic model for each domain we first extract relevant terms from the texts in the domain and then use these terms to initialize a random walk over the WordNet graph. Given an input text, we check the semantic models, choose the appropriate domain for that text and use the best-matching model to perform WSD. Our results show considerable improvements on text categorization and domain WSD tasks. 0 0
Two birds with one stone: Learning semantic models for text categorization and word sense disambiguation Text classification
Word sense disambiguation
International Conference on Information and Knowledge Management, Proceedings English 2011 In this paper we present a novel approach to learning semantic models for multiple domains, which we use to categorize Wikipedia pages and to perform domain Word Sense Disambiguation (WSD). In order to learn a semantic model for each domain we first extract relevant terms from the texts in the domain and then use these terms to initialize a random walk over the WordNet graph. Given an input text, we check the semantic models, choose the appropriate domain for that text and use the best-matching model to perform WSD. Our results show considerable improvements on text categorization and domain WSD tasks. 0 0
A study on Linking Wikipedia categories to WordNet synsets using text similarity International Conference Recent Advances in Natural Language Processing, RANLP English 2009 This paper studies the application of text similarity methods to disambiguate ambiguous links between WordNet nouns and Wikipedia categories. The methods range from word overlap between glosses, random projections, WordNetbased similarity, and a full-fledged textual entailment system. Both unsupervised and supervised combinations have been tried. The goldstandard with disambiguated links is publicly available. The results range from 64.7% for the first sense heuristic, 68% for an unsupervised combination, and up to 77.74% for a supervised combination. 0 0
Overview of the clef 2008 multilingual question answering track Lecture Notes in Computer Science English 2009 The QA campaign at CLEF 2008 [1], was mainly the same as that proposed last year. The results and the analyses reported by last year's participants suggested that the changes introduced in the previous campaign had led to a drop in systems' performance. So for this year's competition it has been decided to practically replicate last year's exercise. Following last year's experience some QA pairs were grouped in clusters. Every cluster was characterized by a topic (not given to participants). The questions from a cluster contained co-references between one of them and the others. Moreover, as last year, the systems were given the possibility to search for answers in Wikipedia as document corpus beside the usual newswire collection. In addition to the main task, three additional exercises were offered, namely the Answer Validation Exercise (AVE), the Question Answering on Speech Transcriptions (QAST), which continued last year's successful pilots, together with the new Word Sense Disambiguation for Question Answering (QA-WSD). As general remark, it must be said that the main task still proved to be very challenging for participating systems. As a kind of shallow comparison with last year's results the best overall accuracy dropped significantly from 42% to 19% in the multi-lingual subtasks, but increased a little in the monolingual sub-tasks, going from 54% to 63%. 0 0
WikiWalk: random walks on Wikipedia for semantic relatedness English 2009 0 0