| Eibe Frank|
|Co-authors||Anna Huang, David N. Milne, Ian H. Witten, Olena Medelyan, Witten I.H.|
|Authorship||Publications (3), datasets (0), tools (0)|
|Citations||Total (0), average (0), median (0), max (0), min (0)|
Eibe Frank is an author.
Publications
Only those publications related to wikis are shown here.
|Title||Keyword(s)||Published in||Language||Date||Abstract||R||C|
|Clustering Documents Using a Wikipedia-Based Concept Representation|| || ||English||2009||This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation by mapping the terms and phrases within documents to their corresponding articles (or concepts) in Wikipedia. We also developed a similarity measure that evaluates the semantic relatedness between concept sets for two documents. We test the concept-based representation and the similarity measure on two standard text document datasets. Empirical results show that although further optimizations could be performed, our approach already improves upon related techniques.||0||0|
|Human-competitive tagging using automatic keyphrase extraction|| ||EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009||English||2009||This paper connects two research areas: automatic tagging on the web and statistical keyphrase extraction. First, we analyze the quality of tags in a collaboratively created folksonomy using traditional evaluation techniques. Next, we demonstrate how documents can be tagged automatically with a state-of-the-art keyphrase extraction algorithm, and further improve performance in this new domain using a new algorithm, "Maui", that utilizes semantic information extracted from Wikipedia. Maui outperforms existing approaches and extracts tags that are competitive with those assigned by the best performing human taggers.||0||0|
|Clustering Documents with Active Learning Using Wikipedia|| ||Data Mining, IEEE International Conference on||English||2008||Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this paper we propose to exploit the semantic knowledge in Wikipedia for clustering, enabling the automatic grouping of documents with similar themes. Although clustering is intrinsically unsupervised, recent research has shown that incorporating supervision improves clustering performance, even when limited supervision is provided. The approach presented in this paper applies supervision using active learning. We first utilize Wikipedia to create a concept-based representation of a text document, with each concept associated to a Wikipedia article. We then exploit the semantic relatedness between Wikipedia concepts to find pair-wise instance-level constraints for supervised clustering, guiding clustering towards the direction indicated by the constraints. We test our approach on three standard text document datasets. Empirical results show that our basic document representation strategy yields comparable performance to previous attempts; and adding constraints improves clustering performance further by up to 20%.||0||0|