Wikipedia as an Ontology for Describing Documents
|Author(s)||Zareen Syed, Tim Finin, Anupam Joshi|
|Published in||Proceedings of the Second International Conference on Weblogs and Social Media, AAAI, March 31, 2008|
|Keyword(s)||ontology, wikipedia, information retrieval, text classification|
Wikipedia as an Ontology for Describing Documents is a 2008 conference paper by Zareen Syed, Tim Finin, and Anupam Joshi, published in the Proceedings of the Second International Conference on Weblogs and Social Media, AAAI, March 31, 2008.
Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used to model a person's current interests for improving search results, business intelligence or selecting appropriate advertisements. One approach is to associate a document with a set of topics selected from a fixed ontology or vocabulary of terms. We have investigated using Wikipedia's articles and associated pages as a topic ontology for this purpose. The benefits are that the ontology terms are developed through a social process, maintained and kept current by the Wikipedia community, represent a consensus view, and have meaning that can be understood simply by reading the associated Wikipedia page. We use Wikipedia articles and the category and article link graphs to predict concepts common to a set of documents. We describe several algorithms to aggregate and refine results, including the use of spreading activation to select the most appropriate terms. While the Wikipedia category graph can be used to predict generalized concepts, the article links graph helps by predicting more specific concepts and concepts not in the category hierarchy. Our experiments demonstrate the feasibility of extending the category system with new concepts identified as a union of pages from the page link graph.
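The abstract mentions spreading activation over the Wikipedia category and article link graphs but gives no algorithmic detail. The following is a minimal, generic sketch of that technique, not the authors' implementation. The toy graph, the node names, the `decay` factor, the number of `pulses`, and the equal split of activation among neighbors are all assumptions made for illustration.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, pulses=2):
    """Propagate activation from seed nodes through a directed graph.

    graph:  dict mapping a node to the list of nodes it links to
            (for example, an article to its categories).
    seeds:  dict mapping a node to its initial activation
            (for example, a document-to-article similarity score).
    decay:  fraction of a node's activation passed on per pulse (assumed).
    pulses: number of propagation rounds (assumed).
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(pulses):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue
            # Split the decayed activation equally among outgoing links.
            share = decay * value / len(neighbors)
            for neighbor in neighbors:
                next_frontier[neighbor] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


def top_concepts(activation, seeds, k=3):
    """Rank non-seed nodes by accumulated activation, highest first."""
    candidates = [(n, a) for n, a in activation.items() if n not in seeds]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy graph: two matched articles and three categories.
graph = {
    "article:A": ["cat:X", "cat:Y"],
    "article:B": ["cat:Y"],
    "cat:X": ["cat:Root"],
    "cat:Y": ["cat:Root"],
}
seeds = {"article:A": 0.8, "article:B": 0.6}

result = spread_activation(graph, seeds)
ranking = top_concepts(result, seeds)
print(ranking)
```

In this toy run, `cat:Y` ranks first because both seed articles link to it, which mirrors the idea of selecting concepts common to a set of documents. A real system would need to choose the decay, the number of pulses, and the edge weighting empirically.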