Mining Domain-Specific Thesauri from Wikipedia: A case study

Mining Domain-Specific Thesauri from Wikipedia: A case study is a 2006 conference paper by David Milne, Olena Medelyan, and Ian H. Witten, published in the ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings) (WI'06).

Abstract

Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia, a vast, open encyclopedia. In a comparison with a professional thesaurus for agriculture (Agrovoc) we find that Wikipedia contains a substantial proportion of its domain-specific concepts and semantic relations; furthermore it has impressive coverage of a collection of contemporary documents in the domain. Thesauri derived using these techniques are attractive because they capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts.

Cited by

This publication has 1 citation. Only publications available in WikiPapers are shown here: