From names to entities using thematic context distance



From names to entities using thematic context distance is a 2011 conference paper written in English by A. Pilz and G. Paass, published in the Proceedings of the International Conference on Information and Knowledge Management.

Abstract

Name ambiguity arises from the polysemy of names and causes uncertainty about the true identity of entities referenced in unstructured text. This is a major problem in areas like information retrieval or knowledge management, for example when searching for a specific entity or updating an existing knowledge base. We approach this problem of named entity disambiguation (NED) using thematic information derived from Latent Dirichlet Allocation (LDA) to compare the entity mention's context with candidate entities in Wikipedia, each represented by its article. We evaluate various distances over topic distributions in a supervised classification setting to find the best-suited candidate entity, which is either covered in Wikipedia or unknown. We compare our approach to a state-of-the-art method and show that it achieves significantly better predictive performance, both for entities covered in Wikipedia and for uncovered entities. We show that our approach is in general language independent, as we obtain equally good results for named entity disambiguation using the English, the German and the French Wikipedia.
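To illustrate the idea of comparing a mention's context with candidate articles by a distance over topic distributions, here is a minimal sketch in Python. It is not the authors' implementation: the abstract describes a supervised classifier over several distances, whereas this sketch simply picks the nearest candidate and uses a fixed cut-off to decide "unknown". The function names, the `unknown_threshold` parameter, the example entities and the toy three-topic distributions are all assumptions made for illustration; in practice the distributions would come from an LDA model.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q); eps guards against log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(0.5 * total)


def disambiguate(context_topics, candidates, distance=hellinger,
                 unknown_threshold=0.5):
    """Return (best candidate name, distance), or (None, distance) if the
    closest candidate is still farther away than unknown_threshold.

    Hypothetical simplification: a fixed threshold stands in for the
    supervised classifier described in the abstract.
    """
    best_name, best_dist = None, float("inf")
    for name, topics in candidates.items():
        d = distance(context_topics, topics)
        if d < best_dist:
            best_name, best_dist = name, d
    if best_name is None or best_dist > unknown_threshold:
        return None, best_dist
    return best_name, best_dist


# Toy topic distributions over three topics (made-up values).
candidates = {
    "Jaguar (animal)": [0.8, 0.1, 0.1],
    "Jaguar (car)": [0.1, 0.1, 0.8],
}
mention_context = [0.7, 0.2, 0.1]
entity, dist = disambiguate(mention_context, candidates)
print(entity, round(dist, 3))
```

With these toy values the mention context is closest to the first candidate. A context whose topic mixture is far from every candidate, such as `[0.05, 0.9, 0.05]`, exceeds the threshold and is returned as `None`, standing in for an entity not covered in Wikipedia.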


Cited by

This publication has been cited 4 times, but no articles for the citing works are available in WikiPapers.