Extracting structured information from Wikipedia articles to populate infoboxes

Extracting structured information from Wikipedia articles to populate infoboxes is a 2010 conference paper written in English by Lange D., Bohm C., and Naumann F. and published in International Conference on Information and Knowledge Management, Proceedings.

Abstract

Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values to independently extract value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
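
The idea of extracting the parts of a structured attribute value independently, and of scoring extractions by precision, can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expression, the function names extract_value_parts and precision, and the "number (year)" value structure are all assumptions made for illustration only.

```python
import re

# Hypothetical value structure for an attribute such as "population":
# a number, optionally followed by a year in parentheses, e.g. "3,500,000 (2009)".
# The pattern is an illustrative assumption, not taken from the paper.
VALUE_PATTERN = re.compile(
    r"(?P<number>\d{1,3}(?:,\d{3})*)(?:\s*\((?P<year>\d{4})\))?"
)


def extract_value_parts(text):
    """Return the value parts of the first match in text, or None if nothing matches."""
    match = VALUE_PATTERN.search(text)
    if match is None:
        return None
    # Keep only the parts that were actually present in the text.
    return {name: part for name, part in match.groupdict().items() if part is not None}


def precision(extracted, expected):
    """Fraction of extracted attribute values that equal the expected value.

    Both arguments are dicts keyed by attribute name.
    """
    if not extracted:
        return 0.0
    correct = sum(1 for key, value in extracted.items() if expected.get(key) == value)
    return correct / len(extracted)
```

Under these assumptions, each value part (here the number and the year) is captured by its own named group, so a missing part does not prevent the others from being extracted.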

Cited by

This publication has been cited 4 times, but none of the citing works has an article in WikiPapers.