Overview of the TREC 2009 entity track


Overview of the TREC 2009 entity track is a 2009 conference paper written in English by Balog K., De Vries A.P., Serdyukov P., Thomas P. and Westerveld T., published in the NIST Special Publication series.

Abstract

The first year of the entity track featured a related entity finding task. Given an input entity, the type of the target entity (person, organization, or product), and the relation, described in free text, systems had to return homepages of related entities and, optionally, the corresponding Wikipedia page and/or the name of the entity. Topic development proved difficult because, for many candidate topics, the "Category B" collection did not contain enough entity homepages. For the first year of the track, 20 topics were created and assessed. Assessment took place in two stages. First, the assessors judged the returned pages; here, the hard parts of relevance assessment are to (a) identify a correct answer and (b) distinguish a homepage from a non-homepage. Assessors were then shown a list of all pages they had judged "primary" and all names they had judged "correct". They could assign each to a pre-existing class or create a new class. Concerning submissions, a common take on the task was to first gather snippets for the input entity and then extract co-occurring entities from these snippets using a named entity tagger (off-the-shelf or custom-made). These approaches often employed language modeling techniques. Several submissions built heavily on Wikipedia, for example by exploiting outgoing links from the entity's Wikipedia page, using it to improve named entity recognition, or making use of Wikipedia categories for entity type detection.
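
The common pipeline described in the abstract (gather snippets for the input entity, extract co-occurring entities with a named entity tagger, and rank candidates of the requested type) might look roughly like the sketch below. This is only an illustrative sketch, not code from any submission: retrieve_snippets and tag_entities are hypothetical stand-ins for a snippet retrieval component and a named entity tagger, and the normalised co-occurrence count is a crude proxy for the language modeling techniques mentioned above.

```python
from collections import Counter

def find_related_entities(input_entity, target_type, relation,
                          retrieve_snippets, tag_entities):
    """Rank candidate entities of the requested type by how often they
    co-occur with the input entity in retrieved snippets."""
    # Gather snippets mentioning the input entity and the free-text relation.
    snippets = retrieve_snippets(f"{input_entity} {relation}")

    counts = Counter()
    for snippet in snippets:
        # Extract co-occurring entities with a named entity tagger
        # (hypothetical callable returning (name, type) pairs).
        for name, entity_type in tag_entities(snippet):
            if entity_type == target_type and name.lower() != input_entity.lower():
                counts[name] += 1

    # Turn raw co-occurrence counts into a simple normalised score,
    # a much simpler stand-in for language-model-based ranking.
    total = sum(counts.values()) or 1
    return sorted(((name, count / total) for name, count in counts.items()),
                  key=lambda item: item[1], reverse=True)
```

In an actual submission the ranked candidate names would still need to be mapped to homepages in the collection (and optionally to Wikipedia pages), since the track assessed returned pages rather than bare names.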
