Browse wiki

Jump to: navigation, search
Result diversity and entity ranking experiments: Anchors, links, text and Wikipedia
Abstract In this paper, we document our efforts in In this paper, we document our efforts in participating to the TREC 2009 Entity Ranking and Web Tracks. We had multiple aims: For the Web Track's Adhoc task we experiment with document text and anchor text representation, and the use of the link structure. For the Web Track's Diversity task we experiment with using a top down sliding window that, given the top ranked documents, chooses as the next ranked document the one that has the most unique terms or links. We test our sliding window method on a standard document text index and an index of propagated anchor texts. We also experiment with extreme query expansions by taking the top n results of the initial ranking as multi-faceted aspects of the topic to construct n relevance models to obtain n sets of results. A final diverse set of results is obtained by merging the n results lists. For the Entity Ranking Track, we also explore the effectiveness of the anchor text representation, look at the co-citation graph, and experiment with using Wikipedia as a pivot. Our main findings can be summarized as follows: Anchor text is very effective for diversity. It gives high early precision and the results cover more relevant sub-topics than the document text index. Our baseline runs have low diversity, which limits the possible impact of the sliding window approach. New link information seems more effective for diversifying text-based search results than the amount of unique terms added by a document. In the entity ranking task, anchor text finds few primary pages, but it does retrieve a large number of relevant pages. Using Wikipedia as a pivot results in large gains of P10 and NDCG when only primary pages are considered. Although the links between the Wikipedia entities and pages in the Clueweb collection are sparse, the precision of the existing links is very high.cision of the existing links is very high.
Abstractsub In this paper, we document our efforts in In this paper, we document our efforts in participating to the TREC 2009 Entity Ranking and Web Tracks. We had multiple aims: For the Web Track's Adhoc task we experiment with document text and anchor text representation, and the use of the link structure. For the Web Track's Diversity task we experiment with using a top down sliding window that, given the top ranked documents, chooses as the next ranked document the one that has the most unique terms or links. We test our sliding window method on a standard document text index and an index of propagated anchor texts. We also experiment with extreme query expansions by taking the top n results of the initial ranking as multi-faceted aspects of the topic to construct n relevance models to obtain n sets of results. A final diverse set of results is obtained by merging the n results lists. For the Entity Ranking Track, we also explore the effectiveness of the anchor text representation, look at the co-citation graph, and experiment with using Wikipedia as a pivot. Our main findings can be summarized as follows: Anchor text is very effective for diversity. It gives high early precision and the results cover more relevant sub-topics than the document text index. Our baseline runs have low diversity, which limits the possible impact of the sliding window approach. New link information seems more effective for diversifying text-based search results than the amount of unique terms added by a document. In the entity ranking task, anchor text finds few primary pages, but it does retrieve a large number of relevant pages. Using Wikipedia as a pivot results in large gains of P10 and NDCG when only primary pages are considered. Although the links between the Wikipedia entities and pages in the Clueweb collection are sparse, the precision of the existing links is very high.cision of the existing links is very high.
Bibtextype inproceedings  +
Has author Rianne Kaptein + , Marijn Koolen + , Jaap Kamps +
Has extra keyword Ad-hoc task + , Cocitation + , Entity ranking + , Link informations + , Link structure + , Query expansion + , Relevance models + , Search results + , Sliding Window + , Sliding window methods + , Standard documents + , Text indices + , Text representation + , Topdown + , Wikipedia + , Experiments + , Websites + , Information retrieval +
Issn 1048776X  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Published in NIST Special Publication +
Title Result diversity and entity ranking experiments: Anchors, links, text and Wikipedia +
Type conference paper  +
Year 2009 +
Creation dateThis property is a special property in this wiki. 8 November 2014 06:51:41  +
Categories Publications without keywords parameter  + , Publications without license parameter  + , Publications without DOI parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification dateThis property is a special property in this wiki. 8 November 2014 06:51:41  +
DateThis property is a special property in this wiki. 2009  +
hide properties that link here 
Result diversity and entity ranking experiments: Anchors, links, text and Wikipedia + Title
 

 

Enter the name of the page to start browsing from.