Concordance-based entity-oriented search
Abstract We consider the problem of finding the relevant named entities in response to a search query over a given text corpus. Entity search can readily be used to augment conventional web search engines for a variety of applications. To assess the significance of entity search, we analyzed the AOL dataset of 36 million web search queries with respect to two different sets of entities: namely (a) 2.3 million distinct entities extracted from a news text corpus and (b) 2.9 million Wikipedia article titles. The results clearly indicate that search engines should be aware of entities, for under various criteria of matching between 18-39% of all web search queries can be recognized as specifically searching for entities, while 73-87% of all queries contain entities. Our entity search engine creates a concordance document for each entity, consisting of all the sentences in the corpus containing that entity. We then index and search these documents using open-source search software. This gives a ranked list of entities as the result of search. Visit http://www.textmap.com for a demonstration of our entity search engine over a large news corpus. We evaluate our system by comparing the results of each query to the list of entities that have highest statistical juxtaposition scores with the queried entity. Juxtaposition score is a measure of how strongly two entities are related in terms of a probabilistic upper bound. The results show excellent performance, particularly over well-characterized classes of entities such as people.
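The concordance approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that the concordance documents are indexed with open-source search software, so the plain term-count ranking below is a stand-in for a real search engine, and the function names (`split_sentences`, `build_concordances`, `search`), the naive sentence splitter, and the substring-based entity matching are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ! or ? plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_concordances(corpus, entities):
    """Map each entity to the list of corpus sentences that mention it.

    Entity mentions are detected by case-insensitive substring match,
    a simplification of real named-entity recognition.
    """
    concordances = defaultdict(list)
    for text in corpus:
        for sentence in split_sentences(text):
            lowered = sentence.lower()
            for entity in entities:
                if entity.lower() in lowered:
                    concordances[entity].append(sentence)
    return dict(concordances)


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(concordances, query):
    """Rank entities by how often the query terms occur in their concordance.

    Raw term counts stand in for the scoring of a full-text search engine.
    Ties are broken alphabetically so the output is deterministic.
    """
    query_terms = tokenize(query)
    scores = {}
    for entity, sentences in concordances.items():
        counts = Counter(tokenize(" ".join(sentences)))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[entity] = score
    return sorted(scores, key=lambda e: (-scores[e], e))


# Hypothetical two-document corpus and entity list.
corpus = [
    "Marie Curie studied radioactivity. Marie Curie won a prize in chemistry.",
    "Albert Einstein developed relativity. Albert Einstein won a prize in physics.",
]
entities = ["Marie Curie", "Albert Einstein"]

concordances = build_concordances(corpus, entities)
print(search(concordances, "relativity physics"))
print(search(concordances, "chemistry prize"))
```

The result of a query is a ranked list of entities rather than of source documents, because each indexed "document" is the concordance of one entity.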
Bibtextype inproceedings  +
Doi 10.1109/WI.2007.37  +
Has author Bautin M. + , Skiena S. +
Has extra keyword Applications. + , Dataset + , Excellent performance + , International conferences + , Named entities + , News corpus + , Open sources + , Search queries + , Search software + , Text corpora + , Upper bounds + , Web intelligence + , Web search queries + , Web searches + , Wikipedia + , Computer software + , Identification (control systems) + , Information retrieval + , Information services + , Internet + , Search engine + , Telecommunication networks + , World Wide Web +
Isbn 0769530265; 9780769530260  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 586–592  +
Published in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, WI 2007 +
Title Concordance-based entity-oriented search +
Type conference paper  +
Year 2007 +
Creation date 7 November 2014 08:14:11  +
Categories Publications without keywords parameter  + , Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 7 November 2014 08:14:11  +
Date 2007  +