Query by document
Abstract We are experiencing an unprecedented increase of content contributed by users in forums such as blogs, social networking sites and micro-blogging services. Such abundance of content complements content on web sites and traditional media forums such as newspapers, news and financial streams, and so on. Given such plethora of information there is a pressing need to cross reference information across textual services. For example, commonly we read a news item and we wonder if there are any blogs reporting related content or vice versa. In this paper, we present techniques to automate the process of cross referencing online information content. We introduce methodologies to extract phrases from a given "query document" to be used as queries to search interfaces with the goal to retrieve content related to the query document. In particular, we consider two techniques to extract and score key phrases. We also consider techniques to complement extracted phrases with information present in external sources such as Wikipedia and introduce an algorithm called RelevanceRank for this purpose. We discuss both these techniques in detail and provide an experimental study utilizing a large number of human judges from Amazon's Mechanical Turk service. Detailed experiments demonstrate the effectiveness and efficiency of the proposed techniques for the task of automating retrieval of documents related to a query document. Copyright 2009 ACM.
Bibtextype inproceedings  +
Doi 10.1145/1498759.1498806  +
Has author Yang Y. + , Bansal N. + , Dakka W. + , Ipeirotis P. + , Koudas N. + , Papadias D. +
Has extra keyword Blogs + , Blogging + , Experimental studies + , External sources + , If there are + , Key-phrase + , On-line information + , Query documents + , Related content + , Search interfaces + , Similarity matching + , Social networking sites + , Web 2.0 + , Wikipedia + , Information retrieval + , Websites + , Internet +
Has keyword Blogs + , Similarity matching + , Web 2.0 + , Wikipedia +
Isbn 9781605583907  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 34–43  +
Published in Proceedings of the 2nd ACM International Conference on Web Search and Data Mining, WSDM'09 +
Title Query by document +
Type conference paper  +
Year 2009 +
Creation date 8 November 2014 05:45:00  +
Categories Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 8 November 2014 05:45:00  +
Date 2009  +