Browse wiki

Jump to: navigation, search
Binrank: Scaling dynamic authority-based search using materialized subgraphs
Abstract Dynamic authority-based keyword search algDynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high quality, high recall search in databases and on the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of pre-computed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the sub-graphs. BinRank generates the sub-graphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive nonnegligible scores. The intuition is that a sub-graph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve sub-second query execution time on the English Wikipedia dataset, while producing high quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 108 edges, which is at least two orders of magnitude larger than what prior state of the art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.ults, and storage requirements of BinRank.
Abstractsub Dynamic authority-based keyword search algDynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high quality, high recall search in databases and on the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of pre-computed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the sub-graphs. BinRank generates the sub-graphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive nonnegligible scores. The intuition is that a sub-graph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve sub-second query execution time on the English Wikipedia dataset, while producing high quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 108 edges, which is at least two orders of magnitude larger than what prior state of the art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.ults, and storage requirements of BinRank.
Bibtextype inproceedings  +
Doi 10.1109/ICDE.2009.94  +
Has author Heasoo Hwang + , Andrey Balmin + , Berthold Reinwald + , Erik Nijkamp +
Has extra keyword Co-occurrence + , Data graph + , Dataset + , Experimental evaluation + , High quality + , Hybrid approach + , Iterative computation + , Keyword queries + , Keyword search + , Large graphs + , Materialized view + , Orders of magnitude + , PageRank + , Query execution time + , Query time + , Random Walk + , Search results + , Search system + , Semantic link + , State of the art + , Storage requirements + , Subgraphs + , Wikipedia + , Learning algorithms + , Graph theory +
Isbn 9780769535456  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 66–77  +
Published in Proceedings - International Conference on Data Engineering +
Title Binrank: Scaling dynamic authority-based search using materialized subgraphs +
Type conference paper  +
Year 2009 +
Creation dateThis property is a special property in this wiki. 6 November 2014 20:59:07  +
Categories Publications without keywords parameter  + , Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification dateThis property is a special property in this wiki. 6 November 2014 20:59:07  +
DateThis property is a special property in this wiki. 2009  +
hide properties that link here 
Binrank: Scaling dynamic authority-based search using materialized subgraphs + Title
 

 

Enter the name of the page to start browsing from.