BinRank: Scaling dynamic authority-based search using materialized subgraphs

From WikiPapers
Jump to: navigation, search

BinRank: Scaling dynamic authority-based search using materialized subgraphs is a 2010 journal article by Heasoo Hwang, Andrey Balmin, Berthold Reinwald, Erik Nijkamp and published in IEEE Transactions on Knowledge and Data Engineering.

[edit] Abstract

Dynamic authority-based keyword search algorithms, such as {ObjectRank} and personalized {PageRank,} leverage semantic link information to provide high quality, high recall search in databases, and the Web. Conceptually, these algorithms require a query-time {PageRank-style} iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce {BinRank,} a system that approximates {ObjectRank} results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running {ObjectRank} on only one of the subgraphs. {BinRank} generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing {ObjectRank} for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that {BinRank} can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of {ObjectRank} on the original graph. The Wikipedia link graph contains about 108 edges, which is at least two orders of magnitude larger than what prior state of the art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of {BinRank.

[edit] References

This section requires expansion. Please, help!

Cited by

Probably, this publication is cited by others, but there are no articles available for them in WikiPapers.

Discussion

No comments yet. Be first!