Pavel Velikhov

From WikiPapers
Jump to: navigation, search

Pavel Velikhov is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language DateThis property is a special property in this wiki. Abstract R C
Accuracy estimate and optimization techniques for SimRank computation VLDB Journal 2010 The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. {SimRank} is a simple and intuitive measure of this kind, based on a graph-theoretic model. {SimRank} is typically computed iteratively, in the spirit of {PageRank.} However, existing work on {SimRank} lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper, we present a technique to estimate the accuracy of computing {SimRank} iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing {SimRank.} We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n4) in the worst case to {min(O(nl),} O(n3/ log2n)), with n denoting the number of objects, and l denoting the number object-to-object relationships. We also introduce a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques, we computed {SimRank} scores on a subset of English Wikipedia corpus, consisting of the complete set of articles and category links. {Springer-Verlag} 2009. 0 0
Accuracy estimate and optimization techniques for simrank computation Proceedings of the VLDB Endowment English 2008 The measure of similarity between objects is a very useful tool in many areas of computer science, including informa-tion retrieval. SimRank is a simple and intuitive measure of this kind, based on graph-theoretic model. SimRank is typ-ically computed iteratively, in the spirit of PageRank. How-ever, existing work on SimRank lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper we present a technique to estimate the ac-curacy of computing SimRank iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank. We also present optimization techniques that improve the com-putational complexity of the iterative algorithm from O(n 4) to O(n 3) in the worst case. We also introduce a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques we computed SimRank scores on a subset of English Wikipedia corpus, consisting of the complete set of articles and category links. 0 0
Efficient Ranking and Computation of Semantic Relatedness and its Application to Word Sense Disambiguation English 2008 0 0
Semantic relatedness metric for Wikipedia concepts based on link analysis and its application to word sense disambiguation CEUR Workshop Proceedings English 2008 Wikipedia has grown into a high quality up-todate knowledge base and can enable many knowledge-based applications, which rely on semantic information. One of the most general and quite powerful semantic tools is a measure of semantic relatedness between concepts. Moreover, the ability to efficiently produce a list of ranked similar concepts for a given concept is very important for a wide range of applications. We propose to use a simple measure of similarity between Wikipedia concepts, based on Dice's measure, and provide very efficient heuristic methods to compute top k ranking results. Furthermore, since our heuristics are based on statistical properties of scale-free networks, we show that these heuristics are applicable to other complex ontologies. Finally, in order to evaluate the measure, we have used it to solve the problem of word-sense disambiguation. Our approach to word sense disambiguation is based solely on the similarity measure and produces results with high accuracy. 0 1