Efficient Wikipedia-based semantic interpreter by exploiting top-k processing
|Efficient Wikipedia-based semantic interpreter by exploiting top-k processing|
|Author(s)||Kim J.W., Kashyap A., Li D., Bhamidipati S.|
|Published in||International Conference on Information and Knowledge Management, Proceedings|
|Keyword(s)||Concept, Semantic interpretation, Wikipedia (Extra: Bag of words, Best match, Concept, Concept-based, Efficient algorithm, Execution time, Over current, Semantic interpretation, Semantic relatedness, Sheer size, Wikipedia, Algorithms, Data mining, Image matching, Knowledge management, Natural language processing systems, Semantics, Information retrieval)|
Efficient Wikipedia-based semantic interpreter by exploiting top-k processing is a 2010 conference paper written in English by Kim J.W., Kashyap A., Li D., and Bhamidipati S., published in the International Conference on Information and Knowledge Management, Proceedings.
Proper representation of the meaning of texts is crucial to enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept-space derived from Wikipedia has received growing attention recently, due to Wikipedia's comprehensiveness and expertise. This concept-based representation is capable of extracting semantic relatedness between texts that cannot be deduced with the bag of words model. A key obstacle to using Wikipedia as a semantic interpreter, however, is that the sheer size of the concepts derived from Wikipedia makes it hard to efficiently map texts into concept-space. In this paper, we develop an efficient algorithm which is able to represent the meaning of a text by using the concepts that best match it. In particular, our approach first computes the approximate top-k concepts that are most relevant to the given text. We then leverage these concepts for representing the meaning of the given text. The experimental results show that the proposed technique provides significant gains in execution time over current solutions to the problem.
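The general idea of representing a text by its best-matching concepts can be illustrated with a minimal sketch. This is not the paper's algorithm: it computes an exact top-k by scoring every matching concept, whereas the paper describes an approximate top-k computation designed to avoid that cost. The toy inverted index, its weights, and the names `TOY_INDEX` and `interpret` are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict

# Hypothetical toy inverted index: term -> {concept: weight}.
# In a real Wikipedia-derived concept space this would hold
# millions of concepts; the values here are made up.
TOY_INDEX = {
    "jaguar": {"Jaguar (animal)": 0.9, "Jaguar Cars": 0.8},
    "engine": {"Jaguar Cars": 0.7, "Internal combustion engine": 0.9},
    "speed": {
        "Jaguar Cars": 0.3,
        "Internal combustion engine": 0.2,
        "Jaguar (animal)": 0.4,
    },
}


def interpret(text, index, k):
    """Return the k highest-scoring (concept, score) pairs for a text.

    Scores are accumulated term by term over the inverted index,
    then the k largest are selected with a heap. This is an exact,
    brute-force top-k, used here only to show the representation.
    """
    scores = defaultdict(float)
    for term in text.lower().split():
        for concept, weight in index.get(term, {}).items():
            scores[concept] += weight
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for concept, score in interpret("jaguar engine speed", TOY_INDEX, k=2):
        print(f"{concept}: {score:.2f}")
```

The resulting short list of weighted concepts serves as the text's representation. Keeping only k concepts instead of the full concept vector is what makes the representation compact, and an approximate method would aim to find roughly the same list without scoring every candidate.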
This publication is probably cited by others, but no articles for the citing works are available in WikiPapers. Cited 1 time.