A Cross-Lingual Dictionary for English Wikipedia Concepts
|Author(s)||Valentin I. Spitkovsky, Angel X. Chang|
|Published in||Proceedings of the Eighth International Conference on Language Resources and Evaluation|
|Keyword(s)||cross-language information retrieval, entity linking, Wikipedia|
|Dataset(s)||Google dataset linking strings and concepts|
A Cross-Lingual Dictionary for English Wikipedia Concepts is a 2012 conference paper written in English by Valentin I. Spitkovsky and Angel X. Chang, published in the Proceedings of the Eighth International Conference on Language Resources and Evaluation.
We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal interoperability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information.
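To make the bi-directional idea concrete, here is a minimal Python sketch of how one set of joint string/concept counts could yield empirical distributions in both directions. The tab-separated line format, the sample counts, and the names `load_counts` and `conditional` are all assumptions made for illustration; they are not the layout of the released files or the authors' code.

```python
from collections import defaultdict

# Hypothetical line format (an assumption, not the released layout):
#   <string> TAB <concept> TAB <count>
# Lines are kept in lexicographic order, as the abstract describes.
SAMPLE_LINES = [
    "Big Apple\tNew_York_City\t9",
    "NYC\tNew_York_City\t30",
    "New York\tNew_York_(state)\t20",
    "New York\tNew_York_City\t60",
]


def load_counts(lines):
    """Parse lines into a dict mapping (string, concept) to a joint count."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        string, concept, count = line.split("\t")
        key = (string, concept)
        counts[key] = counts.get(key, 0) + int(count)
    return counts


def conditional(counts, given_string=True):
    """Normalize joint counts into conditional distributions.

    given_string=True  gives P(concept | string)
    given_string=False gives P(string | concept)
    """
    totals = defaultdict(int)
    for (string, concept), n in counts.items():
        totals[string if given_string else concept] += n

    dist = defaultdict(dict)
    for (string, concept), n in counts.items():
        key, value = (string, concept) if given_string else (concept, string)
        dist[key][value] = n / totals[key]
    return dist


counts = load_counts(SAMPLE_LINES)
p_concept_given_string = conditional(counts, given_string=True)
p_string_given_concept = conditional(counts, given_string=False)
```

Both directions use the same normalization over the same counts; only the conditioning variable changes. In this toy data, `p_concept_given_string["New York"]` spreads its mass over two concepts, while `p_string_given_concept["New_York_City"]` spreads its mass over three strings.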
This publication has 5 references. Only those references related to wikis are included here:
- "Epistemology and the Wikipedia"
- "A large ontology from Wikipedia and WordNet"
This publication is probably cited by others, but WikiPapers has no articles for the citing works.