Information arbitrage across multi-lingual Wikipedia
| Information arbitrage across multi-lingual Wikipedia | |
| Author(s) | Eytan Adar, Michael Skinner, Daniel S. Weld |
| Published in | Unknown |
| Date | 2009 |
| Volume | Unknown |
| Issue | Unknown |
| Page(s) | 94-103 |
| Keyword(s) | Unknown |
| Peer-reviewed? | Unknown |
| Language(s) | English |
| License(s) | Unknown |
| Identifiers | |
| ISBN | Unknown |
| DOI | 10.1145/1498759.1498813 |
| OCLC Number | Unknown |
| CiteULike | 5453884 |
| arXiv | Unknown |
| PubMed | Unknown |
| Related material | |
| Concept(s) | Unknown |
| Tool(s) | Unknown |
| Dataset(s) | Unknown |
| Slides | Not available |
| Presentation | Not available |
| Download and mirrors | |
| Local copy | Not available |
| Remote mirror(s) | www.cond.org |
| Archive(s) | Not available |
| Paywall(s) | Not available |
Information arbitrage across multi-lingual Wikipedia is a 2009 publication written in English by Eytan Adar, Michael Skinner, and Daniel S. Weld.
Abstract
The rapid globalization of Wikipedia is generating a parallel, multi-lingual corpus of unprecedented scale. Pages for the same topic in many different languages emerge both as a result of manual translation and independent development. Unfortunately, these pages may appear at different times and vary in size, scope, and quality. Furthermore, differential growth rates cause the conceptual mapping between articles in different languages to be both complex and dynamic. These disparities provide the opportunity for a powerful form of information arbitrage: leveraging articles in one or more languages to improve the content in another. Analyzing four large language domains (English, Spanish, French, and German), we present Ziggurat, an automated system for aligning Wikipedia infoboxes, creating new infoboxes as necessary, filling in missing information, and detecting discrepancies between parallel pages. Our method uses self-supervised learning, and our experiments demonstrate the method's feasibility, even in the absence of dictionaries.
Cited by
This publication has 2 citations. Only those publications available in WikiPapers are shown here:
- Cultural bias in Wikipedia content on famous persons
- The people's encyclopedia under the gaze of the sages: a systematic review of scholarly research on Wikipedia
