Measuring comparability of multilingual corpora extracted from Wikipedia

Measuring comparability of multilingual corpora extracted from Wikipedia is a 2011 conference paper written in English by Otero P.G. and Lopez I.G., published in CEUR Workshop Proceedings.

Abstract

Comparable corpora can be used for many linguistic tasks such as bilingual lexicon extraction. By improving the quality of comparable corpora, we improve the quality of the extraction. This article describes some strategies to build comparable corpora from Wikipedia and proposes a measure of comparability. Experiments were performed on Portuguese, Spanish, and English Wikipedia.
