Extracción de Corpus Paralelos de la Wikipedia basada en la Obtención de Alineamientos Bilingües a Nivel de Frase

Extracción de corpus paralelos de la Wikipedia basada en la obtención de alineamientos bilingües a nivel de frase is a 2011 conference paper written in Spanish by Joan Albert Silvestre-Cerdà, Mercedes García-Martínez, Alberto Barrón-Cedeño, Jorge Civera and Paolo Rosso, and published in the Proceedings of the Workshop on Iberian Cross-Language Natural Language Processing Tasks (ICL 2011).

Abstract

This paper presents a proposal for extracting parallel corpora from Wikipedia using statistical machine translation techniques. We use IBM word-level alignment models to obtain phrase-level bilingual alignments between document pairs. To evaluate the model, we manually annotated a test set of comparable English-Spanish documents. The results obtained are encouraging.

References

This publication has 4 references. Only those references related to wikis are included here:

Cited by

This publication is probably cited by others, but WikiPapers has no articles for them.
