Exploiting a multilingual web-based encyclopedia for bilingual terminology extraction
|Published in||PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation|
|Keyword(s)||Bilingual terminology, Comparable corpora, Multilingual linguistic tool, Wikipedia (Extra: Comparable corpora, Cross language information retrieval, Language pairs, Linguistic resources, Link informations, Parallel corpora, Target language, Terminology extraction, Wikipedia, Computational linguistics, Information retrieval systems, Quality control, Terminology, Websites)|
Exploiting a multilingual web-based encyclopedia for bilingual terminology extraction is a 2010 conference paper written in English by Sadat F. and published in PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation.
Multilingual linguistic resources are usually constructed from parallel corpora, but because such corpora are available only for selected text domains and language pairs, the potential of other resources is being explored as well. This article explores the idea of using multilingual web-based encyclopedias such as Wikipedia as comparable corpora for bilingual terminology extraction. We propose an approach to extract terms and their translations from different types of Wikipedia link information and data. The next step will be to use linguistic information to re-rank and filter the extracted term candidates in the target language. Preliminary evaluations of the combined statistics-based and linguistic-based approaches were carried out on different language pairs involving Japanese, French and English. These evaluations showed an improvement in the extracted term candidates, and good quality, for building or enriching multilingual ontologies and dictionaries, or for feeding a cross-language information retrieval system with expansion terms related to the source query.
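To make the idea of extracting term pairs from Wikipedia link information more concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not specify which link types are used or how the linguistic filter works. The sketch assumes interlanguage links as the source of translations and redirect titles as term variants, and it substitutes a crude token-based check for the linguistic filtering step. All names (`LANGLINKS_EN_FR`, `REDIRECTS_EN`, `extract_candidates`, `keep_candidate`, `bilingual_pairs`) and the toy data are hypothetical.

```python
from collections import defaultdict

# Toy stand-in for Wikipedia link data (hypothetical, not taken from the paper).
# Each source-language title maps to the title reached via its interlanguage link.
LANGLINKS_EN_FR = {
    "Information retrieval": "Recherche d'information",
    "Machine translation": "Traduction automatique",
    "Parallel text": "Corpus parallèle",
}

# Redirect titles pointing at a source page; assumed here to act as term variants.
REDIRECTS_EN = {
    "Information retrieval": ["IR system", "Document retrieval"],
    "Machine translation": ["Automatic translation"],
}


def extract_candidates(langlinks, redirects):
    """Pair each source title and its redirect variants with the linked target title."""
    candidates = defaultdict(set)
    for source_title, target_title in langlinks.items():
        candidates[source_title].add(target_title)
        for variant in redirects.get(source_title, []):
            candidates[variant].add(target_title)
    return candidates


def keep_candidate(term, max_tokens=3):
    """Crude placeholder for a linguistic filter: keep short, alphabetic terms only."""
    tokens = term.split()
    return 0 < len(tokens) <= max_tokens and all(
        tok.replace("'", "").replace("-", "").isalpha() for tok in tokens
    )


def bilingual_pairs(langlinks, redirects):
    """Return sorted (source term, target term) pairs that pass the filter."""
    pairs = []
    for source, targets in extract_candidates(langlinks, redirects).items():
        for target in sorted(targets):
            if keep_candidate(target):
                pairs.append((source, target))
    return sorted(pairs)


if __name__ == "__main__":
    for source, target in bilingual_pairs(LANGLINKS_EN_FR, REDIRECTS_EN):
        print(f"{source}\t{target}")
```

In a real setting the two dictionaries would be populated from a Wikipedia dump or API rather than hard-coded, and `keep_candidate` would be replaced by part-of-speech patterns or similar linguistic evidence for the target language.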