From WikiPapers

wiktionary is included as a keyword or extra keyword in 0 datasets, 1 tool and 7 publications.


There are no datasets for this keyword.


Tool: Wikokit
Operating System(s): Cross-platform
Language(s): Multilingual
Programming language(s): Java
License: EPLv1.0, New BSD License
Description: wikokit (wiki tool kit) - several projects related to wiki.

wiwordik - machine-readable Wiktionary. A visual interface to the parsed English Wiktionary and Russian Wiktionary databases.
Java WebStart application + JavaFX, English interface.
742 languages extracted from the English Wiktionary.
423 languages extracted from the Russian Wiktionary.
Image: Wiwordik-en.0.09.1094 scrollbox.jpg


Title Author(s) Published in Language Date Abstract R C
Avoimen suomenkielisen morfologian liittäminen Wikimedian hakujärjestelmään Niklas Laxström University of Helsinki Finnish 1 January 2012 In my thesis I investigated the feasibility of using a Finnish morphology implementation with the Lucene search system. With the same Lucene-search package that is used by the Wikimedia Foundation I built two search indexes: one with the existing Porter stemming algorithm and the other one with morphological analysis. The corpus I used was the current text dump of Finnish Wikipedia. [...] See http://laxstrom.name/blag/2012/02/13/exploring-the-states-of-open-source-search-stack-supporting-finnish/ 9 0
Multilingual Ontology Matching based on Wiktionary Data Accessible via SPARQL Endpoint Feiyu Lin
Andrew Krizhanovsky
Proceedings of the 13th Russian Conference on Digital Libraries RCDL’2011 English 2011 Interoperability is a feature required by the Semantic Web. It is provided by the ontology matching methods and algorithms. But now ontologies are presented not only in English, but in other languages as well. It is important to use an automatic translation for obtaining correct matching pairs in multilingual ontology matching. The translation into many languages could be based on the Google Translate API, the Wiktionary database, etc. From the point of view of the balance of presence of many languages, of manually crafted translations, of a huge size of a dictionary, the most promising resource is the Wiktionary. It is a collaborative project working on the same principles as the Wikipedia. The parser of the Wiktionary was developed and the machine-readable dictionary was designed. The data of the machine-readable Wiktionary are stored in a relational database, but with the help of D2R server the database is presented as an RDF store. Thus, it is possible to get lexicographic information (definitions, translations, synonyms) from web service using SPARQL requests. In the case study, the problem entity is a task of multilingual ontology matching based on Wiktionary data accessible via SPARQL endpoint. Ontology matching results obtained using Wiktionary were compared with results based on Google Translate API. 5 0
Semi-automatic enrichment of crowdsourced synonymy networks: the WISIGOTH system applied to Wiktionary Franck Sajous
Emmanuel Navarro
Bruno Gaume
Laurent Prévot
Yannick Chudy
Language Resources and Evaluation English 2011 Semantic lexical resources are a mainstay of various Natural Language Processing applications. However, comprehensive and reliable resources are rare and not often freely available. Handcrafted resources are too costly for being a general solution while automatically-built resources need to be validated by experts or at least thoroughly evaluated. We propose in this paper a picture of the current situation with regard to lexical resources, their building and their evaluation. We give an in-depth description of Wiktionary, a freely available and collaboratively built multilingual dictionary. Wiktionary is presented here as a promising raw resource for NLP. We propose a semi-automatic approach based on random walks for enriching Wiktionary synonymy network that uses both endogenous and exogenous data. We take advantage of the wiki infrastructure to propose a validation “by crowds”. Finally, we present an implementation called WISIGOTH, which supports our approach. 7 0
Encouraging language students to contribute inflection data to Wiktionary Zachary Kurmas WikiSym English 2010 We propose building a computer program to simplify access to the inflection (i.e., “word ending”) data in Wiktionary. This program will make it easier to both (1) look up a word’s inflections and, more importantly, (2) edit incorrect inflections. We expect that such a program will encourage foreign language students to both use Wiktionary as a resource and contribute inflection and other grammar data to Wiktionary. We believe that the resulting additional activity will make Wiktionary a better resource for students — especially students of those languages for which there are no cheap, comprehensive inflection resources — and provide data that will be beneficial to the wiki research community. 1 0
Zawilinski: a library for studying grammar in Wiktionary Zachary Kurmas WikiSym English 2010 We present Zawilinski, a Java library that supports the extraction and analysis of grammatical data in Wiktionary. Zawilinski can efficiently (1) filter Wiktionary for content pertaining to a specified language, and (2) extract a word’s inflections from its Wiktionary entry. We have thus far used Zawilinski to (1) measure the correctness of the inflections for a subset of the Polish words in the English Wiktionary and to (2) show that this grammatical data is very stable. (Only 131 out of 4748 Polish words have had their inflection data corrected.) We also explain Zawilinski’s key features and discuss how it can be used to simplify the development of additional grammar-based analyses. 3 2
Related terms search based on WordNet / Wiktionary and its application in Ontology Matching Feiyu Lin Andrew Krizhanovsky RCDL 2009 A set of ontology matching algorithms (for finding correspondences between concepts) is based on a thesaurus that provides the source data for the semantic distance calculations. In this wiki era, new resources may spring up and improve this kind of semantic search. In the paper, a solution to this task based on the Russian Wiktionary is compared to WordNet-based algorithms. Metrics are estimated using a test collection containing 353 English word pairs with a relatedness score assigned by human evaluators. The experiment shows that the proposed method is capable in principle of calculating a semantic distance between a pair of words in any language present in the Russian Wiktionary. The calculation of the Wiktionary-based metric required the development of the open-source Wiktionary parser software. 0 0
Using the Wiktionary graph structure for synonym detection Timothy Weale
Chris Brew
Eric Fosler-Lussier
People’s Web Meets English 2009 This paper presents our work on using the graph structure of Wiktionary for synonym detection. We implement semantic relatedness metrics using both a direct measure of information flow on the graph and a comparison of the list of vertices found to be “close” to a given vertex. Our algorithms, evaluated on ESL 50, TOEFL 80 and RDWP 300 data sets, perform better than or comparable to existing semantic relatedness measures. 5 1