| Christoph Ringlstetter|
|Co-authors||Annette Gotscharek, Maja Žorga, Tomaž Erjavec|
|Authorship||Publications (1), datasets (0), tools (0)|
|Citations||Total (0), average (0), median (0), max (0), min (0)|
Christoph Ringlstetter is an author.
Publications
Only those publications related to wikis are shown here.
|Title||Keyword(s)||Published in||Language||Date||Abstract||R||C|
|A lexicon for processing archaic language: the case of XIXth century Slovene|| ||WoLeR 2011: International Workshop on Lexical Resources||English||2011||The paper presents a lexicon to support computational processing of historical Slovene texts. Historical Slovene texts are being increasingly digitised and made available on the internet but are still underutilised as no language technology support is offered for their processing. Appropriate tools and resources would enable full-text searching with modern-day lemmas, modernisation of archaic language to make it more accessible to today's readers, and automatic OCR correction. We discuss the lexicon needed to support tokenisation, modernisation, lemmatisation and part-of-speech tagging of historical texts. The process of lexicon acquisition relies on a proof-read corpus, a large lexicon of contemporary Slovene, and tools to map historical forms to their contemporary equivalents via a set of rewrite rules, and to provide an editing environment for lexicon construction. The lexicon, currently work in progress, will be made publicly available; it should help not only in making digital libraries more accessible but also provide a quantitative basis for linguistic explorations of historical Slovene texts and a prototype electronic dictionary of archaic Slovene.||1||0|