Alexander E. Richman
|Authorship||Publications (1), datasets (0), tools (0)|
|Citations||Total (0), average (0), median (0), max (0), min (0)|
Alexander E. Richman is an author.
Publications
Only those publications related to wikis are shown here.
|Title||Keyword(s)||Published in||Language||Date||Abstract||R||C|
|Mining Wiki Resources for Multilingual Named Entity Recognition||ACL-08: HLT, pp. 1-9||2008||In this paper, we describe a system by which the multilingual characteristics of Wikipedia can be utilized to annotate a large corpus of text with Named Entity Recognition (NER) tags requiring minimal human intervention and no linguistic expertise. This process, though of value in languages for which resources exist, is particularly useful for less commonly taught languages. We show how the Wikipedia format can be used to identify possible named entities and discuss in detail the process by which we use the Category structure inherent to Wikipedia to determine the named entity type of a proposed entity. We further describe the methods by which English language data can be used to bootstrap the NER process in other languages. We demonstrate the system by using the generated corpus as training sets for a variant of BBN's Identifinder in French, Ukrainian, Spanish, Polish, Russian, and Portuguese, achieving overall F-scores as high as 84.7% on independent, human-annotated corpora, comparable to a system trained on up to 40,000 words of human-annotated newswire.