A comparison of approaches for geospatial entity extraction from Wikipedia

A comparison of approaches for geospatial entity extraction from Wikipedia is a 2010 conference paper written in English by Woodward D., Witmer J., Kalita J. and published in Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010.

Abstract

We target in this paper the challenge of extracting geospatial data from the article text of the English Wikipedia. We present the results of a Hidden Markov Model (HMM) based approach to identify location-related named entities in our corpus of Wikipedia articles, which are primarily about battles and wars due to their high geospatial content. The HMM NER process drives a geocoding and resolution process, whose goal is to determine the correct coordinates for each place name (often referred to as grounding). We compare our results to a previously developed data structure and algorithm for disambiguating place names that can have multiple coordinates. We demonstrate an overall f-measure of 79.63% identifying and geocoding place names. Finally, we compare the results of the HMM-driven process to earlier work using a Support Vector Machine.
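
As a rough illustration of how an HMM can tag location names in text, the sketch below runs Viterbi decoding over a two-state model (LOC for location tokens, O for everything else). This is not the authors' model: the state set, the probabilities in START, TRANS and EMIT, the UNSEEN fallback and the function name viterbi are all hypothetical values chosen for the example, and a real system would estimate them from annotated articles.

```python
import math

# Hypothetical two-state tag set: LOC = location token, O = other.
STATES = ["LOC", "O"]

# Hypothetical toy parameters, NOT taken from the paper.
START = {"LOC": 0.2, "O": 0.8}
TRANS = {
    "LOC": {"LOC": 0.4, "O": 0.6},
    "O": {"LOC": 0.2, "O": 0.8},
}
EMIT = {
    "LOC": {"gettysburg": 0.5, "pennsylvania": 0.4},
    "O": {"the": 0.3, "battle": 0.2, "of": 0.3, "in": 0.15},
}
UNSEEN = 1e-6  # assumed fallback probability for words a state never emitted


def emit_logp(state, word):
    """Log emission probability, with a small floor for unseen words."""
    return math.log(EMIT[state].get(word, UNSEEN))


def viterbi(tokens):
    """Return the most probable tag sequence for tokens under the toy HMM."""
    words = [t.lower() for t in tokens]
    if not words:
        return []

    # scores[s] = best log probability of any path ending in state s.
    scores = {s: math.log(START[s]) + emit_logp(s, words[0]) for s in STATES}
    backpointers = []

    for word in words[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best previous state for arriving in s at this position.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + emit_logp(s, word)
            )
            pointers[s] = prev
        backpointers.append(pointers)
        scores = new_scores

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    sentence = "The Battle of Gettysburg in Pennsylvania".split()
    for token, tag in zip(sentence, viterbi(sentence)):
        print(token, tag)
```

With these toy parameters, "Gettysburg" and "Pennsylvania" are tagged LOC and the remaining tokens O. In a pipeline like the one the abstract describes, the tokens tagged as locations would then be passed to the geocoding and resolution step, which is not shown here.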

Cited by

Cited 1 time. No articles for the citing publications are available in WikiPapers.