A comparison of approaches for geospatial entity extraction from Wikipedia
|Author(s)||Woodward D., Witmer J., Kalita J.|
|Published in||Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010|
|Keyword(s)||Unknown (Extra: Entity extractions, F-measure, Geo coding, Geo-spatial, Geo-spatial data, Named entities, Process drives, Resolution process, Wikipedia, Data structures, Hidden Markov models, Semantics)|
A comparison of approaches for geospatial entity extraction from Wikipedia is a 2010 conference paper written in English by Woodward D., Witmer J., and Kalita J., published in Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010.
We target in this paper the challenge of extracting geospatial data from the article text of the English Wikipedia. We present the results of a Hidden Markov Model (HMM) based approach to identify location-related named entities in our corpus of Wikipedia articles, which are primarily about battles and wars due to their high geospatial content. The HMM NER process drives a geocoding and resolution process, whose goal is to determine the correct coordinates for each place name (often referred to as grounding). We compare our results to a previously developed data structure and algorithm for disambiguating place names that can have multiple coordinates. We demonstrate an overall F-measure of 79.63% in identifying and geocoding place names. Finally, we compare the results of the HMM-driven process to earlier work using a Support Vector Machine.
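The HMM-based identification step described in the abstract can be illustrated with a toy tagger. The following is only a minimal sketch, not the authors' implementation: the two-tag scheme (`O`, `LOC`), the probability tables `START`, `TRANS`, and `EMIT`, the `UNKNOWN` fallback, and the function names `viterbi` and `place_names` are all assumptions introduced here for illustration, and the probabilities are hand-set rather than trained on any corpus. It shows how Viterbi decoding picks the most likely tag sequence and how consecutive `LOC` tokens could be grouped into candidate place names for a later geocoding step.

```python
import math

STATES = ("O", "LOC")

# Hypothetical, hand-set probabilities for illustration only (not trained).
START = {"O": 0.8, "LOC": 0.2}
TRANS = {
    "O": {"O": 0.8, "LOC": 0.2},
    "LOC": {"O": 0.6, "LOC": 0.4},
}
EMIT = {
    "O": {"the": 0.3, "battle": 0.2, "of": 0.2,
          "was": 0.1, "fought": 0.1, "near": 0.1},
    "LOC": {"gettysburg": 0.5, "pennsylvania": 0.5},
}
UNKNOWN = 1e-6  # assumed fallback emission probability for unseen words


def viterbi(tokens):
    """Return the most likely tag sequence for tokens under the toy HMM."""
    if not tokens:
        return []
    words = [t.lower() for t in tokens]
    # Each column maps state -> (best log-probability, best previous state).
    columns = [{
        s: (math.log(START[s]) + math.log(EMIT[s].get(words[0], UNKNOWN)), None)
        for s in STATES
    }]
    for word in words[1:]:
        column = {}
        for s in STATES:
            # Choose the predecessor that maximizes the path score into s.
            prev, score = max(
                ((p, columns[-1][p][0] + math.log(TRANS[p][s])) for p in STATES),
                key=lambda pair: pair[1],
            )
            column[s] = (score + math.log(EMIT[s].get(word, UNKNOWN)), prev)
        columns.append(column)
    # Backtrace from the best final state.
    state = max(STATES, key=lambda s: columns[-1][s][0])
    tags = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        tags.append(state)
    tags.reverse()
    return tags


def place_names(tokens, tags):
    """Group consecutive LOC-tagged tokens into candidate place names."""
    names, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "LOC":
            current.append(token)
        elif current:
            names.append(" ".join(current))
            current = []
    if current:
        names.append(" ".join(current))
    return names


sentence = "The battle of Gettysburg was fought near Pennsylvania".split()
tags = viterbi(sentence)
candidates = place_names(sentence, tags)
```

In a full pipeline of the kind the abstract describes, each candidate name would then be passed to a geocoding and resolution step that selects one set of coordinates among the possible matches; that step is not sketched here.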
This publication is probably cited by others, but no articles for the citing works are available in WikiPapers. Cited 1 time.