Abstract
|
We target in this paper the challenge of extracting geospatial data from the article text of the English Wikipedia. We present the results of a Hidden Markov Model (HMM) based approach to identify location-related named entities in our corpus of Wikipedia articles, which are primarily about battles and wars due to their high geospatial content. The HMM NER process drives a geocoding and resolution process, whose goal is to determine the correct coordinates for each place name (often referred to as grounding). We compare our results to a previously developed data structure and algorithm for disambiguating place names that can have multiple coordinates. We demonstrate an overall f-measure of 79.63% identifying and geocoding place names. Finally, we compare the results of the HMM-driven process to earlier work using a Support Vector Machine.
|
Abstractsub
|
We target in this paper the challenge of extracting geospatial data from the article text of the English Wikipedia. We present the results of a Hidden Markov Model (HMM) based approach to identify location-related named entities in our corpus of Wikipedia articles, which are primarily about battles and wars due to their high geospatial content. The HMM NER process drives a geocoding and resolution process, whose goal is to determine the correct coordinates for each place name (often referred to as grounding). We compare our results to a previously developed data structure and algorithm for disambiguating place names that can have multiple coordinates. We demonstrate an overall f-measure of 79.63% identifying and geocoding place names. Finally, we compare the results of the HMM-driven process to earlier work using a Support Vector Machine.
|
Bibtextype
|
inproceedings +
|
Doi
|
10.1109/ICSC.2010.74 +
|
Has author
|
Daryl Woodward +
, Jeremy Witmer +
, Jugal Kalita +
|
Has extra keyword
|
Entity extractions +
, F-measure +
, Geo coding +
, Geo-spatial +
, Geo-spatial data +
, Named entities +
, Process drives +
, Resolution process +
, Wikipedia +
, Data structures +
, Hidden Markov models +
, Semantics +
|
Isbn
|
9780769541549 +
|
Language
|
English +
|
Number of citations by publication
|
0 +
|
Number of references by publication
|
0 +
|
Pages
|
402–407 +
|
Published in
|
Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010 +
|
Title
|
A comparison of approaches for geospatial entity extraction from Wikipedia +
|
Type
|
conference paper +
|
Year
|
2010 +
|
Creation date
|
6 November 2014 13:27:55 +
|
Categories
|
Publications without keywords parameter +
, Publications without license parameter +
, Publications without remote mirror parameter +
, Publications without archive mirror parameter +
, Publications without paywall mirror parameter +
, Conference papers +
, Publications without references parameter +
, Publications +
|
Modification date
|
6 November 2014 13:27:55 +
|
Redirect page
|
A comparison of approaches for geospatial entity extraction from Wikipedia +
|
Date
|
2010 +
|