Inducing gazetteer for Chinese named entity recognition based on local high-frequent strings

Inducing gazetteer for Chinese named entity recognition based on local high-frequent strings is a 2009 conference paper written in English by Pang W. and Fan X., published in the 2009 2nd International Conference on Future Information Technology and Management Engineering (FITME 2009).

Abstract

Gazetteers, or entity dictionaries, are important for named entity recognition (NER). Although the dictionaries that previous methods extract automatically from a corpus, the web, or Wikipedia are very large, they still miss some entities, especially domain-specific ones. We present a novel method of automatic entity dictionary induction that constructs a dictionary more specific to the text being processed, at a much lower computational cost than previous methods. It extracts the local high-frequent strings in a document as candidate entities and filters out invalid candidates using accessor variety (AV) as the entity criterion. Experiments show that the resulting dictionary effectively improves the performance of a high-precision NER baseline.
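
The two steps named in the abstract (collecting strings that recur within a single document, then filtering them by accessor variety) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common definition of AV as the smaller of the number of distinct characters seen immediately before a string and the number seen immediately after it. The function name induce_gazetteer, the parameters max_len, min_freq and min_av, and their default values are hypothetical, since the abstract gives no thresholds. Document boundaries are represented by a single sentinel marker on each side, which is a simplification.

```python
from collections import Counter, defaultdict


def induce_gazetteer(text, max_len=4, min_freq=3, min_av=2):
    """Return {string: (frequency, av)} for frequent strings that pass the AV filter.

    Hypothetical sketch: thresholds and the candidate length range are
    illustrative assumptions, not values taken from the paper.
    """
    freq = Counter()
    left = defaultdict(set)   # distinct characters seen just before each string
    right = defaultdict(set)  # distinct characters seen just after each string
    n = len(text)

    # Step 1: count every substring of 2..max_len characters in this document.
    for i in range(n):
        for k in range(2, max_len + 1):
            if i + k > n:
                break
            s = text[i:i + k]
            freq[s] += 1
            # Document start and end are recorded as sentinel neighbours.
            left[s].add(text[i - 1] if i > 0 else "<s>")
            right[s].add(text[i + k] if i + k < n else "</s>")

    # Step 2: keep locally frequent strings whose accessor variety is high enough.
    result = {}
    for s, f in freq.items():
        if f < min_freq:
            continue
        av = min(len(left[s]), len(right[s]))
        if av >= min_av:
            result[s] = (f, av)
    return result
```

On the toy document "王小明去北京。李华见王小明。王小明在上海。", the string 王小明 occurs three times with three distinct neighbours on each side, so it is kept. The fragments 王小 and 小明 are equally frequent but always have the same neighbour on one side (AV of 1), so the filter rejects them.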
