Inducing gazetteer for Chinese named entity recognition based on local high-frequent strings
|Author(s)||Pang W., Fan X.|
|Published in||2009 2nd International Conference on Future Information Technology and Management Engineering, FITME 2009|
|Keyword(s)||Information extraction, Local high-frequent strings, Named entity recognition, Natural language processing (Extra: Chinese named entity recognition, Computational costs, Domain specific, High-precision, Novel methods, Wikipedia, Computational linguistics, Information analysis, Information technology, Natural language processing systems)|
Inducing gazetteer for Chinese named entity recognition based on local high-frequent strings is a 2009 conference paper written in English by Pang W. and Fan X. and published in the 2009 2nd International Conference on Future Information Technology and Management Engineering (FITME 2009).
Gazetteers, or entity dictionaries, are important for named entity recognition (NER). Although the dictionaries that previous methods extract automatically from a corpus, the web or Wikipedia are very large, they still miss some entities, especially domain-specific ones. We present a novel method of automatic entity dictionary induction that constructs a dictionary more specific to the text being processed, at a much lower computational cost than previous methods. It extracts the local high-frequent strings in a document as candidate entities and filters out invalid candidates using the accessor variety (AV) as the entity criterion. The experiments show that the resulting dictionary can effectively improve the performance of a high-precision NER baseline.
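The two steps named in the abstract, candidate extraction and AV filtering, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the length bounds, and the thresholds `min_freq` and `min_av` are assumptions made for illustration, and accessor variety is taken here in its usual sense as the smaller of the number of distinct characters seen immediately before a string and the number seen immediately after it, with the document boundaries counted as neighbours.

```python
from collections import Counter


def candidate_strings(text, min_len=2, max_len=4, min_freq=2):
    """Count every substring of length min_len..max_len in one document
    and keep those occurring at least min_freq times (the "local
    high-frequent strings"). All thresholds are illustrative assumptions."""
    counts = Counter(
        text[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s: c for s, c in counts.items() if c >= min_freq}


def accessor_variety(text, s):
    """Return min(distinct left neighbours, distinct right neighbours) of s.
    The start and end of the text count as the special neighbours <S>, <E>."""
    left, right = set(), set()
    start = text.find(s)
    while start != -1:
        end = start + len(s)
        left.add(text[start - 1] if start > 0 else "<S>")
        right.add(text[end] if end < len(text) else "<E>")
        start = text.find(s, start + 1)  # also visits overlapping matches
    return min(len(left), len(right))


def induce_gazetteer(text, min_av=2, **kwargs):
    """Keep candidates whose accessor variety reaches min_av (hypothetical cutoff)."""
    return sorted(
        s for s in candidate_strings(text, **kwargs)
        if accessor_variety(text, s) >= min_av
    )


# "XY" and "YZ" are frequent but always adjacent to the same character,
# so only the full string "XYZ" survives the AV filter.
print(induce_gazetteer("XYZaXYZbXYZ"))  # prints ['XYZ']
```

The example shows why a frequency count alone is not enough: fragments of a repeated name are just as frequent as the name itself, but they appear in only one left or right context, so a low AV removes them.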