Automatic link detection: A sequence labeling approach
Abstract The popularity of Wikipedia and other online knowledge bases has recently produced an interest in the machine learning community for the problem of automatic linking. Automatic hyperlinking can be viewed as two sub-problems: link detection, which determines the source of a link, and link disambiguation, which determines the destination of a link. Wikipedia is a rich corpus with hyperlink data provided by authors. It is possible to use this data to train classifiers to be able to mimic the authors in some capacity. In this paper, we introduce automatic link detection as a sequence labeling problem. Conditional random fields (CRFs) are a probabilistic framework for labeling sequential data. We show that training a CRF with different types of features from the Wikipedia dataset can be used to automatically detect links with almost perfect precision and high recall. Copyright 2009 ACM.
Bibtextype inproceedings  +
Doi 10.1145/1645953.1646208  +
Has author Gardner J.J. + , Xiong L. +
Has extra keyword Automatic linking + , Automatic links + , Conditional random field + , Dataset + , Hyperlinking + , Hyperlinks + , Knowledge basis + , Machine learning communities + , Probabilistic framework + , Sequence Labeling + , Sequential data + , Sub-problems + , Wikipedia + , Hypertext systems + , Knowledge management + , Semantic web + , Semantics + , Labeling +
Has keyword Data mining + , Semantic web + , Sequence labeling + , Wikipedia +
Isbn 9781605585123  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 1701–1704  +
Published in International Conference on Information and Knowledge Management, Proceedings +
Title Automatic link detection: A sequence labeling approach +
Type conference paper  +
Year 2009 +
Creation date 6 November 2014 18:39:35  +
Categories Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 6 November 2014 18:39:35  +
Date 2009  +