Building distant supervised relation extractors
Abstract A well-known drawback in building machine learning semantic relation detectors for natural language is the lack of a large number of qualified training instances for the target relations in multiple languages. Even when good results are achieved, the datasets used by state-of-the-art approaches are rarely published. To address these problems, this work presents an automatic approach to building multilingual semantic relation detectors through distant supervision, combining two of the largest resources of structured and unstructured content available on the Web, DBpedia and Wikipedia. We map the DBpedia ontology back to the Wikipedia text to extract more than 100,000 training instances for more than 90 DBpedia relations in English and Portuguese without human intervention. First, we mine the Wikipedia articles to find candidate instances for relations described in the DBpedia ontology. Second, we preprocess and normalize the data, filtering out irrelevant instances. Finally, we use the normalized data to construct regularized logistic regression detectors that achieve an F-measure of more than 80% for both English and Portuguese. In this paper, we also compare the impact of different types of features on the accuracy of the trained detector, demonstrating significant performance improvements when combining lexical, syntactic and semantic features. Both the datasets and the code used in this research are available online.
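The candidate-mining step described in the abstract can be illustrated with a minimal sketch. It assumes plain substring matching of entity names against sentences, which is a simplification and not necessarily how the authors align DBpedia with Wikipedia text. The function name `label_sentences`, the toy triple, and the sentences are hypothetical and do not come from the published dataset.

```python
from typing import Dict, List, Tuple

# (subject, relation, object), in the spirit of a DBpedia triple
Triple = Tuple[str, str, str]


def label_sentences(triples: List[Triple], sentences: List[str]) -> List[Dict[str, str]]:
    """Distant supervision heuristic: a sentence that mentions both the
    subject and the object of a known triple becomes a candidate training
    instance for that triple's relation."""
    instances = []
    for sentence in sentences:
        for subj, relation, obj in triples:
            s_pos = sentence.find(subj)
            o_pos = sentence.find(obj)
            if s_pos == -1 or o_pos == -1:
                continue  # at least one entity is not mentioned
            # Keep the text between the two mentions as a simple lexical feature.
            if s_pos < o_pos:
                start, end = s_pos + len(subj), o_pos
            else:
                start, end = o_pos + len(obj), s_pos
            instances.append({
                "relation": relation,
                "subject": subj,
                "object": obj,
                "between": sentence[start:end].strip(),
            })
    return instances


# Toy inputs, for illustration only.
triples = [("Albert Einstein", "birthPlace", "Ulm")]
sentences = [
    "Albert Einstein was born in Ulm in 1879.",
    "Ulm is a city on the Danube.",
]
instances = label_sentences(triples, sentences)
```

Only the first sentence mentions both entities, so it alone becomes a candidate for `birthPlace`. Instances labeled this way are noisy, which is why the abstract describes a separate preprocessing and filtering step before the logistic regression detectors are trained.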
Bibtextype inproceedings
Doi 10.1109/ICSC.2014.15
Has author Nunes T., Schwabe D.
Has extra keyword Artificial intelligence, Information retrieval, Semantics, Automatic approaches, DBpedia, Distant Supervision, Logistic regressions, Portuguese languages, Relation extraction, State-of-the-art approach, Wikipedia, Semantic web
Has keyword DBpedia, Distant Supervision, Information extraction, Relation Extraction, Wikipedia
Isbn 9781479940028
Language English
Number of citations by publication 0
Number of references by publication 0
Pages 44–51
Published in Proceedings - 2014 IEEE International Conference on Semantic Computing, ICSC 2014
Title Building distant supervised relation extractors
Type conference paper
Year 2014
Creation date 6 November 2014 18:33:39
Categories Publications without license parameter, Publications without remote mirror parameter, Publications without archive mirror parameter, Publications without paywall mirror parameter, Conference papers, Publications without references parameter, Publications
Modification date 6 November 2014 18:33:39
Date 2014