Catriple: Extracting triples from wikipedia categories
Abstract As an important step towards bootstrapping the Semantic Web, many efforts have been made to extract triples from Wikipedia because of its wide coverage, good organization and rich knowledge. One kind of important triples is about Wikipedia articles and their non-isa properties, e.g. (Beijing, country, China). Previous work has tried to extract such triples from Wikipedia infoboxes, article text and categories. The infobox-based and text-based extraction methods depend on the infoboxes and suffer from a low article coverage. In contrast, the category-based extraction methods exploit the widespread categories. However, they rely on predefined properties, which is too effort-consuming and explores only very limited knowledge in the categories. This paper automatically extracts properties and triples from the less explored Wikipedia categories so as to achieve a wider article coverage with less manual effort. We manage to realize this goal by utilizing the syntax and semantics brought by super-sub category pairs in Wikipedia. Our prototype implementation outputs about 10M triples with a 12-level confidence ranging from 47.0% to 96.4%, which cover 78.2% of Wikipedia articles. Among them, 1.27M triples have confidence of 96.4%. Applications can on demand use the triples with suitable confidence.
Bibtextype inproceedings  +
Citeulike 6516365  +
Doi 10.1007/978-3-540-89704-0_23  +
Has author Qiaoling Liu + , Kaifeng Xu + , Lei Zhang + , Haofen Wang + , Yiqin Yu + , Yue Pan +
Has extra keyword Semantic web + , Semantics + , Software prototyping + , Beijing + , Extraction methods + , On demands + , Prototype implementations + , Wikipedia + , Information theory +
Has remote mirror http://mathcs.emory.edu/~qliu26/docs/aswc08.pdf  +
Isbn 3540897038; 9783540897033  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 330–344  +
Published in Lecture Notes in Computer Science +
Title Catriple: Extracting triples from wikipedia categories +
Type conference paper  +
Volume 5367 LNCS  +
Year 2008 +
Creation date 7 November 2014 03:38:39  +
Categories Publications without keywords parameter  + , Publications without license parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 22 November 2014 19:13:06  +
Date 2008  +