
infobox2rdf generates huge RDF datasets from the infobox data in Wikipedia dump files.


Title: Extraction of RDF Dataset from Wikipedia Infobox Data
Author(s): Jimmy K. Chiu, Thomas Y. Lee, Sau Dan Lee, Hailey H. Zhu, David W. Cheung
Keyword(s): (none listed)
Published in: (none listed)
Language: English
Date: 2010
Abstract: This paper outlines the cleansing and extraction process of infobox data from Wikipedia data dump into Resource Description Framework (RDF) triplets. The numbers of the extracted triplets, resources, and predicates are substantially large enough for many research purposes such as semantic web search. Our software tool will be open-sourced for researchers to produce up-to-date RDF datasets from routine Wikipedia data dumps.
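
To make the idea of turning infobox data into RDF triples concrete, here is a minimal Python sketch. It is not the infobox2rdf implementation, whose internals are not described here. It assumes a flat infobox with one `| key = value` pair per line, and the `example.org` namespaces, the function names, and the sample infobox are all hypothetical, chosen only for illustration. Each pair becomes one triple in N-Triples syntax: the page is the subject, the infobox key is the predicate, and the value is either a resource (for a `[[wiki link]]`) or a plain literal.

```python
import re
from urllib.parse import quote

# Hypothetical namespaces, chosen for illustration only.
RESOURCE_NS = "http://example.org/resource/"
PROPERTY_NS = "http://example.org/property/"

# Matches a value that consists of exactly one wiki link: [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"^\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]$")


def parse_infobox(wikitext):
    """Return (key, value) pairs from a flat infobox template.

    Assumes one `| key = value` pair per line; nested templates and
    multi-line values are out of scope for this sketch.
    """
    pairs = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue
        key, value = line[1:].split("=", 1)
        key, value = key.strip(), value.strip()
        if key and value:  # skip empty fields, a simple form of cleansing
            pairs.append((key, value))
    return pairs


def to_uri(namespace, name):
    """Build an N-Triples URI term from a namespace and a page or key name."""
    return "<" + namespace + quote(name.replace(" ", "_")) + ">"


def escape_literal(text):
    """Escape backslashes and double quotes for a single-line literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def infobox_to_ntriples(page_title, wikitext):
    """Convert one page's infobox into a list of N-Triples lines."""
    subject = to_uri(RESOURCE_NS, page_title)
    triples = []
    for key, value in parse_infobox(wikitext):
        predicate = to_uri(PROPERTY_NS, key)
        link = LINK_RE.match(value)
        if link:
            obj = to_uri(RESOURCE_NS, link.group(1).strip())
        else:
            obj = '"' + escape_literal(value) + '"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples


# Made-up infobox used only to exercise the functions above.
SAMPLE = """{{Infobox settlement
| name = Example Town
| population = 12345
| country = [[Exampleland]]
| mayor =
}}"""

if __name__ == "__main__":
    for triple in infobox_to_ntriples("Example Town", SAMPLE):
        print(triple)
```

Running the script prints one triple per non-empty infobox field. A real extraction from Wikipedia dumps would also have to handle nested templates, multi-line values, units, and inconsistent key names, which this sketch deliberately ignores.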