- Not to be confused with Wikimedia dumps.
|Size||From a few MB to several GB|
|Language(s)||English, German, French, Dutch, Spanish, Chinese, Arabic, Japanese|
wikipediaXML can be used in a wide variety of XML information retrieval (IR) tasks, such as ad-hoc retrieval, categorization, clustering, and structure mapping.
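As a rough illustration of how one document from an XML collection like this might be prepared for such tasks, the sketch below extracts a title, flat text (useful for ad-hoc retrieval), and tag counts (a simple structural feature for categorization or clustering). The sample document and its element names (`article`, `name`, `body`, `section`, `p`) are assumptions made for this example and are not taken from the corpus documentation; the helper `extract_fields` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document. The element names used here are
# assumptions for illustration, not the documented corpus schema.
SAMPLE = """<article>
  <name id="42">Information retrieval</name>
  <body>
    <p>Information retrieval finds documents relevant to a query.</p>
    <section>
      <title>Evaluation</title>
      <p>Precision and recall are common measures.</p>
    </section>
  </body>
</article>"""


def extract_fields(xml_text):
    """Return id, title, flat text, and per-tag counts for one document."""
    root = ET.fromstring(xml_text)

    # Title and identifier are read from the (assumed) <name> element.
    name = root.find("name")
    title = name.text.strip() if name is not None and name.text else ""
    doc_id = name.get("id") if name is not None else None

    # Flatten all text under <body> and normalize whitespace.
    body = root.find("body")
    text = " ".join("".join(body.itertext()).split()) if body is not None else ""

    # Count how often each element occurs in the whole document.
    tags = Counter(el.tag for el in root.iter())

    return {"id": doc_id, "title": title, "text": text, "tags": tags}


doc = extract_fields(SAMPLE)
print(doc["title"], dict(doc["tags"]))
```

The flat text could feed a standard term index, while the tag counts give a crude structural signature; a real pipeline would adapt the element names to the actual files.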
|Title||Author(s)||Keyword(s)||Published in||Language||Date||Abstract||R||C|
|The Wikipedia XML corpus||Ludovic Denoyer||||||English||2006||Wikipedia is a well known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.||0||1|