wikipediaXML can be used in a wide variety of XML IR tasks, such as ad-hoc retrieval, categorization, clustering, or structure mapping.


Title: The Wikipedia XML corpus
Author(s): Ludovic Denoyer, Patrick Gallinari
Language: English
Year: 2006
Abstract: Wikipedia is a well known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.