Overview of the INEX 2008 XML mining track: categorization and clustering of XML documents in a graph of documents
|Author(s)||Denoyer L., Gallinari P.|
|Published in||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Keyword(s)||Unknown (Extra: Classification and clustering, Content information, Developed model, Hyperlinks, Internal structure, Key problems, Machine learning techniques, Machine-learning, Semi-structured documents, Supervised classification, Unsupervised clustering, Wikipedia, XML mining, Hypertext systems, Information use, Learning algorithms, Markup languages, Robot learning, XML, Information retrieval systems)|
Overview of the INEX 2008 XML mining track: categorization and clustering of XML documents in a graph of documents is a 2009 conference paper written in English by L. Denoyer and P. Gallinari and published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
We describe here the XML Mining Track at INEX 2008. The track was launched to explore two main ideas: first, identifying key problems and new challenges in the emerging field of mining semi-structured documents; and second, studying and assessing the potential of machine learning techniques for generic machine learning (ML) tasks in the structured domain, i.e. classification and clustering of semi-structured documents. This year, the track focuses on the supervised classification and the unsupervised clustering of XML documents using link information. We consider a corpus of about 100,000 Wikipedia pages with their associated hyperlinks. The participants developed models using the content information, the internal structure of the XML documents, and the link information between documents.
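The three information sources named in the abstract (content, internal structure, and links) can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not a method from the paper or from any participant: the toy document, the `<link target="...">` markup, and the `extract_features` helper are all hypothetical, and the real INEX Wikipedia corpus may use a different schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy document; the <link target="..."> markup is an assumption,
# not the actual schema of the INEX Wikipedia corpus.
DOC = """<article id="d1">
  <title>XML mining</title>
  <body>
    <p>Mining XML documents with <link target="d2">machine learning</link>.</p>
    <p>See also <link target="d3">clustering</link>.</p>
  </body>
</article>"""


def extract_features(xml_text):
    """Return (content, structure, links) features for one XML document."""
    root = ET.fromstring(xml_text)

    # Content: lowercase word counts over all text in the document.
    words = " ".join(root.itertext()).lower().replace(".", " ").split()
    content = Counter(words)

    # Structure: counts of root-to-element tag paths.
    structure = Counter()

    def walk(elem, path):
        path = path + "/" + elem.tag
        structure[path] += 1
        for child in elem:
            walk(child, path)

    walk(root, "")

    # Links: identifiers of the documents this one points to.
    links = [e.get("target") for e in root.iter("link")]
    return content, structure, links


content, structure, links = extract_features(DOC)
```

A classifier or clustering algorithm could then consume any combination of these three feature sets; how best to combine them is the question the track leaves open to participants.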
This publication has been cited 8 times, but no articles for the citing works are available in WikiPapers.