A comparison of dimensionality reduction techniques for Web structure mining
Abstract In many domains, dimensionality reduction techniques have been shown to be very effective for elucidating the underlying semantics of data. Thus, in this paper we investigate the use of various dimensionality reduction techniques (DRTs) to extract the implicit structures hidden in the web hyperlink connectivity. We apply and compare four DRTs, namely, Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA) and Random Projection (RP). Experiments conducted on three datasets allow us to assert the following: NMF outperforms PCA and ICA in terms of stability and interpretability of the discovered structures; the well-known WebKb dataset used in a large number of works about the analysis of the hyperlink connectivity seems to be not adapted for this task and we suggest rather to use the recent Wikipedia dataset which is better suited.
Bibtextype inproceedings  +
Doi 10.1109/WI.2007.6  +
Has author Chikhi N.F. + , Rothenburger B. + , Aussenac-Gilles N. +
Has extra keyword Dataset + , Dimensionality reduction techniques + , Hyperlink + , Independent component Analysis (ICA) + , International conferences + , Interpretability + , Non-Negative Matrix Factorization + , Principal component analysis (PCA) + , Random projections + , Web intelligence + , Web structure mining + , Wikipedia + , Blind source separation + , Factorization + , Feature extraction + , Financial data processing + , Hemodynamics + , Hypertext systems + , Independent component analysis + , Information theory + , Matrix algebra + , Principal component analysis +
Isbn 0769530265; 9780769530260  +
Language English +
Number of citations by publication 0  +
Number of references by publication 0  +
Pages 116–119  +
Published in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, WI 2007 +
Title A comparison of dimensionality reduction techniques for Web structure mining +
Type conference paper  +
Year 2007 +
Creation date 6 November 2014 16:11:19  +
Categories Publications without keywords parameter  + , Publications without license parameter  + , Publications without remote mirror parameter  + , Publications without archive mirror parameter  + , Publications without paywall mirror parameter  + , Conference papers  + , Publications without references parameter  + , Publications  +
Modification date 6 November 2014 16:11:19  +
Date 2007  +