Graph local clustering for topic detection in web collections
|Author(s)||Garza S.E., Brena R.|
|Published in||2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference|
|Keyword(s)||Graph clustering, Topic detection, Web hyperlink structure mining, Wikipedia (Extra: Document organization, Formal framework, Information domains, Local clustering, Topic Discovery, Web collections, Web hyperlink structure, Fluorine containing polymers, Hypertext systems, Clustering algorithms)|
Graph local clustering for topic detection in web collections is a 2009 conference paper written in English by Garza S.E. and Brena R., published in the 2009 Latin American Web Congress - Joint LA-WEB/CLIHC Conference.
As the Web keeps growing at a steady pace, automatic document organization becomes important. One way to arrange documents is to categorize them into topics. Although topics and their extraction can be conceived in different ways, a practical option is to view topics as document groups and apply clustering algorithms. An attractive alternative that copes naturally with the size and complexity of the Web is offered by graph local clustering (GLC) methods. In this paper, we define a formal framework for working with topics in hyperlinked environments and analyze the feasibility of GLC for this task. We performed tests over an important Web collection, namely Wikipedia, and our results, which were validated using several kinds of methods (some of them specific to the information domain), indicate that this approach is suitable for topic discovery.
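To make the idea of graph local clustering concrete, here is a minimal Python sketch of one generic variant: starting from a seed page, it greedily adds the neighbouring page that most improves a local fitness score and stops when no neighbour helps. This is an illustration only, not the method of the paper. The fitness function (`relative_density`, the share of a group's edges that stay inside the group), the function names, the `max_size` cap, and the toy link graph are all assumptions, and hyperlinks are treated as undirected for simplicity.

```python
def relative_density(graph, cluster):
    """Share of the edges touching `cluster` that have both ends inside it."""
    internal_ends = 0  # each internal edge is seen once from each endpoint
    external = 0       # edges leaving the cluster
    for u in cluster:
        for v in graph[u]:
            if v in cluster:
                internal_ends += 1
            else:
                external += 1
    internal = internal_ends // 2
    total = internal + external
    return internal / total if total else 0.0


def local_cluster(graph, seed, max_size=50):
    """Greedily grow a cluster around `seed` (hypothetical helper).

    Only the seed's neighbourhood is explored, so the rest of the
    graph never has to be visited.
    """
    cluster = {seed}
    score = relative_density(graph, cluster)
    while len(cluster) < max_size:
        # Candidate nodes: neighbours of the cluster not yet in it.
        frontier = {v for u in cluster for v in graph[u]} - cluster
        best_node, best_score = None, score
        for v in sorted(frontier):  # sorted for deterministic tie-breaking
            candidate_score = relative_density(graph, cluster | {v})
            if candidate_score > best_score:
                best_node, best_score = v, candidate_score
        if best_node is None:  # no neighbour improves the score
            break
        cluster.add(best_node)
        score = best_score
    return cluster


# Toy undirected link graph (assumed data): two triangles joined by c-d.
toy_links = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

print(sorted(local_cluster(toy_links, "a")))  # ['a', 'b', 'c']
print(sorted(local_cluster(toy_links, "e")))  # ['d', 'e', 'f']
```

In this toy graph each seed recovers its own triangle, because adding a node across the bridge lowers the score from 0.75 to about 0.67. The cost of growing a cluster depends on the seed's neighbourhood rather than on the size of the whole graph, which is the property that makes local methods attractive for large collections.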
This publication is probably cited by others, but WikiPapers has no articles for the citing works. Cited 2 times.