Graph structures


Graph structures is included as a keyword or extra keyword in 0 datasets, 0 tools and 5 publications.

Datasets

There are no datasets for this keyword.

Tools

There are no tools for this keyword.


Publications

Title Author(s) Published in Language Date Abstract R C
Analysis of cluster structure in large-scale English Wikipedia category networks Klaysri T.
Fenner T.
Lachish O.
Mark Levene
Papapetrou P.
Lecture Notes in Computer Science English 2013 In this paper we propose a framework for analysing the structure of a large-scale social media network, a topic of significant recent interest. Our study is focused on the Wikipedia category network, where nodes correspond to Wikipedia categories and edges connect two nodes if the nodes share at least one common page within the Wikipedia network. Moreover, each edge is given a weight that corresponds to the number of pages shared between the two categories that it connects. We study the structure of category clusters within the three complete English Wikipedia category networks from 2010 to 2012. We observe that category clusters appear in the form of well-connected components that are naturally clustered together. For each dataset we obtain a graph, which we call the t-filtered category graph, by retaining just a single edge linking each pair of categories for which the weight of the edge exceeds some specified threshold t. Our framework exploits this graph structure and identifies connected components within the t-filtered category graph. We studied the large-scale structural properties of the three Wikipedia category networks using the proposed approach. We found that the number of categories, the number of clusters of size two, and the size of the largest cluster within the graph all appear to follow power laws in the threshold t. Furthermore, for each network we found the value of the threshold t for which increasing the threshold to t + 1 caused the "giant" largest cluster to diffuse into two or more smaller clusters of significant size and studied the semantics behind this diffusion. 0 0
Crew: cross-modal resource searching by exploiting Wikipedia Chen Liu
Beng C. Ooi
Anthony K. H. Tung
Dongxiang Zhang
English 2010 In Web 2.0, users have generated and shared massive amounts of resources in various media formats, such as news, blogs, audios, photos and videos. The abundance and diversity of the resources call for better integration to improve the accessibility. A straightforward approach is to link the resources via tags so that resources from different modals sharing the same tag can be connected as a graph structure. This naturally motivates a new kind of information retrieval system, named cross-modal resource search, in which given a query object from any modal, all the related resources from other modals can be retrieved in a convenient manner. However, due to the tag homonym and synonym, such an approach returns results of low quality because resources with the same tag but not semantically related will be directly connected as well. In this paper, we propose to build the resource graph and perform query processing by exploiting Wikipedia. We construct a concept middle-ware between the layer of tags and resources to fully capture the semantic meaning of the resources. Such a cross-modal search system based on Wikipedia, named Crew, is built and demonstrates promising search results. 0 0
Learning concept graphs from text with stick-breaking priors Chambers A.L.
Smyth P.
Steyvers M.
Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 English 2010 We present a generative probabilistic model for learning general graph structures, which we term concept graphs, from text. Concept graphs provide a visual summary of the thematic content of a collection of documents, a task that is difficult to accomplish using only keyword search. The proposed model can learn different types of concept graph structures and is capable of utilizing partial prior knowledge about graph structure as well as labeled documents. We describe a generative model that is based on a stick-breaking process for graphs, and a Markov Chain Monte Carlo inference procedure. Experiments on simulated data show that the model can recover known graph structure when learning in both unsupervised and semi-supervised modes. We also show that the proposed model is competitive in terms of empirical log likelihood with existing structure-based topic models (hPAM and hLDA) on real-world text data sets. Finally, we illustrate the application of the model to the problem of updating Wikipedia category graphs. 0 0
Wikipedia as sense inventory to improve diversity in Web search results Santamaria C.
Gonzalo J.
Artiles J.
ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference English 2010 Is it possible to use sense inventories to improve Web search results diversity for one word queries? To answer this question, we focus on two broad-coverage lexical resources of a different nature: WordNet, as a de facto standard used in Word Sense Disambiguation experiments; and Wikipedia, as a large coverage, updated encyclopaedic resource which may have a better coverage of relevant senses in Web pages. Our results indicate that (i) Wikipedia has a much better coverage of search results, (ii) the distribution of senses in search results can be estimated using the internal graph structure of the Wikipedia and the relative number of visits received by each sense in Wikipedia, and (iii) associating Web pages to Wikipedia senses with simple and efficient algorithms, we can produce modified rankings that cover 70% more Wikipedia senses than the original search engine rankings. 0 0
A graph-based approach to named entity categorization in Wikipedia using conditional random fields Watanabe Y.
Asahara M.
Matsumoto Y.
EMNLP-CoNLL 2007 - Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning English 2007 This paper presents a method for categorizing named entities in Wikipedia. In Wikipedia, an anchor text is glossed in a linked HTML text. We formalize named entity categorization as a task of categorizing anchor texts with linked HTML texts which gloss a named entity. Using this representation, we introduce a graph structure in which anchor texts are regarded as nodes. In order to incorporate HTML structure on the graph, three types of cliques are defined based on the HTML tree structure. We propose a method with Conditional Random Fields (CRFs) to categorize the nodes on the graph. Since the defined graph may include cycles, the exact inference of CRFs is computationally expensive. We introduce an approximate inference method using Tree-based Reparameterization (TRP) to reduce computational cost. In experiments, our proposed model obtained significant improvements compared to baseline models that use Support Vector Machines. 0 0
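
Illustrative sketch (not taken from any of the listed papers): the abstract of "Analysis of cluster structure in large-scale English Wikipedia category networks" describes keeping only category pairs whose shared-page weight exceeds a threshold t and then identifying connected components. The Python below is one possible way to express that idea with a union-find structure. The function name `t_filtered_components`, the `(category, category, weight)` edge format, the toy data, and the choice to drop categories left without any surviving edge are all assumptions made for this example; the authors' actual implementation may differ.

```python
from collections import defaultdict


def t_filtered_components(weighted_edges, t):
    """Connected components of the graph that keeps only edges with weight > t.

    weighted_edges: iterable of (category_a, category_b, shared_page_count).
    Assumption: categories with no surviving edge are left out of the result.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, weight in weighted_edges:
        if weight > t:  # the t-filter: keep the edge only above the threshold
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)

    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical toy data: weights are numbers of pages shared by two categories.
edges = [
    ("Graphs", "Networks", 5),
    ("Networks", "Trees", 2),
    ("Trees", "Forests", 4),
    ("Graphs", "Trees", 1),
]

for t in (1, 2, 5):
    print(t, t_filtered_components(edges, t))
```

On this toy input, raising t from 1 to 2 splits the single four-category cluster into two clusters of size two, a small-scale analogue of the "giant" cluster breaking apart that the abstract reports.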