Chao Wang

From WikiPapers

Chao Wang is an author.


Only those publications related to wikis are shown here.
Each entry below lists the title, keyword(s), venue ("Published in"), language, date, abstract, and the wiki's two counters (R and C).
Title: Inferring attitude in online social networks based on quadratic correlation
Keywords: Machine learning, Quadratic optimization, Signed Networks
Published in: Lecture Notes in Computer Science
Language: English
Date: 2014
Abstract: The structure of an online social network in most cases cannot be described just by the links between its members. We study online social networks in which members may hold a positive or negative attitude toward each other, so the network consists of a mixture of positive and negative relationships. Our goal is to predict the sign of a given relationship based on the evidence available in the current snapshot of the network. More precisely, using machine learning techniques we develop a model that, after being trained on a particular network, predicts the sign of an unknown or hidden link. The model uses relationships and influences from peers as evidence for the prediction; however, the set of peers used is not predefined but is learned during training. We use quadratic correlation between peer members to train the predictor. The model is tested on popular online datasets such as Epinions, Slashdot, and Wikipedia, and in many cases shows almost perfect prediction accuracy. Moreover, the model can be efficiently updated as the underlying social network evolves.
R: 0, C: 0
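The quadratic-correlation model itself is not detailed in the abstract. As a rough illustration of the underlying idea of peer-based sign prediction, here is a minimal structural-balance sketch in which each common neighbor of a pair votes with the product of its two edge signs (function and data names are hypothetical, not from the paper):

```python
def predict_sign(signs, u, v):
    """Predict the sign of the (u, v) relationship from common peers.

    `signs` maps frozenset({a, b}) -> +1 or -1 for each known edge.
    Each common neighbor w votes with signs(u,w) * signs(w,v), following
    structural balance ("the friend of my friend is my friend");
    the majority vote decides, with ties broken toward +1.
    """
    def neighbors(x):
        return {b for edge in signs for b in edge if x in edge and b != x}

    common = neighbors(u) & neighbors(v)
    vote = sum(signs[frozenset({u, w})] * signs[frozenset({w, v})]
               for w in common)
    return 1 if vote >= 0 else -1
```

Unlike this fixed heuristic, the paper's model learns which peers to trust during training.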
Title: Network positions and contributions to online public goods: The case of Chinese Wikipedia
Keywords: Effort allocation, Mass collaboration, Natural experiment, Network centrality, Online public goods
Published in: Journal of Management Information Systems
Language: English
Date: 2012
Abstract: We study the effect of collaboration network structure on the contribution behavior of participating editors in Wikipedia. Collaboration in Wikipedia is organized around articles, and any two editors co-editing an article have a collaborative relationship. Based on economic theories of network games and social role theory, we propose that an editor's position in the collaboration network influences both the editor's total contribution and the allocation of her efforts. By leveraging panel data collected from the Chinese-language version of Wikipedia and a natural experiment resulting from the site being blocked in mainland China, we find strong support for the proposed effect of network position on contribution behavior. Our analysis further reveals that different aspects of an individual's network position have distinct implications. This research enhances our understanding of how collaboration network structure shapes individual behavior in online mass collaboration platforms. © 2012 M.E. Sharpe, Inc. All rights reserved.
R: 0, C: 0
Title: Network centrality and contributions to online public good - The case of Chinese Wikipedia
Published in: Proceedings of the Annual Hawaii International Conference on System Sciences
Language: English
Date: 2011
Abstract: Internet technology enables virtual collaboration and plays an important role in knowledge production. However, collaborative technology will not function without conducive underlying social mechanisms. Previous research mostly investigates the individual-level motivations of editors, with only a few exceptions examining collaboration relationships. In this paper, we take a structural perspective and investigate the impact of editors' positions (centralities) in the collaboration network on their efforts and effort allocations. To this end, we empirically reconstruct the dynamic collaboration network of Chinese Wikipedia for the period from 2002 to 2007. Based on a dynamic view of the network, we compose a panel data set that covers both the contribution behavior and the network position characteristics of Wikipedia editors. We strengthen our causal interpretation by leveraging the exogenous block that prevented Wikipedia editors in mainland China from accessing the website. We find distinctive effort allocation patterns that strongly correlate with network centrality measures, confirming theoretical predictions from recent developments in network economics and social network theory. This research enhances our understanding of how collaboration network structure shapes individuals' behavior in online collaboration platforms.
R: 0, C: 0
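The papers above build a co-editing network (two editors are linked if they edited a common article) and compute centrality measures over it. A minimal sketch of that construction with the simplest such measure, degree centrality, might look like this (data and function names are hypothetical; the papers use richer centrality measures):

```python
from collections import Counter
from itertools import combinations


def coedit_network(article_editors):
    """Undirected co-editing network: two editors are linked if they
    co-edited at least one article. `article_editors` maps an article
    id to the list of editors who touched it."""
    edges = set()
    for editors in article_editors.values():
        for a, b in combinations(sorted(set(editors)), 2):
            edges.add((a, b))
    return edges


def degree_centrality(edges):
    """Degree of each editor, normalized by n - 1 as is conventional."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(deg)
    return {v: d / (n - 1) for v, d in deg.items()} if n > 1 else {}
```

Rebuilding this network per time window yields the dynamic, panel-style view the papers describe.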
Title: Probabilistic quality assessment based on article's revision history
Published in: Lecture Notes in Computer Science
Language: English
Date: 2011
Abstract: The collaborative efforts of users of social media services such as Wikipedia have led to an explosion in user-generated content, and how to automatically tag the quality of that content is now a pressing concern. Each article typically goes through a series of revision phases, and articles of different quality classes exhibit specific revision cycle patterns. We propose to Assess Quality based on Revision History (AQRH) for a specific domain as follows. First, we use a Hidden Markov Model (HMM) to turn each article's revision history into a revision state sequence. Then, for each quality class, its revision cycle patterns are extracted and clustered into quality corpora. Finally, an article's quality is gauged by comparing the article's state sequence with the patterns of pre-classified documents in a probabilistic sense. We conduct experiments on a set of Wikipedia articles, and the results demonstrate that our method can accurately and objectively capture web articles' quality.
R: 0, C: 0
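The first AQRH step, decoding a revision history into a hidden state sequence with an HMM, is conventionally done with the Viterbi algorithm. A textbook sketch is below; the states, observations, and probabilities are invented for illustration, since the paper does not publish its model parameters:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for an observation list."""
    # V[t][s] = (best probability of reaching s at step t, predecessor).
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = V[-1]
        V.append({s: max((prev[p][0] * trans_p[p][s] * emit_p[s][o], p)
                         for p in states)
                  for s in states})
    # Backtrack from the best final state.
    best = max(states, key=lambda s: V[-1][s][0])
    path = [best]
    for layer in reversed(V[1:]):
        path.append(layer[path[-1]][1])
    return path[::-1]
```

Applied per article, the decoded state sequence is what gets compared against the patterns of pre-classified documents.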
Title: Probabilistic quality assessment of articles based on learning editing patterns
Keywords: Data quality, Quality assessment, Web article
Published in: 2011 International Conference on Computer Science and Service System, CSSS 2011 - Proceedings
Language: English
Date: 2011
Abstract: As distributed, collaborative information sources such as Wikipedia emerge, their content is constantly generated, updated, and maintained by various users, and its data quality varies over time. The quality assessment of this content is thus a pressing concern. We observe that each article usually goes through a series of editing phases, such as building structure, contributing text, and discussing text, gradually reaching its final quality state, and that articles of different quality classes exhibit specific edit cycle patterns. We propose a new approach to Assess Quality based on an article's Editing History (AQEH) for a specific domain as follows. First, each article's editing history is transformed into a state sequence using a Hidden Markov Model (HMM). Second, edit cycle patterns are extracted for each quality class, and each quality class is further refined into quality corpora by clustering. Each quality class is then clearly represented by a series of quality corpora, and each quality corpus is described by a group of frequently co-occurring edit cycle patterns. Finally, article quality can be determined in a probabilistic sense by comparing the article with the quality corpora. Experimental results demonstrate that our method can capture and predict web articles' quality accurately and objectively.
R: 0, C: 0
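The "frequently co-occurring edit cycle patterns" in the second AQEH step can be approximated by counting state n-grams across the decoded editing-state sequences of a quality class. A minimal sketch under that assumption (the paper's exact pattern definition is not given in the abstract):

```python
from collections import Counter


def frequent_patterns(sequences, n=2, min_count=2):
    """Count state n-grams across editing-state sequences and keep those
    occurring at least `min_count` times; these serve as the candidate
    edit cycle patterns characterizing a quality class."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return {pattern: c for pattern, c in counts.items() if c >= min_count}
```

Clustering the resulting pattern profiles would then yield the per-class quality corpora the abstract describes.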
Title: Relation extraction with relation topics
Published in: EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Language: English
Date: 2011
Abstract: This paper describes a novel approach to the semantic relation detection problem. Instead of relying only on the training instances for a new relation, we leverage the knowledge learned from previously trained relation detectors. Specifically, we detect a new semantic relation by projecting the new relation's training instances onto a lower-dimensional topic space constructed from existing relation detectors through a three-step process. First, we construct a large relation repository of more than 7,000 relations from Wikipedia. Second, we construct a set of non-redundant relation topics, defined at multiple scales from the relation repository, to characterize the existing relations. Similar to topics defined over words, each relation topic is an interpretable multinomial distribution over the existing relations. Third, we integrate the relation topics into a kernel function and use it together with an SVM to construct detectors for new relations. Experimental results on Wikipedia and ACE data confirm that background-knowledge-based topics generated from the Wikipedia relation repository can significantly improve performance over state-of-the-art relation detection approaches.
R: 0, C: 0
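Since each relation topic is a multinomial distribution, a kernel between two instances' topic representations can compare distributions directly. The paper's actual kernel is not specified in the abstract; one standard choice for comparing multinomials, shown here purely as an illustration, is the Bhattacharyya kernel:

```python
import math


def topic_kernel(p, q):
    """Bhattacharyya kernel between two topic distributions:
    k(p, q) = sum_i sqrt(p_i * q_i).  It equals 1 when p == q and 0
    when the distributions have disjoint support, and is a valid
    positive-definite kernel usable inside an SVM."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

Any such kernel over topic vectors can be plugged into an SVM via its precomputed-kernel interface.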
Title: Web article quality assessment in multi-dimensional space
Published in: Lecture Notes in Computer Science
Language: English
Date: 2011
Abstract: User-generated content (UGC) such as Wikipedia is emerging on the web at an explosive rate, but its data quality varies dramatically. How to effectively rate an article's quality is a focus of the research and industry communities. Considering that each quality class demonstrates specific characteristics along different quality dimensions, we propose to learn the web quality corpus by taking these quality dimensions into consideration. Each article is regarded as an aggregation of sections, and each section's quality is modelled using a Dynamic Bayesian Network (DBN) with reference to accuracy, completeness, and consistency. Each quality class is represented by three dimension corpora, namely an accuracy corpus, a completeness corpus, and a consistency corpus. Finally, we propose two schemes to compute quality rankings. Experiments show that our approach performs well.
R: 0, C: 0
Title: Efficient indices using graph partitioning in RDF triple stores
Published in: Proceedings - International Conference on Data Engineering
Language: English
Date: 2009
Abstract: With the advance of the Semantic Web, increasing amounts of RDF data are generated, published, queried, and reused via the web. For example, DBpedia, a community effort to extract structured data from Wikipedia articles, surpassed 100 million RDF triples in its latest release. Likewise, the Linking Open Data (LOD) project, initiated by Tim Berners-Lee, has published and interlinked many openly licensed datasets comprising over 2 billion RDF triples so far. In this context, fast query response over such large-scale data is one of the challenges facing existing RDF data stores. In this paper, we propose a novel triple indexing scheme to help an RDF query engine quickly locate instances within a small scope. Considering the RDF data as a graph, we partition the graph into multiple subgraph pieces and store them individually; over these, a signature tree is built to index the URIs. When a query arrives, the signature tree index is used to quickly locate the partitions that might include matches for the query, based on its constant URIs. Our experiments indicate that the indexing scheme dramatically reduces query processing time in most cases, because many partitions are filtered out early and the expensive exact matching is performed only over a small scope of the original dataset.
R: 0, C: 0
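The core filtering idea, ruling out partitions that cannot contain a query's constant URIs, can be sketched with flat bitmask signatures (one hashed bit per URI, OR-ed together, much like a Bloom filter). This is a simplification for illustration; the paper organizes such signatures into a tree, and all names here are hypothetical:

```python
import hashlib


def _bit(uri, bits=64):
    """Deterministic bit position for a URI."""
    return int(hashlib.sha1(uri.encode()).hexdigest(), 16) % bits


def signature(uris, bits=64):
    """OR together the hash bits of each URI in a partition."""
    sig = 0
    for u in uris:
        sig |= 1 << _bit(u, bits)
    return sig


def candidate_partitions(partitions, query_uris, bits=64):
    """Keep only partitions whose signature covers every query URI's bit.
    A match may be a false positive (hash collisions), but a partition
    that truly contains all query URIs is never filtered out."""
    q = signature(query_uris, bits)
    return [pid for pid, uris in partitions.items()
            if signature(uris, bits) & q == q]
```

Exact subgraph matching then runs only inside the surviving candidate partitions, which is where the reported speedup comes from.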
Title: Community tools for repurposing learning objects
Keywords: Community of practice, Contextual metadata, Learning objects
Published in: Lecture Notes in Computer Science
Language: English
Date: 2007
Abstract: A critical success factor for the reuse of learning objects is the ease with which they may be repurposed to enable reusability in a teaching context different from the one for which they were originally designed. The current generation of tools for creating, storing, describing, and locating learning objects is best suited to users with technical expertise. Such tools are an obstacle to teachers who might wish to alter learning objects to make them suitable for their context. In this paper, we describe a simple set of tools that enables practitioners to adapt the content of existing learning objects and to store and modify metadata describing the intended teaching context of these learning objects. We are deploying and evaluating these tools within the UK language teaching community.
R: 0, C: 0
Title: Creating and managing ontology data on the web: A semantic wiki approach
Published in: Lecture Notes in Computer Science
Language: English
Date: 2007
Abstract: Creating ontology data on web sites and managing it properly would help the growth of the Semantic Web. This paper proposes a semantic wiki approach to this issue. We discuss the functions that a semantic wiki approach should implement to offer a better solution, and identify and analyze key problems such as usability, data reliability, and data quality. On this basis, a system framework is presented to show how these functions are designed, and the functions are further explained along with a description of our implemented prototype system. By addressing the identified key problems, our semantic wiki approach is expected to create and manage web ontology data more effectively.
R: 0, C: 0
Title: Creating and managing ontology data on the web: a semantic wiki approach
Published in: WISE
Language: English
Date: 2007
R: 0, C: 0