Xiaodan Zhang

From WikiPapers

Xiaodan Zhang is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
Mining knowledge on relationships between objects from the web Content-based image retrieval
Knowledge retrieval
Relationships between objects
Data mining
IEICE Transactions on Information and Systems English 2014 How do global warming and agriculture influence each other? It is possible to answer the question by searching knowledge about the relationship between global warming and agriculture. As exemplified by this question, strong demands exist for searching relationships between objects. Mining knowledge about relationships on Wikipedia has been studied. However, it is desired to search more diverse knowledge about relationships on the Web. By utilizing the objects constituting relationships mined from Wikipedia, we propose a new method to search images with surrounding text that include knowledge about relationships on the Web. Experimental results show that our method is effective and applicable in searching knowledge about relationships. We also construct a relationship search system named "Enishi" based on the proposed new method. Enishi supplies a wealth of diverse knowledge including images with surrounding text to help users to understand relationships deeply, by complementarily utilizing knowledge from Wikipedia and the Web. 0 0
A generalized flow-based method for analysis of implicit relationships on wikipedia Generalized flow
Link analysis
Relationship
Data mining
IEEE Transactions on Knowledge and Data Engineering English 2013 We focus on measuring relationships between pairs of objects in Wikipedia whose pages can be regarded as individual objects. Two kinds of relationships between two objects exist: in Wikipedia, an explicit relationship is represented by a single link between the two pages for the objects, and an implicit relationship is represented by a link structure containing the two pages. Some of the previously proposed methods for measuring relationships are cohesion-based methods, which underestimate objects having high degrees, although such objects could be important in constituting relationships in Wikipedia. The other methods are inadequate for measuring implicit relationships because they use only one or two of the following three important factors: distance, connectivity, and cocitation. We propose a new method using a generalized maximum flow which reflects all the three factors and does not underestimate objects having high degree. We confirm through experiments that our method can measure the strength of a relationship more appropriately than these previously proposed methods do. Another remarkable aspect of our method is mining elucidatory objects, that is, objects constituting a relationship. We explain that mining elucidatory objects would open a novel way to deeply understand a relationship. 0 0
Towards accurate distant supervision for relational facts extraction ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference English 2013 Distant supervision (DS) is an appealing learning method which learns from existing relational facts to extract more from a text corpus. However, the accuracy is still not satisfying. In this paper, we point out and analyze some critical factors in DS which have great impact on accuracy, including valid entity type detection, negative training examples construction and ensembles. We propose an approach to handle these factors. By experimenting on Wikipedia articles to extract the facts in Freebase (the top 92 relations), we show the impact of these three factors on the accuracy of DS and the remarkable improvement led by the proposed approach. 0 0
Research on the construction of open education resources based on semantic wiki Open education resources
Resource co-construction
Semantic wiki
Lecture Notes in Computer Science English 2012 Since MIT's OpenCourseWare project began in 2001, the open education resources movement has been under way for more than ten years. Alongside its fruitful results, some problems of resource construction have also been exposed. Some open education resources projects could not be carried out, or were even forced to shut down, because of a shortage of personnel or funds. A lack of uniform norms or standards leads to duplicated resource construction and low resource utilization. Semantic MediaWiki combines the openness, self-organization and collaboration of wikis with the structured knowledge of the Semantic Web, which meets the needs of resource co-construction and sharing in the open education resources movement. In this study, based on the online course Education Information Processing, we explore the application of Semantic MediaWiki to open education resources construction. 0 0
Location-based information fusion for mobile navigation Location-based social network
Navigation
UbiComp'11 - Proceedings of the 2011 ACM Conference on Ubiquitous Computing English 2011 Comprehensive yet personalized information for a location is usually desired by mobile users in situ. Traditional navigation systems provide complete static information, such as address, contact, even photos and reviews for a certain place. However, such information does not reflect the real time situation (e.g. popularity/crowdedness). Location-based social networks provide an opportunity to build social dynamics between the place and potential visitors. In this work, we propose a design by leveraging public online information with users' social network resources to provide real time exploration in novel environments. A mobile application is implemented using Wikipedia, Panoramio, and Foursquare data to provide complete, updated, and trustworthy information. Design highlights and implementation are reported. 0 0
Network centrality and contributions to online public good - The case of Chinese Wikipedia Proceedings of the Annual Hawaii International Conference on System Sciences English 2011 Internet technology enables virtual collaboration and plays an important role in knowledge production. However, collaborative technology will not function without conducive underlying social mechanisms. Previous research mostly investigates individual-level motivations of editors, with only a few exceptions examining the collaboration relationships. In this paper, we take a structural perspective and investigate the impact of positions (centralities) of editors in the collaboration networks on their efforts and effort allocations. To achieve this, we empirically reconstruct the dynamic collaboration network of Chinese Wikipedia for the period between 2002 and 2007. Based on a dynamic view of the network, we compose a panel data set that covers both the contribution behavior and network position characteristics of Wikipedia editors. We strengthen our causal interpretation by leveraging the exogenous block that prevented Wikipedia editors in Mainland China from accessing the website. We find distinctive effort allocation patterns that strongly correlate with network centrality measures. This confirms theoretical predictions derived in recent developments in network economics and social network theories. This research enhances our understanding about how collaboration network structure shapes individuals' behavior in online collaboration platforms. 0 0
Analysis of implicit relations on wikipedia: Measuring strength through mining elucidatory objects Generalized flow
Link analysis
Relation
Data mining
Lecture Notes in Computer Science English 2010 We focus on measuring relations between pairs of objects in Wikipedia whose pages can be regarded as individual objects. Two kinds of relations between two objects exist: in Wikipedia, an explicit relation is represented by a single link between the two pages for the objects, and an implicit relation is represented by a link structure containing the two pages. Previously proposed methods are inadequate for measuring implicit relations because they use only one or two of the following three important factors: distance, connectivity, and co-citation. We propose a new method reflecting all the three factors by using a generalized maximum flow. We confirm that our method can measure the strength of a relation more appropriately than these previously proposed methods do. Another remarkable aspect of our method is mining elucidatory objects, that is, objects constituting a relation. We explain that mining elucidatory objects opens a novel way to deeply understand a relation. 0 0
Enishi: Searching knowledge about relations by complementarily utilizing wikipedia and the web Knowledge retrieval
Relation
Data mining
Lecture Notes in Computer Science English 2010 How do global warming and agriculture mutually influence each other? It is possible to answer the question by searching knowledge about the relation between global warming and agriculture. As exemplified by this question, strong demands exist for searching relations between objects. However, methods or systems for searching relations are not well studied. In this paper, we propose a relation search system named "Enishi." Enishi supplies a wealth of diverse multimedia information for deep understanding of relations between two objects by complementarily utilizing knowledge from Wikipedia and the Web. Enishi first mines elucidatory objects constituting relations between two objects from Wikipedia. We then propose new approaches for Enishi to search more multimedia information about relations on the Web using elucidatory objects. Finally, we confirm through experiments that our new methods can search useful information from the Web for deep understanding of relations. 0 0
Mining and explaining relationships in Wikipedia Generalized max-flow
Link analysis
Relationship
Data mining
Lecture Notes in Computer Science English 2010 Mining and explaining relationships between objects are challenging tasks in the field of knowledge search. We propose a new approach for the tasks using disjoint paths formed by links in Wikipedia. To realize this approach, we propose a naive and a generalized flow based method, and a technique of avoiding flow confluences for forcing a generalized flow to be as disjoint as possible. We also apply the approach to classification of relationships. Our experiments reveal that the generalized flow based method can mine many disjoint paths important for a relationship, and the classification is effective for explaining relationships. 0 0
Exploiting Wikipedia as external knowledge for document clustering English 2009 In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core words they use are probably synonyms or semantically associated in other forms. The most common way to solve this problem is to enrich document representation with the background knowledge in an ontology. There are two major issues for this approach: (1) the coverage of the ontology is limited, even for WordNet or MeSH, (2) using ontology terms as replacement or additional features may cause information loss, or introduce noise. In this paper, we present a novel text clustering method to address these two issues by enriching document representation with Wikipedia concept and category information. We develop two approaches, exact match and relatedness-match, to map text documents to Wikipedia concepts, and further to Wikipedia categories. Then the text documents are clustered based on a similarity metric which combines document content information, concept information as well as category information. The experimental results using the proposed clustering framework on three datasets (20-newsgroup, TDT2, and LA Times) show that clustering performance improves significantly by enriching document representation with Wikipedia concepts and categories. 0 0
Google challenge: Incremental-learning for web video categorization on robust semantic feature space N-ISVM
Web video categorization
Wikipedia propagation
MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums English 2009 With the advent of video sharing websites, the amount of videos on the internet grows rapidly. Web video categorization is an efficient methodology to organize the huge amount of data. In this paper, we propose an effective web video categorization algorithm for the large scale dataset. It includes two factors: 1) For the great diversity of web videos, we develop an effective semantic feature space called Concept Collection for Web Video Categorization (CCWV-CD) to represent web videos, which consists of concepts with small semantic gap and high distinguishing ability. Meanwhile, the online Wikipedia API is employed to diffuse the concept correlations in this space. 2) We propose an incremental support vector machine with fixed number of support vectors (n-ISVM) to fit the large scale incremental learning problem in web video categorization. Extensive experiments conducted on a dataset of the 80024 most representative videos on YouTube demonstrate that the semantic space with Wikipedia propagation is more representative for web videos, and that n-ISVM outperforms other algorithms in efficiency when performing incremental learning. 0 0
Large scale incremental web video categorization Concept collection
Incremental learning
Large scale
N-ISVM
Similarity measurement
Web video categorization
1st International Workshop on Web-Scale Multimedia Corpus, WSMC'09, Co-located with the 2009 ACM International Conference on Multimedia, MM'09 English 2009 With the advent of video sharing websites, the amount of videos on the internet grows rapidly. Web video categorization is an efficient methodology for organizing the huge amount of videos. In this paper we investigate the characteristics of web videos, and make two contributions for the large scale incremental web video categorization. First, we develop an effective semantic feature space Concept Collection for Web Video with Categorization Distinguishability (CCWV-CD), which consists of concepts with small semantic gap, and the concept correlations are diffused by a novel Wikipedia Propagation (WP) method. Second, we propose an incremental support vector machine with fixed number of support vectors (n-ISVM) for large scale incremental learning. To evaluate the performance of CCWV-CD, WP and n-ISVM, we conduct extensive experiments on the dataset of 80,021 most representative videos on a video sharing website. The experimental results show that CCWV-CD and WP are more representative for web videos, and the n-ISVM algorithm greatly improves the efficiency in the situation of incremental learning. Copyright 2009 ACM. 0 0
Metadata and multilinguality in video classification Speech recognition
SVM
Video classification
Lecture Notes in Computer Science English 2009 The VideoCLEF 2008 Vid2RSS task involves the assignment of thematic category labels to dual language (Dutch/English) television episode videos. The University of Amsterdam chose to focus on exploiting archival metadata and speech transcripts generated by both Dutch and English speech recognizers. A Support Vector Machine (SVM) classifier was trained on training data collected from Wikipedia. The results provide evidence that combining archival metadata with speech transcripts can improve classification performance, but that adding speech transcripts in an additional language does not yield performance gains. 0 0
The update version development of "Wiki Message Linking" system integrated ajax with mvc model Ajax
JSP
MVC model
Web application
Wiki message linking
IFCSTA 2009 Proceedings - 2009 International Forum on Computer Science-Technology and Applications English 2009 This paper describes the use of Ajax technology to improve the functions of the 'Wiki Message Linking' system, which was developed in 2005 as a study tool for the IT group at California State University, San Bernardino. The system extends the wiki paradigm with characteristics that allow users to freely create and edit their content and link it into the evolving body of content, quickly and easily, using any regular web browser. We adopted the MVC pattern with Ajax technology in this new version. Instead of loading a webpage, the browser loads an Ajax engine. The Ajax engine allows the user's interaction with the application to happen asynchronously, so the user is never waiting around for the server to do something. 0 0
A web 2.0 based computer knowledge learning platform Blogs
Learning platform
RSS
Web 2.0
Wiki
Proceedings - International Conference on Computer Science and Software Engineering, CSSE 2008 English 2008 Traditional web-based online learning systems usually focus on the delivery of knowledge and lack ways for students to get involved. Introduction to Computer Basics (ICB) is one of the first professional courses for freshmen majoring in computer science or information technology. To make the ICB learning platform more helpful, a Web 2.0 based computer knowledge learning platform is presented, which shifts the focus from course content to student participation. Web 2.0 elements including personal and group spaces, a wiki cyclopedia, interest mining and personalized recommendation, and RSS resource subscription are integrated. The platform has already been put into use and has satisfied both teachers and students. 0 0