This is a list of publications in Chinese. Currently, there are 21 publications in this language.
Some links about wiki research in Chinese:
- Growth of Academic Interest in Wikipedia From Major Chinese-speaking regions: Has it peaked? (Archived at WebCitation)
- List wiki - PhD and Master’s theses that study Wiki or other Wiki-related topics (Archived at WebCitation)
- List zh_UGE - PhD and Master’s theses that study Chinese Wikipedia or other UGE (Archived at WebCitation)
- List UGE - PhD and Master’s theses that study Wikipedia or other UGE (Archived at WebCitation)
|Title||Author(s)||Keyword(s)||Published in||Date||Abstract||R||C|
|Entity ranking based on Wikipedia for related entity finding||Jinghua Zhang
Related entity finding
|Jisuanji Yanjiu yu Fazhan/Computer Research and Development||2014||Entity ranking is a very important step for related entity finding (REF). Although researchers have done much work on "entity ranking based on Wikipedia for REF", some issues remain: the semi-automatic acquisition of the target type, the coarse-grained target type, the binary judgment of entity-type relevancy, and ignoring the effects of stop words in the calculation of entity-relation relevancy. This paper designs a framework which ranks entities through the calculation of a triple combination (entity relevancy, entity-type relevancy and entity-relation relevancy) and acquires the best combination method by comparing experimental results. A novel approach is proposed to calculate the entity-type relevancy. It can automatically acquire the fine-grained target type and the discriminative rules of its hyponym Wikipedia categories through inductive learning, and calculate entity-type relevancy by counting the number of categories which meet the discriminative rules. Also, this paper proposes a "cut stop words to rebuild relation" approach to calculate the entity-relation relevancy between candidate entity and source entity. Experimental results demonstrate that the proposed approaches can effectively improve the entity-ranking results and reduce the computation time.||0||0|
|Semi-automatic construction of plane geometry ontology based-on WordNet and Wikipedia||Fu H.-G.
|Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China||2014||Ontology, as a member of the Semantic Web's hierarchical structure, is located in the central position. Regarding the current research situation of ontology construction, the manual construction is difficult to ensure its efficiency and scalability; and the automatic construction is hard to guarantee its interoperability. This paper presents a semi-automatic domain ontology construction method based on WordNet and Wikipedia. First, we construct the top-level ontology and then reuse WordNet structure to expand the terminology and terminology-level at the depth of the ontology. Furthermore, we expand the relationship and supplement the terminology at the width of the ontology by referring to page information of Wikipedia. Finally, this method of ontology construction is applied in elementary geometry domain. The experiments show that this method can greatly improve the efficiency of ontology construction and ensure the quality of the ontology to some degree.||0||0|
|Topic modeling approach to named entity linking||Huai B.-X.
|Named entity linking
Probabilistic topic models
|Ruan Jian Xue Bao/Journal of Software||2014||Named entity linking (NEL) is an advanced technology which links a given named entity to an unambiguous entity in the knowledge base, and thus plays an important role in a wide range of Internet services, such as online recommender systems and Web search engines. However, with the explosive growth of online information and applications, traditional NEL solutions face increasing challenges in linking accuracy due to the large number of online entities. Moreover, entities are usually associated with different semantic topics (e.g., the entity "Apple" could be either a fruit or a brand), whereas the latent topic distributions of words and entities in the same documents should be similar. To address this issue, this paper proposes a novel topic modeling approach to named entity linking. Different from existing works, the new approach provides a comprehensive framework for NEL and can uncover the semantic relationship between documents and named entities. Specifically, it first builds a knowledge base of unambiguous entities with the help of Wikipedia. Then, it proposes a novel bipartite topic model to capture the latent topic distribution between entities and documents. Therefore, given a new named entity, the new approach can link it to the unambiguous entity in the knowledge base by calculating their semantic similarity with respect to latent topics. Finally, the paper conducts extensive experiments on a real-world data set to evaluate the approach. Experimental results clearly show that the proposed approach outperforms other state-of-the-art baselines by a significant margin.||0||0|
|A collaboration effectiveness and Easiness Evaluation Method for RE-specific wikis based on Cognition-Behavior Consistency Decision Triangle||Peng R.
|Cognition-behavior consistency decision triangle
Collaboration effectiveness and easiness evaluation
|Jisuanji Xuebao/Chinese Journal of Computers||2013||Wiki technology, represented by Wikipedia, has attracted serious attention due to its capability to support collaborative online content creation in a flexible and simple manner. Under the guidance of Wiki technology, developing specific wiki-based requirements management tools, namely RE-specific wikis, by extending various open source wikis to support distributed requirements engineering activities has become a hot research topic. Many RE-specific wikis, such as RE-Wiki, SOP-Wiki and WikiWinWin, have been developed. But how to evaluate their collaboration effectiveness and easiness still needs further study. Based on the Cognition-Behavior Consistency Decision Triangle (CBCDT), a Collaboration Effectiveness and Easiness Evaluation Method (CE3M) for evaluating RE-specific wikis is proposed. For a given RE-specific wiki, it evaluates the consistencies among three aspects: the expectations of its designers, the cognitions of its users and the behavior significations of its users. Specifically, the expectations of its designers and the cognitions of its users are obtained from surveys; the behavior significations are obtained from experts' opinions on the statistical data of the users' collaboration behaviors. Then, consistency evaluations based on statistical hypothesis testing are performed. The case study shows that CE3M is appropriate for discovering the similarities and differences among the expectations, cognitions and behaviors. The insights gained can be used as objective evidence for decisions on the evolution of an RE-specific wiki.||0||0|
|Evolution of peer production system based on limited matching and preferential selection||Li X.
Peer production system
|Shanghai Ligong Daxue Xuebao/Journal of University of Shanghai for Science and Technology||2013||Taking Wikipedia as a classic peer production system in which many users take part in editing, two characteristics of the editing process, preferential selection and limited matching, were considered. Two rules for "preferential selection" and "limited matching" and an evolving model of the peer production system were presented. The analysis was based on computational experiments on the number of page edits, the status variation of pages and users, the effect of matching degree on the number of page edits, etc. The computational experiments show that the Wikipedia system evolves to a stable state under the action of the two rules. In the stable state, the number of page edits follows a power-law distribution; the difference between a user's status and a page's status (i.e. the matching degree) tends to zero; the larger the matching degree of user and page, the smaller the exponent of the power-law distribution, and so the longer its tail.||0||0|
|Learning to rank concept annotation for text||Tu X.
Explicit semantic analysis
Learning to ranking
|Beijing Daxue Xuebao (Ziran Kexue Ban)/Acta Scientiarum Naturalium Universitatis Pekinensis||2013||This paper proposes an automatic text annotation method (CRM, concept ranking model) based on a learning-to-rank model. First, the authors manually built a training set of concept annotations; then they used the Ranking SVM algorithm to generate a concept ranking model; finally, the concept ranking model was used to generate concept annotations for arbitrary texts. Experiments show that the proposed method achieves a significant improvement on various indicators compared to traditional annotation methods, and that its concept annotation results are closer to human annotation.||0||0|
|Model and simulation of collective collaboration article edit in wikipedia based on CAS theory||Zhao D.-J.
|Collective collaboration article edit
Complex adaptive system (CAS)
Model and simulation
Web collective intelligence
|Shanghai Ligong Daxue Xuebao/Journal of University of Shanghai for Science and Technology||2012||Wikipedia users were classified into five agents, including content creator, content modifier, content cleaner, diverse editor and content visitor, and a collective collaboration article edit model was established based on the complex adaptive system theory. The multi-agent simulation of collective collaboration article edit was achieved by using Netlogo software based on the configuration of agent appearance probabilities of different quality articles. Simulation results show that the diverse editor is an important driving force for the improvement of article quality; the bigger the appearance probability of diverse editors is, the higher the article quality is. The self-modifying behavior of editors plays an important role in promoting article quality. When the configuration of agent appearance probabilities follows the golden section law, the collective performance tends toward its maximum. There exists a process from seesaw-like complementarity to dynamic balance between word quantity and word meaning; the critical point of balance is close to the golden section point, and the article edit evolution follows the golden section law. The research deepens the knowledge of article edit evolution, web collective intelligence and social computing.||0||0|
|Survey on statics of Wikipedia||Deyi Li
|Wuhan Daxue Xuebao (Xinxi Kexue Ban)/Geomatics and Information Science of Wuhan University||2012||This paper mainly focuses on Wikipedia, a collaborative editing pattern in Web 2.0. The articles, the editors and the editing relationships between the two are three important components in Wikipedia statistical analysis. We collected different kinds of statistical tools, methods and results, analyzed the problems in current statistical research, and discussed possible solutions.||0||0|
|WSR: A semantic relatedness measure based on Wikipedia structure||Sun C.-C.
|Article referenced network
|Jisuanji Xuebao/Chinese Journal of Computers||2012||This paper proposes a semantic relatedness measure based on Wikipedia structure: WikiStruRel (WSR). Nowadays, Wikipedia is the largest and the fastest-growing online encyclopedia, consisting of two net-like structures: an article referenced network and a category tree (actually a tree-like graph), which include lots of explicitly defined semantic information. WSR explicitly analyzes the article referenced network and the category tree from Wikipedia and computes semantic relatedness between words. While WSR achieves effective accuracy and large coverage by testing on three common datasets, the measure doesn't have to deal with text, resulting in low cost.||0||0|
|A named entity mining method based on transfer learning||Zhai H.-J.
|Named entity mining
One class learning
|Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University||2011||This paper addresses the problem of mining named entities from query logs. A novel scheme based on transfer learning was introduced, which trains a classifier for the target category by leveraging Wikipedia as a data source. In this way it can make full use of supervised learning and also deal with the large-scale labeling problem. The experimental results show the effectiveness of the novel scheme based on transfer learning.||0||0|
|Dissemination and control model of internet public opinion in the ubiquitous media environments||Chen B.
Internet public opinion
Ubiquitous media environments
|Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice||2011||The gradual formation of the ubiquitous media environment has a profound effect on the dissemination and control of internet public opinion. This paper presents a novel propagation model with direct immunity, named SEIR, and generalizes the traditional epidemic model to the ubiquitous media environment. Existing models handle the netizens' states and the propagation media of public opinion in rather simple ways. The proposed model overcomes these defects. The equilibrium point and stability of the model are proved, and the evolution rules are analyzed. The proposed control methods start from the internet public opinion environment and intervene early in the formation of public opinion. This paper then constructs an information propagation platform with self-purification capacity using Wiki technology. Simulations of the platform's opinion control efficacy are carried out, verifying the effectiveness of the control methods for internet public opinion.||0||0|
|Group interests and their correlations mining based on Wikipedia||Zhang H.-S.
General tree of interests
Social network mining
|Jisuanji Xuebao/Chinese Journal of Computers||2011||Personalized recommendation technologies, such as collaborative filtering and content-based filtering, face some problems. The obvious ones are the collection of private history data and cold start. In this paper, we suggest a method for mining group interests from Wikipedia. We also apply the group interests to the recommendation system, which avoids the cold start and does not need any private data. Here, the group interest replaces the personalized interest used in traditional personalized recommendation technologies. In detail, we first suggest a general tree structure and a growing strategy to denote the interests of a user group, which includes the semantic relationship of each interest. Then we define the group interest based on the structure of user groups. Finally, we measure the correlations of interests according to the general tree structure of interests. We further design three types of experiment to evaluate the reasonability of group interests: manual evaluation, test set evaluation and a news recommendation experiment in a video service. The results show that the accuracy of correlation between group interests can be more than 50%, and the news hit rate of recommendation from group interests is 2 times larger than that of recommendation from news popularity.||0||0|
|Human dynamics analysis in online collaborative writing||Fei Zhao
Online collaborative writing
|Wuli Xuebao/Acta Physica Sinica||2011||Investigating human online behavior has become a central issue for understanding human dynamics in recent years. In this paper we analyze the temporal and content-updating statistical properties of online collaborative writing based on Wikipedia data. Online collaborative writing is one of the important and widespread human online behaviors, and has wide application. Empirical results show that the distribution of inter-event time in collaborative writing is multi-scale. That is to say, the two time intervals that range from 1 min to 30 min and from 30 min to 24 h both obey power-law distributions, with exponents equal to 1.62 and 1.16 respectively, while intervals larger than 24 h obey a distribution whose cumulative form is F(τ) ∝ τ^(-b - a log(τ)). Further investigation shows that successive updating behavior and mutual updating behavior work together to produce the multi-scale distribution of inter-event time. Successive updating behavior leads to the power-law distribution with exponent 1.62 for intervals within 30 min, while mutual updating behavior leads to the power-law distribution with exponent 1.16 for intervals ranging from 30 min to 24 h. Furthermore, we find that reverse updating occurs frequently in collaborative writing. The proportion of reverse updating is strongly correlated with the updating size, which indicates that the updating size is a main factor in whether the relevant content is preserved. The bigger the updating size, the harder it is for the content to be preserved. More statistical analyses imply that "watching dog" and "edit war" exist in Wikipedia editing. These results help to deepen the understanding of human collective behavior, especially collaborative developing behavior.||0||0|
|Integrated architecture for TRIZ based on knowledge network and its key technologies||Huang S.-Q.
Theory of inventive problem solving(TRIZ)
|Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science)||2011||Aiming at the existing relevant theory and application problems of theory of inventive problem solving (TRIZ), the integrated architecture for TRIZ based on knowledge network was put forward. The conceptual model of knowledge network for TRIZ was presented. On the above basis, the framework and collaborative construction mode of the integrated architecture for TRIZ based on knowledge network were put forward, and the integrated architecture's characteristics and key technologies were analyzed. Especially, the technology of ontology collaborative construction based on meta-ontology and Web2.0 mode was studied, the formal definition of meta-ontology in the meta-ontology layer was presented, and ontology model in ontology layer was established, then the process model of ontology instance collaborative construction in ontology-instance layer was presented. Finally, the application process model of the integrated architecture was presented, a prototype system based on this integrated architecture was developed. And a practical case study was provided for testing this integrated architecture, the results show that this integrated architecture is feasible and effective in innovation knowledge acquisition, TRIZ knowledge network collaborative construction and so on.||0||0|
|Quality of articles in Wikipedia||Deyi Li
Quality of article evaluation
|Wuhan Daxue Xuebao (Xinxi Kexue Ban)/Geomatics and Information Science of Wuhan University||2011||Recent research on Wikipedia is first briefly analyzed, especially the statistics of the quality of articles in Wikipedia. Then the methods for automatically evaluating article quality are discussed. The methods mainly fall into two kinds: correlation-based analysis and cooperation modeling. Furthermore, we present the open problems of automatic quality evaluation and the possible promotions of collective intelligence.||0||0|
|Subject searching based on links structure analysis||Yiqin Yu
|Knowledge discovery in databases(KDD)
|Beijing Gongye Daxue Xuebao/Journal of Beijing University of Technology||2011||Current text search engines always have low search efficiency due to their keyword matching method. Based on the comparison of previous works, a thematic search strategy is proposed. The main idea of this strategy is grounded on the rich information implicated by the link structure of Wikipedia. It defines a measure of distance between words in terms of DBW, underpinned by computational thematic communities model. In this way, the authors can use this algorithm to rank and reorient the keywords to discover the closest keyword clusters and improve the quality of searching result. Introducing users' appraisal mechanism and making comparison with the traditional search engines' outcomes in experiment prove that the strategy expands the thematic coverage and maintains a high users' intent recognition at the same time.||0||0|
|The translation mining of the out of Vocabulary based on Wikipedia||Sun C.
|Cross-Language Information Retrieval
Out of vocabulary (OOV)
|Jisuanji Yanjiu yu Fazhan/Computer Research and Development||2011||Query translation is one of the key factors that affect the performance of cross-language information retrieval (CLIR). In the process of querying, mining translations of out-of-vocabulary (OOV) terms is important for improving CLIR. Out of vocabulary means the words or phrases which cannot be found in the dictionary. In this paper, according to Wikipedia's data structure and language features, we divide the translation environment into the target-existence environment and the target-deficit environment. Given the difficulty of translation mining in the target-deficit environment, we adopt frequency change information and adjacency information to extract candidate units, and compare common unit extraction methods. The results verify that our methods are more effective. We establish a mixed translation mining strategy based on the frequency-distance model, the surface pattern matching model and the summary-score model, add the models one by one, and then verify the influence of each model. The experiments use the OOV mining technique of a search engine as the baseline and evaluate the results with TOP1. The results verify that the mixed translation mining method based on Wikipedia achieves a correct translation rate of 0.6822, an improvement of 6.98% over the baseline.||0||0|
|Chinese characters conversion system based on lookup table and language model||Li M.-H.
|Chinese character conversion
|Proceedings of the 22nd Conference on Computational Linguistics and Speech Processing, ROCLING 2010||2010||The character sets used in China and Taiwan are both Chinese, but they are divided into simplified and traditional Chinese characters. There is a large amount of information exchange between China and Taiwan through books and the Internet. To provide readers with a convenient reading environment, character conversion between simplified and traditional Chinese is necessary. The conversion between simplified and traditional Chinese characters has two problems: one-to-many ambiguity and term usage. Since there are many traditional Chinese characters that have only one corresponding simplified character, when converting simplified Chinese into traditional Chinese, the system faces one-to-many ambiguity. Also, there are many terms that have different usages in the two Chinese societies. This paper focuses on designing an extensible conversion system that can take advantage of community knowledge by accumulating lookup tables through Wikipedia to tackle the term usage problem, and can integrate a language model to resolve the one-to-many ambiguity. The system can reduce the cost of proofreading character conversion for books, e-books, or online publications. The extensible architecture makes it easy to improve the system with new training data.||1||0|
|Requirements semantics-driven aggregated production for on-demand service||Wen B.
|Jisuanji Xuebao/Chinese Journal of Computers||2010||Based on the existing technology for assembling services, related approaches including requirements semantic encapsulation, requirements semantic interoperability extension and requirements semantics-driven customized service manufacture are put forward. Large-scale and complex systems exhibit adaptive features, and the evolutionary emergence of collective behaviors is their fundamental phenomenon. This paper adopts a stakeholder-driven requirements semantics acquisition technique for software services and combines it with semantic wikis to support the evolution and semantic annotation of requirements. The approach not only elicits the conventional documentary requirements through collaboration and an interactive negotiation mechanism, but can also perform intelligent retrieval, requirements consistency checking and reasoning for service requirements entities, depending on requirements semantics and requirements element instantiation conducted by the underlying requirements ontology of services. By choosing connecting ontologies as the semantic carrier for service aggregation, software production will be focused on upper-level semantic description rather than concrete services. Theoretical and empirical study has shown the validity and practicability of the proposed method.||0||0|
|Wikipedia based semantic related Chinese words exploring and relatedness computing||Yanyan Li
|Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications||2009||To find out how to collect semantically related words and calculate semantic relatedness, an experiment was done to download about 50 thousand documents from the web site of Chinese Wikipedia and extract the hyperlinks between lines, which contain semantic information. By mining hyperlinked references in documents, about 400 thousand semantically related word pairs were collected. With further experiments on topic groups of related words, tightly related words were grouped into smaller sets and an average semantic relatedness was calculated. Semantic relatedness is calculated using information on hyperlink positions and frequencies in documents. The reliability of the new measures is analyzed by comparison with the results of classic algorithms.||0||0|
|A Study of Phenomena of Knowledge Sharing in Wikipedia||Chun-yu Huang||National Central University, Taiwan||2006||Wikipedia is an encyclopedia on the Internet. It provides a lot of knowledge for the user. The first Wikipedia appeared in 2001 and was only in English. After six years of development, there are now various versions in more than 250 languages. The contents of Wikipedia are contributed and edited not by authorities, but by users of Wikipedia. Anyone who wants to can contribute to the contents of Wikipedia. Many users spend their time and energy devoting themselves to Wikipedia. Wikipedia gives no monetary reward to its contributors, but more and more users are sharing their knowledge on Wikipedia. Does this reveal a massive pro-social phenomenon? This study thus attempts to look into the factors that affect the knowledge sharing of these individuals. A web-based questionnaire was designed, and known Wikipedia users were invited as informants. 156 valid samples were tallied out of a total of 181 returns. Empirical results reveal that reputation and altruism have positive effects on the attitude toward knowledge sharing, while expected reward has a significant but negative effect on the attitude toward knowledge sharing. External control and community identification have a moderating effect on the relationship between the attitude toward knowledge sharing and the behavior of knowledge sharing. However, we failed to find evidence that supports an effect of the attitude toward knowledge sharing on the behavior of knowledge sharing. This is an issue that calls for more study.||0||0|