From WikiPapers

Twitter is included as a keyword or extra keyword in 0 datasets, 1 tool and 25 publications.


There are no datasets for this keyword.


Tool Operating System(s) Language(s) Programming language(s) License Description Image
Wikitweets Wikitweets is a visualization of how Wikipedia is cited on Twitter.


Title Author(s) Published in Language Date Abstract R C
A novel system for the semi automatic annotation of event images McParlane P.J.
Jose J.M.
SIGIR 2014 - Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval English 2014 With the rise in popularity of smart phones, taking and sharing photographs has never been more openly accessible. Further, photo sharing websites, such as Flickr, have made the distribution of photographs easy, resulting in an increase of visual content uploaded online. Due to the laborious nature of annotating images, however, a large percentage of these images are unannotated, making their organisation and retrieval difficult. Therefore, there has been a recent research focus on the automatic and semi-automatic process of annotating these images. Despite the progress made in this field, however, annotating images automatically based on their visual appearance often results in unsatisfactory suggestions, and as a result these models have not been adopted in photo sharing websites. Many methods have therefore looked to exploit new sources of evidence for annotation purposes, such as image context. In this demonstration, we instead explore the scenario of annotating images taken at large-scale events, where evidence can be extracted from a wealth of online textual resources. Specifically, we present a novel tag recommendation system for images taken at a popular music festival which allows the user to select relevant tags from related Tweets and Wikipedia content, thus reducing the workload involved in the annotation process. Copyright 2014 ACM. 0 0
Analysing the duration of trending topics in twitter using wikipedia Thanh Tran
Georgescu M.
Zhu X.
Kanhabua N.
WebSci 2014 - Proceedings of the 2014 ACM Web Science Conference English 2014 The analysis of trending topics in Twitter is a goldmine for a variety of studies and applications. However, the contents of topics vary greatly from daily routines to major public events, enduring from a few hours to weeks or months. It is thus helpful to distinguish trending topics related to real-world events from those originating within virtual communities. In this paper, we analyse trending topics in Twitter using Wikipedia as reference for studying the provenance of trending topics. We show that among different factors, the duration of a trending topic characterizes exogenous Twitter trending topics better than endogenous ones. 0 0
Comparing the pulses of categorical hot events in Twitter and Weibo Shuai X.
Xiaojiang Liu
Xia T.
Wu Y.
Guo C.
HT 2014 - Proceedings of the 25th ACM Conference on Hypertext and Social Media English 2014 The fragility and interconnectivity of the planet argue compellingly for a greater understanding of how different communities make sense of their world. One such critical demand is comparing the Chinese with the rest of the world (e.g., Americans), where communities' ideological and cultural backgrounds can be significantly different. While traditional studies aim to learn the similarities and differences between these communities via high-cost user studies, in this paper we propose a much more efficient method to compare different communities by utilizing social media. Specifically, Weibo and Twitter, the two largest microblogging systems, are employed to represent the target communities, i.e. China and the Western world (mainly United States), respectively. Meanwhile, through the analysis of the Wikipedia page-click log, we identify a set of categorical 'hot events' for one month in 2012 and search those hot events in Weibo and Twitter corpora along with timestamps via information retrieval methods. We further quantitatively and qualitatively compare users' responses to those events in Twitter and Weibo in terms of three aspects: popularity, temporal dynamics, and information diffusion. The comparative results show that although the popularity rankings of those events are very similar, the patterns of temporal dynamics and information diffusion can be quite different. 0 0
Exploiting Twitter and Wikipedia for the annotation of event images McParlane P.J.
Jose J.M.
SIGIR 2014 - Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval English 2014 With the rise in popularity of smart phones, there has been a recent increase in the number of images taken at large social (e.g. festivals) and world (e.g. natural disasters) events which are uploaded to image sharing websites such as Flickr. As with all online images, they are often poorly annotated, resulting in a difficult retrieval scenario. To overcome this problem, many photo tag recommendation methods have been introduced; however, these methods all rely on historical Flickr data, which is often problematic for a number of reasons, including the time lag problem (i.e. in our collection, users upload images on average 50 days after taking them, meaning "training data" is often out of date). In this paper, we develop an image annotation model which exploits textual content from related Twitter and Wikipedia data and which aims to overcome the discussed problems. The results of our experiments show and highlight the merits of exploiting social media data for annotating event images, where we are able to achieve recommendation accuracy comparable with a state-of-the-art model. Copyright 2014 ACM. 0 0
Towards twitter user recommendation based on user relations and taxonomical analysis Slabbekoorn K.
Noro T.
Tokuda T.
Frontiers in Artificial Intelligence and Applications English 2014 Twitter is one of the largest social media platforms in the world. Although Twitter can be used as a tool for getting valuable information related to a topic of interest, it is a hard task for us to find users to follow for this purpose. In this paper, we present a method for Twitter user recommendation based on user relations and taxonomical analysis. This method first finds some users to follow related to the topic of interest by giving keywords representing the topic, then picks up users who continuously provide related tweets from the user list. In the first phase we rank users based on user relations obtained from tweet behaviour of each user such as retweet and mention (reply), and we create topic taxonomies of each user from tweets posted during different time periods in the second phase. Experimental results show that our method is very effective in recommending users who post tweets related to the topic of interest all the time rather than users who post related tweets just temporarily. 0 0
User interests identification on Twitter using a hierarchical knowledge base Kapanipathi P.
Jain P.
Venkataramani C.
Sheth A.
Lecture Notes in Computer Science English 2014 Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge-bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge-bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge-bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests. 0 0
Jointly They Edit: Examining the Impact of Community Identification on Political Interaction in Wikipedia Jessica J. Neff
David Laniado
Karolin E. Kappler
Yana Volkovich
Pablo Aragón
Andreas Kaltenbrunner
PLOS ONE English 3 April 2013 Background

In their 2005 study, Adamic and Glance coined the memorable phrase ‘divided they blog’, referring to a trend of cyberbalkanization in the political blogosphere, with liberal and conservative blogs tending to link to other blogs with a similar political slant, and not to one another. As political discussion and activity increasingly moves online, the power of framing political discourses is shifting from mass media to social media.

Methodology/Principal Findings

Continued examination of political interactions online is critical, and we extend this line of research by examining the activities of political users within the Wikipedia community. First, we examined how users in Wikipedia choose to display their political affiliation. Next, we analyzed the patterns of cross-party interaction and community participation among those users proclaiming a political affiliation. In contrast to previous analyses of other social media, we did not find strong trends indicating a preference to interact with members of the same political party within the Wikipedia community.


Our results indicate that users who proclaim their political affiliation within the community tend to proclaim their identity as a ‘Wikipedian’ even more loudly. It seems that the shared identity of ‘being Wikipedian’ may be strong enough to triumph over other potentially divisive facets of personal identity, such as political affiliation.
0 0
A framework for detecting public health trends with Twitter Parker J.
Wei Y.
Yates A.
Frieder O.
Goharian N.
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013 English 2013 Traditional public health surveillance requires regular clinical reports and considerable effort by health professionals to analyze data. Therefore, a low cost alternative is of great practical use. As a platform used by over 500 million users worldwide to publish their ideas about many topics, including health conditions, Twitter provides researchers the freshest source of public health conditions on a global scale. We propose a framework for tracking public health condition trends via Twitter. The basic idea is to use frequent term sets from highly purified health-related tweets as queries into a Wikipedia article index - treating the retrieval of medically-related articles as an indicator of a health-related condition. By observing fluctuations in frequent term sets and in turn medically-related articles over a series of time slices of tweets, we detect shifts in public health conditions and concerns over time. Compared to existing approaches, our framework provides a general a priori identification of emerging public health conditions rather than a specific illness (e.g., influenza) as is commonly done. Copyright 2013 ACM. 0 0
A generic open world named entity disambiguation approach for tweets Habib M.B.
Van Keulen M.
IC3K 2013; KDIR 2013 - 5th International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 - 5th International Conference on Knowledge Management and Information Sharing, Proc. English 2013 Social media is a rich source of information. To make use of this information, it is sometimes required to extract and disambiguate named entities. In this paper, we focus on named entity disambiguation (NED) in Twitter messages. NED in tweets is challenging in two ways. First, the limited length of a tweet makes it hard to have enough context, while many disambiguation techniques depend on it. Second, many named entities in tweets do not exist in a knowledge base (KB). We share ideas from information retrieval (IR) and NED to propose solutions for both challenges. For the first problem we make use of the gregarious nature of tweets to get enough context needed for disambiguation. For the second problem we look for an alternative home page if no Wikipedia page represents the entity. Given a mention, we obtain a list of Wikipedia candidates from YAGO KB in addition to top ranked pages from Google search engine. We use Support Vector Machine (SVM) to rank the candidate pages to find the best representative entities. Experiments conducted on two data sets show better disambiguation results compared with the baselines and a competitor. 0 0
Analysis and forecasting of trending topics in online media streams Althoff T.
Borth D.
Hees J.
Andreas Dengel
MM 2013 - Proceedings of the 2013 ACM Multimedia Conference English 2013 Among the vast information available on the web, social media streams capture what people currently pay attention to and how they feel about certain topics. Awareness of such trending topics plays a crucial role in multimedia systems such as trend aware recommendation and automatic vocabulary selection for video concept detection systems. Correctly utilizing trending topics requires a better understanding of their various characteristics in different social media streams. To this end, we present the first comprehensive study across three major online and social media streams, Twitter, Google, and Wikipedia, covering thousands of trending topics during an observation period of an entire year. Our results indicate that depending on one's requirements one does not necessarily have to turn to Twitter for information about current events and that some media streams strongly emphasize content of specific categories. As our second key contribution, we further present a novel approach for the challenging task of forecasting the life cycle of trending topics in the very moment they emerge. Our fully automated approach is based on a nearest neighbor forecasting technique exploiting our assumption that semantically similar topics exhibit similar behavior. We demonstrate on a large-scale dataset of Wikipedia page view statistics that forecasts by the proposed approach are about 9-48k views closer to the actual viewing statistics compared to baseline methods and achieve a mean average percentage error of 45-19% for time periods of up to 14 days. 0 0
Boot-strapping language identifiers for short colloquial postings Goldszmidt M.
Najork M.
Paparizos S.
Lecture Notes in Computer Science English 2013 There is tremendous interest in mining the abundant user generated content on the web. Many analysis techniques are language dependent and rely on accurate language identification as a building block. Even though there is already research on language identification, it has focused on very 'clean' editorially managed corpora, on a limited number of languages, and on relatively large-sized documents. These are not the characteristics of the content to be found in, say, Twitter or Facebook postings, which are short and riddled with vernacular. In this paper, we propose an automated, unsupervised, scalable solution based on publicly available data. To this end we thoroughly evaluate the use of Wikipedia to build language identifiers for a large number of languages (52) and a large corpus and conduct a large scale study of the best-known algorithms for automated language identification, quantifying how accuracy varies in correlation to document size, language (model) profile size and number of languages tested. Then, we show the value in using Wikipedia to train a language identifier directly applicable to Twitter. Finally, we augment the language models and customize them to Twitter by combining our Wikipedia models with location information from tweets. This method provides a massive amount of automatically labeled data that acts as a bootstrapping mechanism, which we empirically show boosts the accuracy of the models. With this work we provide a guide and a publicly available tool [1] to the mining community for language identification on web and social data. 0 0
Interest classification of twitter users using wikipedia Lim K.H.
Anwitaman Datta
Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013 English 2013 We present a framework for (automatically) classifying the relative interests of Twitter users using information from Wikipedia. Our proposed framework first uses Wikipedia to automatically classify a user's celebrity followings into various interest categories, followed by determining the relative interests of the user with a weighting compared to his/her other interests. Our preliminary evaluation on Twitter shows that this framework is able to correctly classify users' interests and that these users frequently converse about topics that reflect both their (detected) interest and a related real-life event. Categories and Subject Descriptors: J.4 [Computer Applications]: Social and behavioral sciences General Terms: Theory. Copyright 2010 ACM. 0 0
Twitter anticipates bursts of requests for wikipedia articles Tolomei G.
Orlando S.
Ceccarelli D.
Lucchese C.
International Conference on Information and Knowledge Management, Proceedings English 2013 Most of the tweets that users exchange on Twitter make implicit mentions of named-entities, which in turn can be mapped to corresponding Wikipedia articles using proper Entity Linking (EL) techniques. Some of those become trending entities on Twitter due to a long-lasting or a sudden effect on the volume of tweets where they are mentioned. We argue that the set of trending entities discovered from Twitter may help predict the volume of requests for related Wikipedia articles. To validate this claim, we apply an EL technique to extract trending entities from a large dataset of public tweets. Then, we analyze the time series derived from the hourly trending score (i.e., an index of popularity) of each entity as measured by Twitter and Wikipedia, respectively. Our results reveal that Twitter actually leads Wikipedia by one or more hours. Copyright 2013 ACM. 0 0
Use of Web 2.0 technologies in K-12 and higher education: The search for evidence-based practice Hew K.F.
Cheung W.S.
Educational Research Review English 2013 Evidence-based practice in education entails making pedagogical decisions that are informed by relevant empirical research evidence. The main purpose of this paper is to discuss evidence-based pedagogical approaches related to the use of Web 2.0 technologies in both K-12 and higher education settings. The use of such evidence-based practice would be useful to educators interested in fostering student learning through Web 2.0 tools. A comprehensive literature search across the Academic Search Premier, Education Research Complete, ERIC, and PsycINFO databases was conducted. Empirical studies were included for review if they specifically examined the impact of Web 2.0 technologies on student learning. Articles that merely described anecdotal studies such as student perception or feeling toward learning using Web 2.0, or studies that relied on student self-report data such as student questionnaire survey and interview were excluded. Overall, the results of our review suggested that actual evidence regarding the impact of Web 2.0 technologies on student learning is as yet fairly weak. Nevertheless, the use of Web 2.0 technologies appears to have a general positive impact on student learning. None of the studies reported a detrimental or inferior effect on learning. The positive effects are not necessarily attributed to the technologies per se but to how the technologies are used, and how one conceptualizes learning. It may be tentatively concluded that a dialogic, constructionist, or co-constructive pedagogy supported by activities such as Socratic questioning, peer review and self-reflection appeared to increase student achievement in blog-, wiki-, and 3-D immersive virtual world environments, while a transmissive pedagogy supported by review activities appeared to enhance student learning using podcast. 0 0
Analysis and enhancement of wikification for microblogs with context expansion Cassidy T.
Ji H.
Lev Ratinov
Zubiaga A.
Houkuan Huang
24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers English 2012 Disambiguation to Wikipedia (D2W) is the task of linking mentions of concepts in text to their corresponding Wikipedia entries. Most previous work has focused on linking terms in formal texts (e.g. newswire) to Wikipedia. Linking terms in short informal texts (e.g. tweets) is difficult for systems and humans alike as they lack a rich disambiguation context. We first evaluate an existing Twitter dataset as well as the D2W task in general. We then test the effects of two tweet context expansion methods, based on tweet authorship and topic-based clustering, on a state-of-the-art D2W system and evaluate the results. 0 0
Bieber no more: First Story Detection using Twitter and Wikipedia Miles Osborne
Saša Petrović
Richard McCreadie
Craig Macdonald
Iadh Ounis
English 2012 Twitter is a well known source of information regarding breaking news stories. This aspect of Twitter makes it ideal for identifying events as they happen. However, a key problem with Twitter-driven event detection approaches is that they produce many spurious events, i.e., events that are wrongly detected or simply are of no interest to anyone. In this paper, we examine whether Wikipedia (when viewed as a stream of page views) can be used to improve the quality of discovered events in Twitter. Our results suggest that Wikipedia is a powerful filtering mechanism, allowing for easy blocking of large numbers of spurious events. Our results also indicate that events within Wikipedia tend to lag behind Twitter. 0 0
CrowdTiles: Presenting crowd-based information for event-driven information needs Whiting S.
Zhou K.
Jose J.
Alonso O.
Leelanupab T.
ACM International Conference Proceeding Series English 2012 Time plays a central role in many web search information needs relating to recent events. For recency queries where fresh information is most desirable, there is likely to be a great deal of highly-relevant information created very recently by crowds of people across the world, particularly on platforms such as Wikipedia and Twitter. With so many users, mainstream events are often very quickly reflected in these sources. The English Wikipedia encyclopedia consists of a vast collection of user-edited articles covering a range of topics. During events, users collaboratively create and edit existing articles in near real-time. Simultaneously, users on Twitter disseminate and discuss event details, with a small number of users becoming influential for the topic. In this demo, we propose a novel approach to presenting a summary of new information and users related to recent or ongoing events associated with the user's search topic, therefore aiding most recent information discovery. We outline methods to detect search topics which are driven by events, identify and extract changing Wikipedia article passages and find influential Twitter users. Using these, we provide a system which displays familiar tiles in search results to present recent changes in the event-related Wikipedia articles, as well as Twitter users who have tweeted recent relevant information about the event topics. 0 0
Impact of platform design on cross-language information exchange Hale S. Conference on Human Factors in Computing Systems - Proceedings English 2012 This paper describes two case studies examining the impact of platform design on cross-language communications. The sharing of off-site hyperlinks between language editions of Wikipedia and between users on Twitter with different languages in their user descriptions are analyzed and compared in the context of the 2011 Tohoku earthquake and tsunami in Japan. The paper finds that a greater number of links are shared across languages on Twitter, while a higher percentage of links are shared between Wikipedia articles. The higher percentage of links being shared on Wikipedia is attributed to the persistence of links and the ability for users to link articles on the same topic together across languages. 0 0
Keyterm extraction from microblogs' messages using Wikipedia-based keyphraseness measure Korshunov A. 2012 6th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications, SETIT 2012 English 2012 The paper describes a method for keyterm extraction from microblog messages. The described approach utilizes the information obtained by the analysis of the structure and content of Wikipedia. The algorithm is based on computation of a 'keyphraseness' measure for each term, i.e. an estimation of the probability that it can be selected as a key in the text. The experimental study of the proposed technique demonstrated satisfactory results which significantly outpace analogues. As a demonstration of a possible application of the algorithm, a prototype of a context-sensitive advertising system has been implemented. This system is able to obtain the descriptions of the goods relevant to the found keyterms from the Amazon online store. Several suggestions are also made on how to utilize the information obtained by the analysis of Twitter messages in different auxiliary services. 0 0
Twevent: Segment-based event detection from tweets Chenliang Li
Aixin Sun
Anwitaman Datta
ACM International Conference Proceeding Series English 2012 Event detection from tweets is an important task to understand the current events/topics attracting a large number of common users. However, the unique characteristics of tweets (e.g. short and noisy content, diverse and fast changing topics, and large data volume) make event detection a challenging task. Most existing techniques proposed for well written documents (e.g. news articles) cannot be directly adopted. In this paper, we propose a segment-based event detection system for tweets, called Twevent. Twevent first detects bursty tweet segments as event segments and then clusters the event segments into events considering both their frequency distribution and content similarity. More specifically, each tweet is split into non-overlapping segments (i.e. phrases that possibly refer to named entities or semantically meaningful information units). The bursty segments are identified within a fixed time window based on their frequency patterns, and each bursty segment is described by the set of tweets containing the segment published within that time window. The similarity between a pair of bursty segments is computed using their associated tweets. After clustering bursty segments into candidate events, Wikipedia is exploited to identify the realistic events and to derive the most newsworthy segments to describe the identified events. We evaluate Twevent and compare it with the state-of-the-art method using 4.3 million tweets published by Singapore-based users in June 2010. In our experiments, Twevent outperforms the state-of-the-art method by a large margin in terms of both precision and recall. More importantly, the events detected by Twevent can be easily interpreted with little background knowledge because of the newsworthy segments. We also show that Twevent is efficient and scalable, leading to a desirable solution for event detection from tweets. 0 0
TwiNER: Named entity recognition in targeted twitter stream Chenliang Li
Weng J.
He Q.
Yao Y.
Anwitaman Datta
Aixin Sun
Lee B.-S.
SIGIR'12 - Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval English 2012 Many private and/or public organizations have been reported to create and monitor targeted Twitter streams to collect and understand users' opinions about the organizations. A targeted Twitter stream is usually constructed by filtering tweets with user-defined selection criteria, e.g. tweets published by users from a selected region, or tweets that match one or more predefined keywords. The targeted Twitter stream is then monitored to collect and understand users' opinions about the organizations. There is an emerging need for early crisis detection and response with such a targeted stream. Such applications require a good named entity recognition (NER) system for Twitter, which is able to automatically discover emerging named entities that are potentially linked to the crisis. In this paper, we present a novel 2-step unsupervised NER system for targeted Twitter streams, called TwiNER. In the first step, it leverages the global context obtained from Wikipedia and the Web N-Gram corpus to partition tweets into valid segments (phrases) using a dynamic programming algorithm. Each such tweet segment is a candidate named entity. It is observed that the named entities in the targeted stream usually exhibit a gregarious property, due to the way the targeted stream is constructed. In the second step, TwiNER constructs a random walk model to exploit the gregarious property in the local context derived from the Twitter stream. The highly-ranked segments have a higher chance of being true named entities. We evaluated TwiNER on two sets of real-life tweets simulating two targeted streams. Evaluated using labeled ground truth, TwiNER achieves performance comparable to conventional approaches in both streams. Various settings of TwiNER have also been examined to verify our global context + local context combo idea. 0 0
Integrating Twitter into Wiki to support informal awareness Xuan Zhao
Wenpeng Xiao
Changyan Chi
Min Yang
Computer-Supported Cooperative Work English 2011 0 0
Integrating Twitter into wiki to support informal awareness Xuan Zhao
Wenpeng Xiao
Changyan Chi
Min Yang
English 2011 In the current study, we explored Twitter as a useful and practical extension to a wiki-based collaborative work space. A two-week experiment and a survey study shed some light on the potential benefits of integrating Twitter, or other existing social networking tools with a formal collaborative work space in encouraging meta-data level communication and promoting informal awareness. Copyright 2011 ACM. 0 0
The power of truth Stolley R.B. Publishing Research Quarterly English 2010 Democracy thrives on journalism, and journalism thrives on truth. The subject is Truth - what it is, how we find it and why it's important. In finding Truth, the internet is a rich and dangerous source. Nothing is better than a well-prepared, face to face interview. But Truth is not free. Journalism must find a way to elicit payment for all forms of disseminated Truth - print and online. One Truth seems clear: books and magazines will never disappear, be displaced or diminished to irrelevance. 0 0
Implications of digital technologies for book publishing Tian X.
Martin Prof. B.
4th International Conference on Cooperation and Promotion of Information Resources in Science and Technology, COINFO 2009 English 2009 This paper is based on an Australian government-funded research project looking at the implications of digitization for the book publishing industry, which was completed in 2008. Although Australian-based, the project and subsequent research have wider implications for application elsewhere. The paper initially provides a snapshot of Australian book publishing in a global context, and then summarizes our findings on the current and potential future impact of digital technologies. The original research employed an interpretive research paradigm, using a mixed methodology design, including an online survey of book publishers and the conduct of 14 case studies. Since completion of the project, the pace of technology-related change in book publishing has been addressed in follow-up research, based on a global context. In reporting this new research, the paper discusses in detail the diverse range of technologies, their limitations and the risks and opportunities they offer to the book publishing industry. This includes insights into the business as well as technical issues confronting the industry. 0 0