Ingmar Weber


Ingmar Weber is an author.

Publications

Only those publications related to wikis are shown here.
Title Keyword(s) Published in Language Date Abstract R C
From Machu-Picchu to "rafting the urubamba river": Anticipating information needs via the entity-query graph Entity extraction
Implicit search
Query suggestions
Serendipity
WSDM 2013 - Proceedings of the 6th ACM International Conference on Web Search and Data Mining English 2013 We study the problem of anticipating user search needs, based on their browsing activity. Given the current web page p that a user is visiting we want to recommend a small and diverse set of search queries that are relevant to the content of p, but also non-obvious and serendipitous. We introduce a novel method that is based on the content of the page visited, rather than on past browsing patterns as in previous literature. Our content-based approach can be used even for previously unseen pages. We represent the topics of a page by the set of Wikipedia entities extracted from it. To obtain useful query suggestions for these entities, we exploit a novel graph model that we call EQGraph (Entity-Query Graph), containing entities, queries, and transitions between entities, between queries, as well as from entities to queries. We perform Personalized PageRank computation on such a graph to expand the set of entities extracted from a page into a richer set of entities, and to associate these entities with relevant query suggestions. We develop an efficient implementation to deal with large graph instances and suggest queries from a large and diverse pool. We perform a user study that shows that our method produces relevant and interesting recommendations, and outperforms an alternative method based on reverse IR. 0 0
Drawing a Data-Driven Portrait of Wikipedia Editors Wikipedia
Editors
Web usage
Expertise
WikiSym English August 2012 While there has been a substantial amount of research into the editorial and organizational processes within Wikipedia, little is known about how Wikipedia editors (Wikipedians) relate to the online world in general. We attempt to shed light on this issue by using aggregated log data from Yahoo!’s browser toolbar in order to analyze Wikipedians’ editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in popular culture. Then we inspect how editors’ general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors are more familiar with their active domains than average users. Finally, we analyze the data from a temporal perspective; e.g., we demonstrate that a user’s interest in the edited topic peaks immediately before the edit. Our results are relevant as they illuminate novel aspects of what has become many Web users’ prevalent source of information. 0 0
A data-driven sketch of Wikipedia editors Editors
Expertise
Web usage
Wikipedia
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion English 2012 Who edits Wikipedia? We attempt to shed light on this question by using aggregated log data from Yahoo!'s browser toolbar in order to analyze Wikipedians' editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in pop culture. Then we inspect how editors' general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors show more expertise in their active domains than average users. Our results are relevant as they illuminate novel aspects of what has become many Web users' prevalent source of information and can help in recruiting new editors. 0 0
Drawing a data-driven portrait of Wikipedia editors Editors
Expertise
Web usage
Wikipedia
WikiSym 2012 English 2012 While there has been a substantial amount of research into the editorial and organizational processes within Wikipedia, little is known about how Wikipedia editors (Wikipedians) relate to the online world in general. We attempt to shed light on this issue by using aggregated log data from Yahoo!'s browser toolbar in order to analyze Wikipedians' editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in popular culture. Then we inspect how editors' general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors are more familiar with their active domains than average users. Finally, we analyze the data from a temporal perspective; e.g., we demonstrate that a user's interest in the edited topic peaks immediately before the edit. Our results are relevant as they illuminate novel aspects of what has become many Web users' prevalent source of information. 0 0
Mining web query logs to analyze political issues Opinion mining and sentiment analysis
Partisanship
Political leaning
Web search logs
Proceedings of the 3rd Annual ACM Web Science Conference, WebSci'12 English 2012 We present a novel approach to using anonymized web search query logs to analyze and visualize political issues. Our starting point is a list of politically annotated blogs (left vs. right). We use this list to assign a numerical political leaning to queries leading to clicks on these blogs. Furthermore, we map queries to Wikipedia articles and to fact-checked statements from politifact.com, as well as applying sentiment analysis to search results. With this rich, multi-faceted data set we obtain novel graphical visualizations of issues and discover connections between the different variables. Our findings include (i) an interest in "the other side" where queries about Democrat politicians have a right leaning and vice versa, (ii) evidence that "lies are catchy" and that queries pertaining to false statements are more likely to attract large volumes, and (iii) the observation that the more right-leaning a query is, the more negative sentiments can be found in its search results. 0 0
Semantic full-text search with ESTER: Scalable, easy, fast Proceedings - IEEE International Conference on Data Mining Workshops, ICDM Workshops 2008 English 2008 We present a demo of ESTER, a search engine that combines the ease of use, speed and scalability of full-text search with the powerful semantic capabilities of ontologies. ESTER supports full-text queries, ontological queries and combinations of these, yet its interface is as easy as can be: A standard search field with semantic information provided interactively as one types. ESTER works by reducing all queries to two basic operations: prefix search and join, which can be implemented very efficiently in terms of both processing time and index space. We demonstrate the capabilities of ESTER on a combination of the English Wikipedia with the Yago ontology, with response times below 100 milliseconds for most queries, and an index size of about 4 GB. The system can be run both stand-alone and as a Web application. 0 0
ESTER: Efficient search on text, entities, and relations Interactive
Ontology
Proactive
Semantic search
Wikipedia
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR'07 English 2007 We present ESTER, a modular and highly efficient system for combined full-text and ontology search. ESTER builds on a query engine that supports two basic operations: prefix search and join. Both of these can be implemented very efficiently with a compact index, yet in combination provide powerful querying capabilities. We show how ESTER can answer basic SPARQL graph-pattern queries on the ontology by reducing them to a small number of these two basic operations. ESTER further supports a natural blend of such semantic queries with ordinary full-text queries. Moreover, the prefix search operation allows for a fully interactive and proactive user interface, which after every keystroke suggests to the user possible semantic interpretations of his or her query, and speculatively executes the most likely of these interpretations. As a proof of concept, we applied ESTER to the English Wikipedia, which contains about 3 million documents, combined with the recent YAGO ontology, which contains about 2.5 million facts. For a variety of complex queries, ESTER achieves worst-case query processing times of a fraction of a second, on a single machine, with an index size of about 4 GB. 0 0
Efficient interactive query expansion with CompleteSearch Index building
Interactive
Query expansion
Synsets
Wikipedia
Wordnet
International Conference on Information and Knowledge Management, Proceedings English 2007 We present an efficient realization of the following interactive search engine feature: as the user is typing the query, words that are related to the last query word and that would lead to good hits are suggested, as well as selected such hits. The realization has three parts: (i) building clusters of related terms, (ii) adding this information as artificial words to the index such that (iii) the described feature reduces to an instance of prefix search and completion. An efficient solution for the latter is provided by the CompleteSearch engine, with which we have integrated the proposed feature. For building the clusters of related terms we propose a variant of latent semantic indexing that, unlike standard approaches, is completely transparent to the user. By experiments on two large test-collections, we demonstrate that the feature is provided at only a slight increase in query processing time and index size. 0 0