From WikiPapers

biography is included as a keyword or extra keyword in 1 dataset, 0 tools and 10 publications.


Dataset Size Language Description
CoCoBi 5 MB German CoCoBi is a Corpus of Comparable Biographies in German and contains 400 annotated biographies of 141 famous people. Automatic annotation was done the same way and with the same tools as in WikiBiography. The biographies come from different sources, mainly from Wikipedia and the Brockhaus Lexikon.


There are no tools for this keyword.


Title Author(s) Published in Language Date Abstract R C
Extraction of biographical data from Wikipedia Viseur R. DATA 2013 - Proceedings of the 2nd International Conference on Data Technologies and Applications English 2013 Using the content of Wikipedia articles is common in academic research. However the practicalities are rarely analysed. Our research focuses on extracting biographical information about personalities from Belgium. Our research is divided into three sections. The first section describes the state of the art for data extraction from Wikipedia. A second section presents the case study about data extraction for biographies of Belgian personalities. Different solutions are discussed and the solution adopted is implemented. In the third section, the quality of the extraction is discussed. Practical recommendations for researchers wishing to use Wikipedia are also proposed on the basis of our case study. 0 0
Learning through massively co-authored biographies: Making sense of Steve Jobs on Wikipedia through delegated voice Rughinis C.
Matei S.
Proceedings - 19th International Conference on Control Systems and Computer Science, CSCS 2013 English 2013 This paper discusses opportunities for learning about biographies through Wikipedia, The Free Encyclopaedia. We examine argumentation and interpretation practices in Steve Jobs's entry and its associated Talk pages, focusing on editors' debates on whether Jobs was an 'inventor'. We highlight argumentation from delegated voice as a core element of Wikipedian knowledge building; contributors' variable skills in engaging this NPOV mandated requirement account for their success or failure in promoting changes in page content and structure. Editors' concerns about topic relevance and page structure are particularly vulnerable to counter-argumentation from delegated voice. 0 0
Biographical social networks on Wikipedia: A cross-cultural study of links that made history Aragon P.
David Laniado
Andreas Kaltenbrunner
Yana Volkovich
WikiSym 2012 English 2012 It is arguable whether history is made by great men and women or vice versa, but undoubtably social connections shape history. Analysing Wikipedia, a global collective memory place, we aim to understand how social links are recorded across cultures. Starting with the set of biographies in the English Wikipedia we focus on the networks of links between these biographical articles on the 15 largest language Wikipedias. We detect the most central characters in these networks and point out culture-related peculiarities. Furthermore, we reveal remarkable similarities between distinct groups of language Wikipedias and highlight the shared knowledge about connections between persons across cultures. 0 0
Where's the bio? Databases, Wikipedia, and the web Soules A. New Library World English 2012 Purpose: This paper aims to compare biographical content for literary authors writing in English among Biography Reference Bank, Contemporary Authors Online, Wikipedia, and the web. Design/methodology/approach: A sample of 500 names was gathered from curricula and textbooks used in English courses and searched in the Contemporary Authors Online portion of Literature Resource Center, Biography Reference Bank, Wikipedia, and the web; the results and content were compared. Findings: Each source has core content plus its own unique offerings and specific challenges, as evidenced in searching, evaluative techniques such as authority and currency, and content. Research limitations/implications: This study can only offer a small part of the picture of what information resides where and a single snapshot in time. Practical implications: This study will help librarians decide whether to subscribe to a biographical database. It also reinforces the need for evidence-based practice in librarianship. Originality/value: While the study is only a small part of the picture, it still makes use of a significant sample size to validate/refute assumptions about the availability of biographical information and the sources studied. 0 0
Entre o agrupamento e a comunidade virtual: colaboração e conflitos na edição das biografias dos jogadores “Adriano” e “Ronaldo” na Wikipédia em português Carlos Frederico de Brito d’Andréa XXXIV Congresso Brasileiro de Ciências da Comunicação Portuguese September 2011 9 0
Processos editoriais auto-organizados na Wikipédia em português: a edição colaborativa de "Biografias de Pessoas Vivas" Carlos Frederico de Brito d’Andréa Portuguese September 2011 This dissertation maps and analyzes the dynamics of editing in a sample of articles from the Portuguese version of Wikipedia. We identify and discuss the self-organized and collaborative processes in its editorial network, as well as how editors rewrite the articles over time. The research begins with conceptual considerations about the “encyclopedia that anyone can edit”, focusing on trends in the Portuguese version and specifically on the “Biographies of Living People”, which are characterized by the possibility of including, “in real time”, factual information about the life and work of influential people. The theoretical framework draws on authors from different areas. In Text Linguistics, we discuss the concepts of text (BEAUGRANDE, 1997; COSCARELLI, 2006), textuality (COSTA VAL, 2004), retextualization and rewriting (DELL’ISOLA, 2007; MARCUSCHI, 2000; MATENCIO, 2002). We also discuss the editorial processes and professional activities (such as copy editing) in the “production networks” of books and encyclopedias, especially after the adoption of digital technologies. In chapter 3, we discuss networked editorial production based on the internet and inspired by “hacker culture” and “open source software”. In this context, the most important concepts are “commons-based peer production” (BENKLER, 2006), “The Wisdom of Crowds” (SUROWIECKI, 2007), “produsage” (BRUNS, 2008), “virtual community” and “crowdsourcing” (HAYTHORNTHWAITE, 2009). We also present the relationships between this new model and traditional editorial processes, such as “networked book” and “wiki-journalism”.
After that, we relate networked editorial production to the complexity paradigm and discuss Wikipedia as a complex adaptive system (HOLLAND, 1995; LARSEN-FREEMAN and CAMERON, 2008) that potentially operates through self-organized and emergent dynamics (DEBRUN, 1996a, 1996b; DE WOLF and HOLVOET, 2005). The empirical study of this thesis is based on 91 “Biographies of Living People” about the most influential Brazilian personalities of 2009 according to two national magazines (“Época” and “Isto É”). In the quantitative phase of this work, we extracted data from the articles' history pages using software (WikipediAnalyserPT) developed for this research. After running statistical analyses, we compared the editing processes of these articles using variables such as “total number of edits”, “edits made by groups of editors” (registered, non-registered, administrators and bots), “protections”, “reversions” etc. In the qualitative stage, we detail the editing dynamics of five articles and analyze the rewriting of the texts and the interactions between the editors. Three articles were chosen because their “key variables” are very similar: the biographies of “Franklin Martins” (a journalist who worked in President Lula's government), “Kátia Abreu” (a senator known for defending owners of very large land areas) and “Ricardo Teixeira” (a president of the Brazilian Football Confederation). After that, we analyze the dynamics of two of the most edited articles in the sample: the biographies of the famous soccer players "Adriano Leite Ribeiro" (nicknamed "The Emperor") and "Ronaldo Nazario of Lima" (also known as "The Phenomenon"). In the three intermediate articles, we identified relative stability (caused by a small number of edits per month) interspersed with short periods of more edits and disputes. We also observed that a few editors made almost all the “important” edits.
In the two most edited biographies, we noticed uninterrupted activity by the editors, hundreds of acts of vandalism and many edit wars. Although only a few edits are preserved in these articles as well, we identify an “emergence” pattern characterized by disputes that encourage collaboration among agents. In the conclusion, we discuss the possibilities and challenges of a “wikification” of editorial processes. 60 0
BioSnowball: Automated population of wikis Xiaojiang Liu
Zaiqing Nie
Yu N.
Wen J.-R.
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining English 2010 Internet users regularly have the need to find biographies and facts of people of interest. Wikipedia has become the first stop for celebrity biographies and facts. However, Wikipedia can only provide information for celebrities because of its neutral point of view (NPOV) editorial policy. In this paper we propose an integrated bootstrapping framework named BioSnowball to automatically summarize the Web to generate Wikipedia-style pages for any person with a modest web presence. In BioSnowball, biography ranking and fact extraction are performed together in a single integrated training and inference process using Markov Logic Networks (MLNs) as its underlying statistical model. The bootstrapping framework starts with only a small number of seeds and iteratively finds new facts and biographies. As biography paragraphs on the Web are composed of the most important facts, our joint summarization model can improve the accuracy of both fact extraction and biography ranking compared to decoupled methods in the literature. Empirical results on both a small labeled data set and a real Web-scale data set show the effectiveness of BioSnowball. We also empirically show that BioSnowball outperforms the decoupled methods. 0 0
IT education 2.0 Sabin M.
Leone J.
SIGITE'09 - Proceedings of the 2009 ACM Special Interest Group for Information Technology Education English 2009 Today's networked computing and communications technologies have changed how information, knowledge, and culture are produced and exchanged. People around the world join online communities that are set up voluntarily and use their members' collaborative participation to solve problems, share interests, raise awareness, or simply establish social connections. Two online community examples with significant economic and cultural impact are the open source software movement and Wikipedia. The technological infrastructure of these peer production models uses current Web 2.0 tools, such as wikis, blogs, social networking, semantic tagging, and RSS feeds. With no control exercised by property-based markets or managerial hierarchies, commons-based peer production systems contribute to and serve the public domain and public good. The body of cultural, educational, and scientific work of many online communities is made available to the public for free and legal sharing, use, repurposing, and remixing. Higher education's receptiveness to these transformative trends deserves close examination. In the case of the Information Technology (IT) education community, in particular, we note that the curricular content, research questions, and professional skills the IT discipline encompasses have direct linkages with the Web 2.0 phenomenon. For that reason, IT academic programs should pioneer and lead efforts to cultivate peer production online communities. We state the case that free access and open engagement facilitated by technological infrastructures that support a peer production model benefit IT education. We advocate that these technologies be employed to strengthen IT educational programs, advance IT research, and revitalize the IT education community. 0 0
An unsupervised approach to biography production using Wikipedia Biadsy F.
Hirschberg J.
Elena Filatova
ACL-08: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference English 2008 We describe an unsupervised approach to multi-document sentence-extraction based summarization for the task of producing biographies. We utilize Wikipedia to automatically construct a corpus of biographical sentences and TDT4 to construct a corpus of non-biographical sentences. We build a biographical-sentence classifier from these corpora and an SVM regression model for sentence ordering from the Wikipedia corpus. We evaluate our work on the DUC2004 evaluation data and with human judges. Overall, our system significantly outperforms all systems that participated in DUC2004, according to the ROUGE-L metric, and is preferred by human subjects. 0 0
Efficient time-travel on versioned text collections Berberich K.
Bedathur S.
Gerhard Weikum
Datenbanksysteme in Business, Technologie und Web, BTW 2007 - 12th Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), Proceedings 2007 The availability of versioned text collections such as the Internet Archive opens up opportunities for time-aware exploration of their contents. In this paper, we propose time-travel retrieval and ranking that extends traditional keyword queries with a temporal context in which the query should be evaluated. More precisely, the query is evaluated over all states of the collection that existed during the temporal context. In order to support these queries, we make key contributions in (i) defining extensions to well-known relevance models that take into account the temporal context of the query and the version history of documents, (ii) designing an immortal index over the full versioned text collection that avoids a blowup in index size, and (iii) making the popular NRA algorithm for top-k query processing aware of the temporal context. We present preliminary experimental analysis over the English Wikipedia revision history showing that the proposed techniques are both effective and efficient. 0 0