Term frequency dynamics in collaborative articles
|Author(s)||Nunes S., Ribeiro C., David G.|
|Published in||DocEng2010 - Proceedings of the 2010 ACM Symposium on Document Engineering|
|Keyword(s)||Document dynamics, Term frequency, Wikipedia (Extra: Document collection, Document dynamics, Life span, Public resources, Term Frequency, Web document, Wikipedia, Dynamics, Information retrieval systems, World Wide Web, Information retrieval)|
Term frequency dynamics in collaborative articles is a 2010 conference paper written in English by Nunes S., Ribeiro C., and David G., published in DocEng2010 - Proceedings of the 2010 ACM Symposium on Document Engineering.
Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use Wikipedia as a document collection because it is a broad and public resource and, more importantly, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation we find that term frequency in encyclopedic documents - i.e. comprehensive and focused on a single topic - exhibits a rapid and steady progression towards the document's current version. The content in early versions quickly becomes very similar to the present version of the document. Copyright 2010 ACM.
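The abstract does not state which similarity measure the authors use. As a rough illustration of the idea only, the sketch below assumes cosine similarity over raw term counts and a simple regex tokenizer, and compares each revision with the latest one, ordered by revision number. The function names and the sample revisions are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase word tokens (assumed tokenizer, not the paper's)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_to_current(revisions):
    """Similarity of every revision to the last (current) one, in revision order."""
    current = term_frequencies(revisions[-1])
    return [cosine_similarity(term_frequencies(r), current) for r in revisions]


# Hypothetical revision history, oldest first.
revisions = [
    "Porto is a city.",
    "Porto is a city in Portugal.",
    "Porto is a coastal city in northern Portugal.",
]
scores = similarity_to_current(revisions)
for order, score in enumerate(scores, start=1):
    print(f"revision {order}: {score:.3f}")
```

Plotting such scores against revision order, or against revision timestamps for the date projection, would give the kind of progression curve the abstract describes.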