List of GNU/Linux tools

From WikiPapers
See also: List of tools.

This is a list of GNU/Linux tools available in WikiPapers. Currently, the table below lists 45 tools for this operating system.

To create a new "tool" go to Form:Tool.

Tools

Tool Keyword(s) Language(s) Programming language(s) License Description Image
AVBOT English
Spanish
Python GPL AVBOT is an anti-vandalism bot for the Spanish Wikipedia. It uses regular expressions and scores to detect vandalism. Avbot logo.png
Alternative MediaWiki parsers English PHP
Java
Ruby
Python
JavaScript
C++
Perl
Haskell
Alternative parsers is a compilation of various alternative MediaWiki parsers which are able or intended to translate MediaWiki's text markup syntax into something else.
AssessMediaWiki Spanish PHP AssessMediaWiki is an open-source web application that, connected to a MediaWiki installation, supports hetero-, self- and peer-assessment procedures while keeping track of the compiled assessment data. Supervisors can thus obtain reports that help them assess students.
Authorship Tracking None Python BSD License Authorship Tracking This code implements the algorithms for tracking the authorship of text in revisioned content that were published at WWW 2013: http://www2013.wwwconference.org/proceedings/p343.pdf

The idea is to attribute each portion of text to the earliest revision in which it appeared. For instance, if a revision contains the sentence "the cat ate the mouse", and the sentence is deleted and then reintroduced in a later revision (not necessarily as part of a revert), it is still attributed to its earliest author once reintroduced.

Precisely, the algorithm takes a parameter N. If a sequence of tokens of length equal to or greater than N has appeared before, it is attributed to its earliest occurrence. See the paper for details.

The code works by building a trie-based representation of the whole history of the revisions, in an object of the class AuthorshipAttribution. Each time a new revision is passed to the object, the object updates its internal state and computes the earliest attribution of the new revision, which can then be easily obtained. The object itself can be serialized (and deserialized) using JSON-based methods.

To keep the representation of the whole past history from growing too much, we remove from the object the information about content that has been absent from revisions (a) for at least 90 days, and (b) for at least 100 revisions. These are configurable parameters. With these choices, for Wikipedia, the serialized object is typically between 10 and 20 times the size of a typical revision, even for pages with very long revision lists. See the paper for detailed experimental results.
Catdown English PHP Catdown is a tool to download images in Wikimedia Commons categories.
ClueBot C
C++
Python
PHP
Bash
ClueBot is an anti-vandalism bot for the English Wikipedia.
Commons explorer English Python
PHP
GPL Commons explorer is a map-based tool for exploring Wikimedia Commons multimedia files by location and year.
CryptoDerk's Vandal Fighter English Java Open source
Dump-downloader Perl Apache License 2.0 dump-downloader is a script to request and download the full history dump of all the pages in a MediaWiki wiki. It is meant to work with Wikia wikis, but it could work with other wikis. Source code here: https://github.com/Grasia/wiki-scripts/tree/master/wikia_dump_downloader
Igloo JavaScript Open source
Ikiwiki English Ikiwiki supports storing a wiki as a Git repository.
Images for biographies English Python
PHP
GPL Images for biographies is a tool that suggests images for biographies in several Wikipedias.
Infobox2rdf English Perl GPL v3 infobox2rdf generates huge RDF datasets from the infobox data in Wikipedia dump files.
JWordNet-Similarity English Java
Java Wikipedia Library English Java LGPL Java Wikipedia Library is an application programming interface that provides access to all information in Wikipedia.
MediaWiki Utilities English Python MIT license MediaWiki Utilities is a collection of utilities for working with XML data dumps generated for Wikimedia projects and other MediaWiki wikis.
Natural Language Toolkit English Python
Perlwikipedia English Perl GPL v3 perlwikipedia is a high-level bot framework for interacting with MediaWiki wikis.
Python-wikitools English Python GPL v3
Pywikipediabot English Python MIT license pywikipediabot is a wiki robot framework. It includes many functions for interacting with a MediaWiki wiki. It uses the MediaWiki API when available.
STiki English Java GPL STiki is an anti-vandalism tool that consists of server-side detection algorithms and a client-facing GUI. STiki logo.png
Sioc MediaWiki English Sioc MediaWiki is an RDF exporter for MediaWiki wikis.
StatMediaWiki English Python GPLv3 StatMediaWiki is a project that aims to create a tool to collect and aggregate information available in a MediaWiki installation. Results are static HTML pages including tables and graphics that can help to analyze the wiki status and development, or a CSV file for custom processing. General hour activity-wikihaskell.png
Twinkle English JavaScript
Vandal Fighter English Java Vandal Fighter - Live RC.png
VandalSniper English Mono
Weka Java GPL
Wiki Category Matrix Visualization English Java Educational Community License Wiki Category Matrix Visualization is a tool that generates a visual representation of data sizes across topics of a multi-level category hierarchy in matrix form. It provides a "big picture" overview of topics in terms of categorization. Matrix-visualization-simplewiki.png
Wiki Edit History Analyzer English Java Wiki Edit History Analyzer processes the MediaWiki revision history and produces summaries of edit actions performed. Basic edit actions include insert, delete, replace, and move; high-level edit actions include spelling correction, wikify, etc.
Wiki Loves Monuments map English Python
HTML
PHP
GPLv3 Wiki Loves Monuments map is a map with geolocated monuments that require images. This map was used in the Wiki Loves Monuments contest.
Wiki2XML parser English Python Wiki2XML parser parses Wikipedia dump files into well-structured XML.
WikiAudit English Java GPL WikiAudit is a tool that, given a MediaWiki wiki location and a set or range of IP addresses, produces a report of the edit history from those IPs. Cheap heuristics try to identify malicious behavior. It is useful for network administrators and for conducting security investigations.
Wikichron English Python Affero GPL (code) WikiChron is a web tool for the analysis and visualization of the evolution of wiki online communities. It uses processed data from the history dumps of MediaWiki wikis, computes different metrics on this data and plots them in interactive graphs. It allows different wikis to be compared in the same graphs.

This tool will serve researchers in the task of inspecting the behavior of collaborative online communities, in particular wikis, and generating research hypotheses for further and deeper studies. WikiChron has been designed to be very easy to use and highly interactive from the very beginning. It comes with a set of already downloaded and processed wikis from Wikia (but any MediaWiki wiki is supported), and with more than thirty metrics to visualize and compare between wikis.

Moreover, it can be useful for wiki administrators who want to see, analyze and compare how the activity on their wikis is going.

WikiChron is available online here: http://wikichron.science
WikiEvidens English Python GPLv3 WikiEvidens is a visualization and statistical tool for wikis. Wikievidens0.0.6.png
WikiPrep English Perl GPL v2 WikiPrep is a Perl script for preprocessing Wikipedia XML dumps.
WikiSim English Java University of Edinburgh GNU license WikiSim is a knowledge collection and curation simulator.
WikiTeam tools English Python WikiTeam tools is a set of tools focused on wiki preservation and backups.
WikiVis (UM) English Java Educational Community License
Version 2.0
WikiVis (UM) provides an interactive visualization of the Wikipedia information space, primarily as a means of navigating the category hierarchy as well as the article network. The project is implemented in Java, utilizing the Java 3D package. WikiVis-UM-Logo.jpg
Wikia-census Python
Jupyter Notebooks
wikia-census is a script that generates a census of all Wikia wikis.

The collected census and its analysis are available here: https://www.kaggle.com/abeserra/wikia-census/

Source code here: https://github.com/Grasia/wiki-scripts/tree/master/wikia_census
Wikimedia counter Python
PHP
GPL Wikimedia counter is a near real-time edit counter for all Wikimedia projects. Wikimedia projects edits counter 2010-04-16.png
Wikipedia Extractor English Python GPL v3
Wikipedia-map-reduce English Java Apache License 2.0 Wikipedia-map-reduce is a Java software library that allows analysis of Wikipedia at the revision-text level.
Wikiswarm English Java Wikiswarm generates code_swarm event logs from the Wikipedia API.
Wikokit Java EPLv1.0
LGPLv2.1
GPLv2
ALv2.0
New BSD License
wikokit (wiki tool kit) - several projects related to wiki.

wiwordik - machine-readable Wiktionary. A visual interface to the parsed English Wiktionary and Russian Wiktionary databases.
Java WebStart application + JavaFX, English interface.
742 languages extracted from the English Wiktionary.

423 languages extracted from the Russian Wiktionary.
Wiwordik-en.0.09.1094 scrollbox.jpg
Zawilinski Java Zawilinski is a Java library that supports the extraction and analysis of grammatical data in Wiktionary.
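
Example: earliest-revision attribution

The Authorship Tracking entry above describes attributing any token sequence of length N or more to the earliest revision in which it appeared. The following is a minimal sketch of that idea, not the tool's actual code. It assumes whitespace tokenization and increasing integer revision ids, replaces the trie with a plain dictionary of N-grams, and omits the pruning of old content. The class name NgramAttribution and its method add_revision are hypothetical and are not the AuthorshipAttribution API.

```python
from typing import Dict, List, Tuple


class NgramAttribution:
    """Toy tracker (hypothetical): maps each N-gram to the first revision containing it."""

    def __init__(self, n: int = 3) -> None:
        self.n = n
        # N-gram of tokens -> id of the earliest revision in which it appeared
        self.first_seen: Dict[Tuple[str, ...], int] = {}

    def add_revision(self, rev_id: int, text: str) -> List[int]:
        """Return, for each token of the new revision, the revision it is attributed to.

        Assumes rev_id values increase over time, so min() picks the earliest one.
        """
        tokens = text.split()  # assumption: naive whitespace tokenization
        origin = [rev_id] * len(tokens)  # by default, tokens belong to this revision
        for i in range(len(tokens) - self.n + 1):
            gram = tuple(tokens[i:i + self.n])
            # Record the N-gram if it is new; otherwise fetch its earliest revision.
            earliest = self.first_seen.setdefault(gram, rev_id)
            for j in range(i, i + self.n):
                origin[j] = min(origin[j], earliest)
        return origin


tracker = NgramAttribution(n=3)
tracker.add_revision(1, "the cat ate the mouse")
tracker.add_revision(2, "a dog barked")  # the sentence is deleted
# The sentence is reintroduced in revision 3 and is still attributed to revision 1.
result = tracker.add_revision(3, "a dog barked and the cat ate the mouse")
print(result)  # [2, 2, 2, 3, 1, 1, 1, 1, 1]
```

In this sketch only the word "and" is attributed to revision 3, because every N-gram containing it is new. Sequences shorter than N are always attributed to the revision that introduces them.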