From WikiPapers

parser is included as keyword or extra keyword in 0 datasets, 4 tools and 1 publication.


There are no datasets for this keyword.


| Tool | Operating System(s) | Language(s) | Programming language(s) | License | Description | Image |
|---|---|---|---|---|---|---|
| Alternative MediaWiki parsers | Cross-platform | English | PHP | | Alternative parsers is a compilation of various alternative MediaWiki parsers that are able or intended to translate MediaWiki's text markup syntax into something else. | |
| Wiki2XML parser | Cross-platform | English | Python | | Wiki2XML parser parses a Wikipedia dump file into well-structured XML. | |
| Wikiq | | | C++ | | wikiq is a simple and fast stream-based MediaWiki XML dump parser. | |
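To illustrate what "stream-based" means for a MediaWiki XML dump parser, here is a minimal Python sketch. It is not wikiq's implementation (wikiq is written in C++). The function names `iter_pages` and `local_name` and the tiny sample dump are assumptions made for illustration only. The idea is to handle one `<page>` element at a time as it is read, instead of loading the whole dump into memory.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, revision_count) for each <page>, one page at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = None
        revisions = 0
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        yield title, revisions
        elem.clear()  # discard the finished page's children to limit memory use


# Hypothetical, heavily simplified dump used only for this example.
SAMPLE_DUMP = b"""<mediawiki>
  <page>
    <title>Parser</title>
    <revision><id>1</id><text>first</text></revision>
    <revision><id>2</id><text>second</text></revision>
  </page>
  <page>
    <title>Wikitext</title>
    <revision><id>3</id><text>only</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE_DUMP)))
print(pages)
```

Real dumps declare an XML namespace on the root element, which is why the sketch compares local tag names rather than full tags.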


| Title | Author(s) | Published in | Language | Date | Abstract | R | C |
|---|---|---|---|---|---|---|---|
| Design and implementation of the Sweble wikitext parser: Unlocking the structured data of Wikipedia | Hannes Dohrn, Dirk Riehle | WikiSym 2011 Conference Proceedings - 7th Annual International Symposium on Wikis and Open Collaboration | English | 2011 | The heart of each wiki, including Wikipedia, is its content. Most machine processing starts and ends with this content. At present, such processing is limited, because most wiki engines today cannot provide a complete and precise representation of the wiki's content. They can only generate HTML. The main reason is the lack of well-defined parsers that can handle the complexity of modern wiki markup. This applies to MediaWiki, the software running Wikipedia, and most other wiki engines. This paper shows why it has been so difficult to develop comprehensive parsers for wiki markup. It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most parsers used no grammars or grammars poorly suited to the task. Using this parser it is possible to directly and precisely query the structured data within wikis, including Wikipedia. The parser is available as open source from | 0 | 0 |
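The abstract above mentions parsing expression grammars (PEGs). As a rough illustration of the PEG idea of ordered choice, here is a small Python sketch for a tiny fragment of wikitext (bold, italic, and internal links). It is not the Sweble grammar or its API. The rule set, the function names (`parse_delimited`, `parse_link`, `parse_inline`), and the tuple-based node format are all assumptions made for this example.

```python
def parse_delimited(text, pos, marker, kind):
    """Rule: marker (!marker .)+ marker. Returns (node, next_pos) or None."""
    if not text.startswith(marker, pos):
        return None
    start = pos + len(marker)
    end = text.find(marker, start)
    if end <= start:  # closing marker missing (-1) or body empty
        return None
    return (kind, text[start:end]), end + len(marker)


def parse_link(text, pos):
    """Rule: "[[" target ("|" label)? "]]". Returns (node, next_pos) or None."""
    if not text.startswith("[[", pos):
        return None
    end = text.find("]]", pos + 2)
    if end <= pos + 2:  # closing brackets missing or link body empty
        return None
    target, _sep, label = text[pos + 2:end].partition("|")
    return ("link", target, label or target), end + 2


def parse_inline(text):
    """Rule: (bold / italic / link / char)*, tried in that order."""
    rules = (
        # Bold must come before italic because "'''" begins with "''".
        lambda t, p: parse_delimited(t, p, "'''", "bold"),
        lambda t, p: parse_delimited(t, p, "''", "italic"),
        parse_link,
    )
    nodes, pos = [], 0
    while pos < len(text):
        for rule in rules:  # ordered choice: the first matching rule wins
            result = rule(text, pos)
            if result is not None:
                node, pos = result
                nodes.append(node)
                break
        else:
            # Fallback alternative: consume one character as plain text,
            # merging it into a preceding text node when there is one.
            if nodes and nodes[-1][0] == "text":
                nodes[-1] = ("text", nodes[-1][1] + text[pos])
            else:
                nodes.append(("text", text[pos]))
            pos += 1
    return nodes


tree = parse_inline("See '''Sweble''' and [[MediaWiki|the engine]].")
print(tree)
```

Because the alternatives are tried in a fixed order and the first success is taken, the sketch never has to resolve an ambiguity between bold and italic. Unclosed markup such as `[[broken` simply falls through to the plain-text alternative.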