Sunday, January 16, 2011

Tracking a mediawiki wiki in git

There has been a lot of interest lately in git-backed wikis, like gollum, git-wiki, or gitit. The appeal is twofold: casual editors can make a change easily (without cloning anything, installing git software, having to push, etc.), while a git repository is still there for things like browsing and editing the content offline, merging changes, and the like. The appeal strikes me as particularly strong if the content of the wiki is software (as is envisaged for wheat) or something software-like, such as mathematical proofs (as in wikiproofs).

So a wiki which is backed by git is cool, but what if the wiki is using mediawiki (perhaps the most popular wiki software, in use at wikipedia and other sites)? One might want the various features and extensions of mediawiki, or one might be looking to git-ize a wiki which has been around since before this whole git-based wiki trend got going. One solution I've been playing with is levitation-perl.

First, you get an XML dump file from the wiki (most mediawikis produce these once a day and make them available for download). Then you download levitation-perl and install perl and some packages from CPAN (as documented in levitation-perl's README and slightly elaborated at my fork). Next, you run levitation-perl (again, as described in the README) to import all the mediawiki changes into an empty git repository. Finally, you push this git repository someplace public, clone it locally, and do all the good git stuff you are used to.
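
To give a feel for what an importer of this kind has to do, here is a toy sketch in Python. It is not levitation-perl's code, and I am not claiming levitation-perl works this way internally; it just reads the revisions out of a small mediawiki XML dump and writes them as a stream in the format that git fast-import reads. The function names, the placeholder email address, the branch name, and the title-to-filename rule are all made up for illustration. It also reads the whole dump into memory, which would be hopeless for a big wiki.

```python
import calendar
import time
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the XML namespace prefix that dump files put on every tag."""
    return tag.rsplit("}", 1)[-1]


def revisions(xml_text):
    """Return (unix_time, author, comment, title, text) tuples, oldest first."""
    found = []
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        title = next((c.text or "" for c in page if local(c.tag) == "title"), "")
        for rev in page:
            if local(rev.tag) != "revision":
                continue
            # Flatten the revision's descendants into a name -> text mapping.
            fields = {local(e.tag): (e.text or "") for e in rev.iter()}
            stamp = calendar.timegm(
                time.strptime(fields["timestamp"], "%Y-%m-%dT%H:%M:%SZ"))
            author = fields.get("username") or fields.get("ip") or "unknown"
            found.append(
                (stamp, author, fields.get("comment", ""), title, fields.get("text", "")))
    # One linear history: interleave all pages by edit time.
    return sorted(found, key=lambda r: r[0])


def fast_import_stream(revs, branch="refs/heads/master"):
    """Turn each revision into one commit in git fast-import syntax."""
    out = []
    for stamp, author, comment, title, text in revs:
        message = (comment or "Edit " + title).encode("utf-8")
        body = text.encode("utf-8")
        # Hypothetical filename rule: one file per page, named after the title.
        path = title.replace(" ", "_").replace("/", "%2F")
        out.append(b"commit " + branch.encode("utf-8") + b"\n")
        # Placeholder email; a dump does not contain real addresses.
        out.append(("committer %s <wiki@example.invalid> %d +0000\n"
                    % (author, stamp)).encode("utf-8"))
        out.append(b"data %d\n" % len(message) + message + b"\n")
        out.append(("M 644 inline %s\n" % path).encode("utf-8"))
        out.append(b"data %d\n" % len(body) + body + b"\n")
    return b"".join(out)
```

If you write the result of `fast_import_stream(revisions(dump_text))` to standard output as bytes, you can pipe it into `git fast-import` inside a freshly initialized repository. The sketch only covers that first, empty-repository import; adding to an existing history would need more care.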

At least for me, moving changes from git to the wiki is manual, although that is perhaps in part a function of how wikiproofs works: the changes are checked for correctness as you edit the pages on the wiki, and I haven't yet bothered to try to install the correctness-checking software locally.

When the wiki changes and you want to get all those changes into git, get a new XML dump and run levitation-perl again. You don't need to run it in an empty git repository this time; I just run it in the git repository that I used for the previous levitation-perl run (although when I want to check out files, I clone this repository elsewhere).

Levitation-perl hasn't gotten much attention lately, I think due to technical problems using it for something as large as wikipedia (and perhaps policy questions of what to use it for on wikipedia). But none of that really affects my use of it on wikiproofs, because wikiproofs is way smaller than wikipedia, and because my goal is mostly just to have a way to work on wikiproofs when I don't have good internet connectivity.