Tuesday, December 19, 2006

Database upgrades

Well, my project for the last few weeks has been database upgrades. This concept has been popularized by rails and to a certain extent the Database refactoring book, but it is something that just about any project could do, without a whole lot of mechanism. It only took us (myself and Stelligent) a few days to implement an automated upgrade scheme.

I guess the best way to describe the design is just to point to the Wiki page.

Some of the key decisions were/are:

(1) sql vs java. Is an upgrade script just a .sql file, or is it written in a real programming language like Java/ruby/perl/etc? We went with the .sql approach because it seems like the Simplest Thing Which Could Possibly work, and it seems like introducing a programming language could get needlessly complicated and/or hard to understand. Having said that, I do realize that we may someday hit a situation where it is really awkward to process data via SQL. But I'd rather cross that bridge when we come to it, given how many upgrades so far have been very simple (insert default value into new column, create new table with no rows, etc).

(2) Whether to provide upgrade scripts for every checkin which changes the database schema, or on coarser boundaries like releases? The latter is
sufficient if the goal of this exercise is merely to allow production users to upgrade. Requiring developers to blow away their test data can be done, but the real question I have there is how is a developer supposed to know when they need to do this. I've been on projects where a significant amount of the chatter in the team room is "is anyone else seeing a failure x?" "rebuild your database". This doesn't really seem great even if everyone is in the same room, and on a distributed project like MIFOS, it seems to me quite important to at least detect an out of date database on developer machines (and once you've done that, might as well just do the upgrades).

(3) Whether to provide downgrade scripts. The purpose, I guess, is to let someone try out a new version of the software knowing they can always go back if it didn't work out. Given all we are trying to do in terms of having tests, continuous integration, etc, to make upgrades go smoothly without needing to roll back, I'm not sure how much effort to put into downgrades. On the other hand, it may be hubris to think downgrades will never be needed.

(4) What tests to build. The interesting tests in the MIFOS case are in LatestTest. There are tests of the upgrade scripts (checking against latest-*.sql), the downgrade scripts (checking each one undoes the corresponding upgrade script). We're also thinking about some tests which would test that the upgrade scripts properly upgrade existing user-supplied data (as opposed to just schemas and data supplied with MIFOS). But this post is getting long already, so I'll leave that for another time (or for people to ask about).

No comments: