Wednesday, December 07, 2011


Sometimes you pick a programming idiom because it is what you are familiar with, because you think it is expected, or because it expresses clearly what the code you are writing is trying to do. Other times, it is just too hard to resist. Lately at work, at least two of us have seen .count.count in our rails3 code, and at first we were sure it must be a typo. The real story is more fun than that; see the nerdfeed blog for more.

Thursday, July 28, 2011

Using active record in rails migrations

Most rails developers have probably run into this problem sooner or later: if your migrations refer to active record classes and those classes change out from under the migration, old migrations won't work as desired any more. Whether this is a big problem or a minor annoyance depends on how often you run migrations, how many databases you have (typically one for each developer and one or more you deploy to), and so on, but I've seen the problem even over the course of three developer machines and a day or two, when some refactoring left people unable to update their code and then run an only-slightly-older migration.

One solution, advocated in the "Data migrations" section of Code review: Ruby and Rails idioms, is just to fall back to writing migrations in SQL, bypassing active record (with the exception of the low-level parts of active record which connect to the database). This has two problems. The first is that active record doesn't help you much with this kind of low-level SQL construction. The example in that blog post uses string interpolation to construct SQL, which they can get away with in that example (because the columns are integers), but which blows up as soon as the quoting isn't correctly handled (in a migration, this is probably just a bug rather than a security hole, but search for "SQL injection" if you are unfamiliar with the problems). The second problem is that active record is simply a more expressive way to manipulate data. How many people use script/console rather than script/dbconsole to look around the database?
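To make the quoting point concrete, here is a small sketch in plain Ruby. The table and column names are made up, and the quote method below is a simplified stand-in for what an active record connection adapter (connection.quote) would do for you; real adapters handle more cases.

```ruby
# Why hand-interpolated SQL is fragile: a value containing an apostrophe
# breaks the statement (table/column names here are hypothetical).
name = "O'Brien & Sons"

# Naive interpolation: the embedded apostrophe terminates the string literal.
naive = "UPDATE vendors SET name = '#{name}' WHERE id = 7"
# => "UPDATE vendors SET name = 'O'Brien & Sons' WHERE id = 7"  (malformed SQL)

# Simplified stand-in for the adapter's quoting (real adapters do more):
def quote(value)
  "'" + value.gsub("'", "''") + "'"
end

safe = "UPDATE vendors SET name = #{quote(name)} WHERE id = 7"
# => "UPDATE vendors SET name = 'O''Brien & Sons' WHERE id = 7"
```

In a migration that falls back to SQL, every interpolated value needs this treatment, which is exactly the sort of bookkeeping active record normally does for you.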

My recommended solution, also advocated in How to use models in your migrations (without killing kittens), is to define the classes within the migration. There's an example in that blog post, but the short summary is that if, for example, your migration wants to refer to Vendor, you put "class Vendor < ActiveRecord::Base; end" within the migration class. In some cases you might need to define a few has_many or belongs_to relationships (make sure to set class_name to refer to the migration-specific class), but the interesting (and surprising to me) thing is that in practice you don't need very many of them. To give a few examples of what this gets you, think of things like calling find_or_create_by_name to skip creating a record if it already exists, or looking up an object by name and then using its ID in a subsequent SQL statement. If you are thinking "but I can do that in SQL", then I'm not sure I should try to convince you. But if you are thinking "yeah, that is easier / more-concise / more-readable in active record", then defining your classes in the migration gets you that expressiveness, and also lets you run migrations even after your code has continued to evolve.
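Here is a sketch of why the pattern works. The class and method names are made up, and the stand-in base classes below let the example run without Rails; in a real migration, Vendor would inherit from ActiveRecord::Base inside a class inheriting from ActiveRecord::Migration. The key point is constant resolution: code in the migration finds the migration-local Vendor, not the app's model.

```ruby
class ModelBase; end                     # stand-in for ActiveRecord::Base

class Vendor < ModelBase                 # the app's model; free to evolve
  def self.find_or_create_by_name(name)
    raise "app model changed out from under the migration!"
  end
end

class AddDefaultVendor                   # stand-in for ActiveRecord::Migration
  class Vendor < ModelBase               # migration-local, frozen in time
    def self.find_or_create_by_name(name)
      "created or found #{name}"         # real active record would hit the db
    end
  end

  def self.up
    # Resolves to AddDefaultVendor::Vendor, so later refactorings of
    # app/models/vendor.rb cannot break this migration.
    Vendor.find_or_create_by_name("Acme")
  end
end

AddDefaultVendor.up   # => "created or found Acme"
```

Because Ruby looks up constants lexically before searching outward, the nested Vendor shadows the top-level one only inside the migration, which is exactly the isolation you want.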

Tuesday, June 28, 2011

Celebrate tau day with a few proofs

What better way to celebrate Tau Day than by trying your hand at writing a few proofs? Wikiproofs is a wiki which anyone can edit, and the goal is to build a library of proofs. They are written in a formal language, so the web site can check their correctness, and that is what makes it good for the tau day exercise: you get feedback about whether your proof is correct from the site as you go. The tau day exercises are intended to be of modest length and difficulty (so you can get them done on tau day) and explain everything you need to know about wikiproofs. Go to Wikiproofs tau day to play.

Friday, June 17, 2011

Help me proofread tau day exercises

In honor of tau day (a math holiday celebrated on June 28 every year), I have written some exercises at tau day. If you have a little time to go to that page and try to go through the exercises, I'd appreciate your feedback on writing style, whether they are too easy or too hard, whether they were too long or too short, and any other suggestions. I'm hoping this will be a fun game/exercise for anyone interested in math but I am trying to make it accessible to someone with no experience in formal proofs.

If you are here to read about programming, I intend to keep writing about that too, but the math proofs have been a good part of my hobby activities lately. It is kind of like programming anyway (processed by a computer, needs to follow pretty specific rules to work, can be addictive in some of the same ways).

Sunday, May 15, 2011

Scripting math proofs with RHilbert

Formal math projects like wikiproofs prove mathematical theorems in a way that a computer can verify. There could be several motivations for this, including finding/preventing errors in proofs, helping learners to understand a proof, or exploring the consequences of assuming a different set of axioms.

Wikiproofs and related projects like metamath require that the person writing the proof spell it out pretty explicitly. For example, if you have 1 + 2 = 3 and you need 2 + 1 = 3, you'll need to explicitly transform one into the other. Other proof systems, like coq and isabelle, have a fairly powerful prover which can notice that you have 1 + 2 = 3 and a + b = b + a and combine those to prove 2 + 1 = 3.
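The difference in styles can be sketched in Lean 4 (a system of the coq/isabelle flavor): in the explicit style you name the commutativity step yourself, while in the automated style a decision procedure finds the steps for you.

```lean
-- Explicit style (the metamath/wikiproofs flavor): spell out the transformation.
example (h : 1 + 2 = 3) : 2 + 1 = 3 := by
  rw [Nat.add_comm]  -- turns the goal 2 + 1 = 3 into 1 + 2 = 3
  exact h

-- Automated style: let a decision procedure for arithmetic find the steps.
example : 2 + 1 = 3 := by omega
```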

Enough background. What I've been playing with lately is a project I just started and which I am calling Hilbert. This is a marriage of a metamath-like proof engine (in this case hilbert-kernel) and a generic scripting language (in this case Ruby). Writing a full prover in Ruby is of course one direction this could eventually head, but I was thinking more in terms of simpler kinds of automation (perhaps there could be a routine called "commute as needed" which would be able to turn "1 + 4 = 3 + 2" into "4 + 1 = 3 + 2" by noticing the left hand side needs to be flipped and the right hand side doesn't). I'm hoping this system will be easy for people who find Ruby more comfortable than coq or isabelle (I might count myself among them, although of course I reserve the right to learn coq or isabelle some day). It also may have other benefits, like making it easier to develop hilbert-kernel itself (for example by running hilbert-kernel tests).
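To make the "commute as needed" idea concrete, here is a toy sketch in plain Ruby. This is not the actual hilbert API; the expression representation and the helper names are made up for illustration. It represents each expression as a nested array and flips only the side of the equation that fails to match the goal.

```ruby
# Toy sketch (hypothetical, not the real hilbert API) of "commute as needed".
# Expressions are nested arrays: [:eq, lhs, rhs], with [:+, a, b] for sums.

def commute(expr)
  op, left, right = expr
  [op, right, left]   # addition commutes, so swapping operands is safe
end

def commute_as_needed(known, goal)
  k_op, k_lhs, k_rhs = known
  g_op, _g_lhs, _g_rhs = goal
  return nil unless k_op == g_op
  # Flip each side of the known equation only if it doesn't already match.
  lhs = (k_lhs == goal[1]) ? k_lhs : commute(k_lhs)
  rhs = (k_rhs == goal[2]) ? k_rhs : commute(k_rhs)
  result = [k_op, lhs, rhs]
  result == goal ? result : nil   # nil if commuting alone can't bridge the gap
end

known = [:eq, [:+, 1, 4], [:+, 3, 2]]   # 1 + 4 = 3 + 2
goal  = [:eq, [:+, 4, 1], [:+, 3, 2]]   # 4 + 1 = 3 + 2
commute_as_needed(known, goal)          # => goal; only the left side is flipped
```

A real version would of course emit proof steps for the kernel to check rather than just rewriting the data structure, but the control flow (Ruby deciding which side needs commuting) is the kind of automation I have in mind.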

Update 20 May 2011: this is under active development, but the change that prompted this update is that I renamed the project from RHilbert to hilbert.

Tuesday, March 22, 2011

Ruby rescue gotcha

Today in the codebase at my day job, I found a particularly cute bug related to "rescue" in Ruby. This isn't an unknown gotcha–I've read about it on the net at least once–but this is a particularly sweet (or devious) manifestation, and as far as I can tell it was purely accidental, not a contrived example.

The following example uses rspec, but the key thing is begin; 2 / 0 rescue NoMethodError; end versus begin; 2 / 0; rescue NoMethodError; end–what is the difference between these two statements?

require File.expand_path('spec_helper', File.dirname(__FILE__))

describe "enigma" do
  it "fails, but why" do
    lambda { begin; 2 / 0 rescue NoMethodError; end }.
      should raise_error(ZeroDivisionError)
  end

  it "passes, but why" do
    lambda { begin; 2 / 0; rescue NoMethodError; end }.
      should raise_error(ZeroDivisionError)
  end
end

I'll post the answer in a comment in a few days; feel free to post your answers as comments if you wish.

Thursday, March 03, 2011

Levitation-perl and deleted files

Ever since I started using levitation-perl, I was curious about how it handles deleted files. Well, I found out (the hard way). The symptom was that I did a routine merge and a lot of files showed up as conflicts (including ones which hadn't been edited, on the wiki or on the git side, for a long time). This is usually a symptom of git not being able to find a reasonable common ancestor.

It turned out that any files which had been deleted in mediawiki caused what amounts to a rewrite of the git history (as if those files had never been created). This is not specific to mediawiki/levitation; the same kind of thing would happen in pure git if you ran git filter-branch or similar mechanisms to delete a file from git and make sure it was erased from the history (see for example Rewriting History or The trouble with firmware).
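The rewrite effect is easy to reproduce in pure git. In this sketch the repository and file names are made up; git filter-branch is the classic tool for this (git itself now steers people toward git filter-repo, but filter-branch illustrates the point). Note how even a commit that never touched the deleted file gets a new hash:

```shell
# Demonstrate that erasing a file from history rewrites every commit.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

echo keep > keep.txt
echo secret > deleted-later.txt
git add . && git commit -qm 'initial import'
echo more >> keep.txt
git add . && git commit -qm 'unrelated edit'
before=$(git rev-parse HEAD)

# Erase deleted-later.txt from all of history:
git filter-branch -f --index-filter \
  'git rm --cached --ignore-unmatch deleted-later.txt' HEAD

after=$(git rev-parse HEAD)
# Even the 'unrelated edit' commit has a new hash now, so the rewritten
# history shares no recent common ancestor with the old one.
test "$before" != "$after" && echo 'history rewritten'
```

This is exactly why the merge base jumped back two years: every commit after the deleted file's creation was replaced by a new one.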

The consequence of the rewritten history for my merge was that the common ancestor was very old (prior to when the deleted file was first created, about two years ago in my case), rather than a few days old as would be the case without the history rewrite.

How did I get my merge done? In my specific case, the content of the deleted files was nothing sensitive, so I was fine with having them remain in the git history. If they needed to stay gone from the history, I would probably have needed to follow RECOVERING FROM UPSTREAM REBASE in the git-rebase manpage.

My solution was to create a merge commit whose parents are the two corresponding commits in the rewritten and non-rewritten history. Since the git repository I was using is public, you can follow along. The commands here are somewhat edited (I've snipped out my dead ends and multiple invocations of gitk to see what I had at each step).

[browse gitk history to find that 681b7936 is one of the commits
with message "add domain and Domain"]
$ git checkout 681b793629d88729b919f40d0862884147db0d8d
Note: checking out '681b793629d88729b919f40d0862884147db0d8d'.

You are in 'detached HEAD' state. . . .
HEAD is now at 681b793... add domain and Domain
$ git checkout -b withdeletedfiles
Switched to a new branch 'withdeletedfiles'
$ gitk levitation/master&
[browse history to find that a6d0885... is the commit with message
"add domain and Domain"]
$ git checkout a6d0885
Note: checking out 'a6d0885'.

You are in 'detached HEAD' state. . . .

HEAD is now at a6d0885... add domain and Domain
$ git checkout -b withoutdeletedfiles
Switched to a new branch 'withoutdeletedfiles'
$ git merge -s ours withdeletedfiles
Merge made by ours.
$ git diff --stat -w withdeletedfiles withoutdeletedfiles
Main/W/P/.3a/WP:INDEX | 2 --
Wikiproofs/S/u/b/Subject index | 30 ------------------------------
2 files changed, 0 insertions(+), 32 deletions(-)
$ git checkout master
Switched to branch 'master'
$ git merge withoutdeletedfiles
Removing Main/W/P/.3a/WP:INDEX
Removing Wikiproofs/S/u/b/Subject index
Merge made by recursive.
Main/W/P/.3a/WP:INDEX | 2 --
Wikiproofs/S/u/b/Subject index | 30 ------------------------------
2 files changed, 0 insertions(+), 32 deletions(-)
delete mode 100644 Main/W/P/.3a/WP:INDEX
delete mode 100644 Wikiproofs/S/u/b/Subject index
[1]+ Done gitk
$ gitk&
[1] 3233
$ git show 01a25538177dbe768e130aa94f7d49be11a63733
commit 01a25538177dbe768e130aa94f7d49be11a63733
Merge: 4ba316c e908dc8
Author: Jim Kingdon
Date: Tue Mar 1 20:40:57 2011 -0500

Merge branch 'withoutdeletedfiles'

$ git diff --stat -w e908dc8..master
Interface/S/e/t/Set theory | 13 +-
Main/O/u/t/Out lines | 311 +++++++++++
[and the rest of the diffs I'd expect from the wiki to git]
22 files changed, 4626 insertions(+), 47 deletions(-)
$ git diff --stat -w 4ba316c..master
Main/W/P/.3a/WP:INDEX | 2 --
Wikiproofs/S/u/b/Subject index | 30 ------------------------------
2 files changed, 0 insertions(+), 32 deletions(-)
$ git push

Saturday, February 26, 2011

Four hour agile subcommittee meeting

I don't know whether to file this under "what happens when something becomes a buzzword", "people who don't get it", or "baby steps, baby steps", but today at standup (for an organization not traditionally agile, but whose avowed intention is to become agile), someone casually mentioned the words "four hour agile subcommittee meeting". It is one of the most wonderfully oxymoronic phrases I've heard in a while.

Sunday, January 16, 2011

Tracking a mediawiki wiki in git

There has been a lot of interest lately in git-backed wikis, like gollum, git-wiki, or gitit. It is appealing both to make it easy for casual editors to make a change (without cloning anything, installing git software, having to push, etc) and to provide a git repository for things like browsing and editing the content offline, being able to merge changes, and the like. The appeal strikes me as particularly strong if the content of the wiki is software (as is envisaged for wheat) or something software-like, such as mathematical proofs (as in wikiproofs).

So a wiki which is backed by git is cool, but what if the wiki is using mediawiki (perhaps the most popular wiki software, in use at wikipedia and other sites)? One might want the various features and extensions of mediawiki, or one might be looking to git-ize a wiki which was around before this whole git-based wiki trend got going. One solution I've been playing with is levitation-perl.

First you get an XML dump file from the wiki (most mediawikis produce these once a day and make them available for download). Then you download levitation-perl and install perl and some packages from CPAN (as documented in levitation-perl's README and slightly elaborated at my fork). Then you can run levitation-perl (as described in the README, again) to import all the mediawiki changes into an empty git repository. Finally, you can push this git repository someplace public, clone it locally, and do all the good git stuff you are used to.

At least for me, moving changes from git to the wiki is manual, although that is perhaps in part a function of how wikiproofs works–the changes are checked for correctness as you edit the pages on the wiki, and I haven't yet bothered to try to install the correctness-checking software locally.

When the wiki changes and you want to get all those changes into git, get a new XML dump and run levitation-perl again. You don't need to run it in an empty git repository this time–I just run it in the git repository that I used for the previous levitation-perl run (on the other hand, when I want to check out files, I clone this repository elsewhere).

Levitation-perl hasn't gotten much attention lately, I think due to technical problems using it for something as large as wikipedia (and perhaps policy questions of what to use it for on wikipedia). But none of that really affects my use of it on wikiproofs, because wikiproofs is way smaller than wikipedia, and because my goals are mostly just wanting a way to work on wikiproofs when I don't have good internet access.