<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-37120426</id><updated>2011-12-07T20:37:38.732-08:00</updated><category term='versioning git levitation mediawiki'/><title type='text'>Jim Kingdon on Programming</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>41</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-37120426.post-319602668458655756</id><published>2011-12-07T20:32:00.001-08:00</published><updated>2011-12-07T20:37:38.742-08:00</updated><title type='text'>count.count</title><content type='html'>Sometimes you pick a programming idiom because it is what you are familiar with, because you think it is expected, or because it expresses clearly what the code you are writing is trying to do. Other times, it is just too hard to resist. 
Lately at work at least two of us have seen &lt;code&gt;.count.count&lt;/code&gt; in our rails3 code, and at first were sure it must be a typo. The real story is more fun than that, see &lt;a href="http://nerdfeed.needfeed.com/blog/2011/12/count-count/"&gt;the nerdfeed blog&lt;/a&gt; for more.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-319602668458655756?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/319602668458655756/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=319602668458655756' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/319602668458655756'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/319602668458655756'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/12/countcount.html' title='count.count'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-9210296350440195051</id><published>2011-07-28T08:45:00.000-07:00</published><updated>2011-07-28T09:13:31.191-07:00</updated><title type='text'>Using active record in rails migrations</title><content type='html'>Most rails developers have probably sooner or later run into the problem: if your migrations refer to active record classes and the active record classes change out from under the migration, old migrations won't work as desired any more. 
Whether this is a big problem or a minor annoyance depends on how often you run migrations, how many databases you have (typically one for each developer and one or more you deploy to), etc., but I've seen the problem even over the course of three developer machines and a day or two, as some refactoring made people unable to update their code and then run an only-slightly-older migration.&lt;br /&gt;&lt;br /&gt;One solution, advocated in the "Data migrations" section of &lt;a href="http://robots.thoughtbot.com/post/8135270582/code-review-ruby-and-rails-idioms" &gt;Code review: Ruby and Rails idioms&lt;/a&gt;, is just to fall back to writing migrations in SQL, bypassing active record (with the exception of the low-level parts of active record which connect to the database). This has two problems. The first is that active record doesn't help you a lot with this kind of low-level SQL construction. The example in that blog post uses string interpolation to construct SQL, which they can get away with in that example (because the columns are integers) but which blows up as soon as the quoting isn't correctly handled (in a migration, this is probably just a bug rather than a security hole, but search "SQL injection" if you are unfamiliar with the problems). The second problem is that active record is simply a more expressive way to manipulate data. How many people use script/console rather than script/dbconsole to look around the database?&lt;br /&gt;&lt;br /&gt;My recommended solution, also advocated in &lt;a href="http://gem-session.com/2010/03/how-to-use-models-in-your-migrations-without-killing-kittens" &gt;How to use models in your migrations (without killing kittens)&lt;/a&gt;, is to define the classes within the migration. There's an example in that blog post, but the short summary is that if, for example, your migration wants to refer to Vendor, you put "class Vendor &lt; ActiveRecord::Base; end" within the migration class. 
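&lt;br /&gt;&lt;br /&gt;As a rough illustration of why the nested definition helps, here is a plain-Ruby sketch with no Rails involved. The names AddDefaultVendor, describe, and up are hypothetical stand-ins chosen for this sketch; in a real migration the nested class would inherit from ActiveRecord::Base as described above. All it tries to show is that a constant defined inside the migration class is found before the top-level one, so the migration keeps using its own frozen definition however the application's Vendor changes.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```ruby
# Top-level class standing in for the application model, which keeps evolving.
class Vendor
  def self.describe
    "application model"
  end
end

# Hypothetical migration class; a real one would inherit from
# ActiveRecord::Migration.
class AddDefaultVendor
  # Migration-local stand-in. Inside this class body the constant Vendor
  # resolves to this nested class first, not to the top-level Vendor.
  class Vendor
    def self.describe
      "migration-local model"
    end
  end

  def self.up
    Vendor.describe
  end
end

puts AddDefaultVendor.up
```
&lt;/pre&gt;Running the sketch prints "migration-local model", while code outside the migration class still sees the top-level Vendor.&lt;br /&gt;&lt;br /&gt;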
In some cases you might need to define a few has_many or belongs_to relationships (make sure to set class_name to refer to the migration-specific class), but the interesting (and surprising to me) thing is that I've found that in practice you don't need a whole lot of them. Just to give a few examples of what this gets you, think of things like calling find_or_create_by_name to skip creating a record if it already exists, or looking up an object by name and then using its ID in a subsequent SQL statement. If you are thinking "but I can do that in SQL", then I'm not sure I should try to convince you. But if you are thinking "yeah, that is easier / more-concise / more-readable in active record", then defining your classes in the migration gets you that expressiveness and also lets you run migrations even after your code has continued to evolve.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-9210296350440195051?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/9210296350440195051/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=9210296350440195051' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/9210296350440195051'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/9210296350440195051'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/07/using-active-record-in-rails-migrations.html' title='Using active record in rails migrations'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-4264009127069242806</id><published>2011-06-28T03:00:00.000-07:00</published><updated>2011-06-28T03:01:23.476-07:00</updated><title type='text'>Celebrate tau day with a few proofs</title><content type='html'>What better way to celebrate &lt;a href="http://tauday.com/" &gt;Tau Day&lt;/a&gt; than by trying your hand at writing a few proofs? Wikiproofs is a wiki which anyone can edit, and the goal is to build a library of proofs. They are written in a formal language, so the web site can check their correctness, and that is what makes it good for the tau day exercise: you get feedback about whether your proof is correct from the site as you go. The tau day exercises are intended to be of modest length and difficulty (so you can get them done on tau day) and explain everything you need to know about wikiproofs. Go to &lt;a href="http://wikiproofs.org/wiki/index.php?title=Help:Tau_day" &gt;Wikiproofs tau day&lt;/a&gt; to play.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-4264009127069242806?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/4264009127069242806/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=4264009127069242806' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4264009127069242806'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4264009127069242806'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/06/celebrate-tau-day-with-few-proofs.html' title='Celebrate tau day with a few 
proofs'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-2286671332945973028</id><published>2011-06-17T19:52:00.000-07:00</published><updated>2011-06-17T19:54:15.705-07:00</updated><title type='text'>Help me proofread tau day exercises</title><content type='html'>In honor of &lt;a href="http://tauday.com" &gt;tau day&lt;/a&gt; (a math holiday celebrated on June 28 every year), I have written some exercises at &lt;a href="http://wikiproofs.org/wiki/index.php?title=Help:Tau_day" &gt;wikiproofs.org tau day&lt;/a&gt;. If you have a little time to go to that page and try to go through the exercises, I'd appreciate your feedback on writing style, whether they are too easy or too hard, whether they were too long or too short, and any other suggestions. I'm hoping this will be a fun game/exercise for anyone interested in math but I am trying to make it accessible to someone with no experience in formal proofs.&lt;br /&gt;&lt;br /&gt;If you are here to read about programming, I intend to keep writing about that too, but the math proofs have been a good part of my hobby activities lately. 
It is kind of like programming anyway (processed by a computer, needs to follow pretty specific rules to work, can be addictive in some of the same ways).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-2286671332945973028?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/2286671332945973028/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=2286671332945973028' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2286671332945973028'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2286671332945973028'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/06/help-me-proofread-tau-day-exercises.html' title='Help me proofread tau day exercises'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-312414825167048931</id><published>2011-05-15T05:21:00.000-07:00</published><updated>2011-05-20T06:08:23.069-07:00</updated><title type='text'>Scripting math proofs with RHilbert</title><content type='html'>Formal math projects like &lt;a href="http://wikiproofs.org/"&gt;wikiproofs&lt;/a&gt; prove mathematical theorems in a way that a computer can verify. 
There could be several motivations for this, including finding/preventing errors in proofs, helping learners to understand a proof, or exploring the consequences of assuming a different set of axioms.&lt;br /&gt;&lt;br /&gt;Wikiproofs and related projects like metamath require that the person writing the proof spell it out pretty explicitly. For example, if you have 1 + 2 = 3 and you need 2 + 1 = 3, you'll need to explicitly transform one into the other. Other proof systems, like coq and isabelle, have a fairly powerful prover which can notice that you have 1 + 2 = 3 and a + b = b + a and combine those to prove 2 + 1 = 3.&lt;br /&gt;&lt;br /&gt;Enough background. What I've been playing with lately is a project I just started and which I am calling &lt;a href="http://github.com/jkingdon/hilbert"&gt;Hilbert&lt;/a&gt;. This is a marriage of a metamath-like proof engine (in this case &lt;a href="http://github.com/TheCount/hilbert-kernel"&gt;hilbert-kernel&lt;/a&gt;) and a generic scripting language (in this case Ruby). Writing a full prover in Ruby is of course one direction this could eventually head, but I was thinking more in terms of simpler kinds of automation (perhaps there could be a routine called "commute as needed" which would be able to turn "1 + 4 = 3 + 2" into "4 + 1 = 3 + 2" by noticing the left hand side needs to be flipped and the right hand side doesn't). I'm hoping this system will be easy for people who find Ruby more comfortable than coq or isabelle (I might count myself among them, although of course I reserve the right to learn coq or isabelle some day). 
It also may have other benefits, like making it easier to develop hilbert-kernel itself (for example by running hilbert-kernel tests).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Update 20 May 2011:&lt;/span&gt; this is under active development, but the change which needed an update above is that I renamed the project from RHilbert to hilbert.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-312414825167048931?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/312414825167048931/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=312414825167048931' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/312414825167048931'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/312414825167048931'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/05/scripting-math-proofs-with-rhilbert.html' title='Scripting math proofs with RHilbert'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-5842740604252818958</id><published>2011-03-22T13:40:00.000-07:00</published><updated>2011-03-22T13:54:14.575-07:00</updated><title type='text'>Ruby rescue gotcha</title><content type='html'>Today in the codebase at my day job, I found a particularly cute bug related to "rescue" in Ruby. 
This isn't a particularly unknown gotcha–I've read about it on the net at least once–but this is a particularly sweet (or devious) manifestation, and as far as I can tell it was purely accidental, not a contrived example.&lt;br /&gt;&lt;br /&gt;The following example uses rspec, but the key thing is &lt;code&gt;begin; 2 / 0 rescue NoMethodError; end&lt;/code&gt; versus &lt;code&gt;begin; 2 / 0; rescue NoMethodError; end&lt;/code&gt;–what is the difference between these two statements?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;require File.expand_path('spec_helper', File.dirname(__FILE__))&lt;br /&gt;&lt;br /&gt;describe "enigma" do&lt;br /&gt;  it "fails, but why" do&lt;br /&gt;    lambda { begin; 2 / 0 rescue NoMethodError; end }.&lt;br /&gt;      should raise_error(ZeroDivisionError)&lt;br /&gt;  end&lt;br /&gt;&lt;br /&gt;  it "passes, but why" do&lt;br /&gt;    lambda { begin; 2 / 0; rescue NoMethodError; end }.&lt;br /&gt;      should raise_error(ZeroDivisionError)&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I'll post the answer in a comment in a few days; feel free to post your answers as comments if you wish.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-5842740604252818958?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/5842740604252818958/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=5842740604252818958' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5842740604252818958'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5842740604252818958'/><link rel='alternate' type='text/html' 
href='http://jkingdon2000.blogspot.com/2011/03/ruby-rescue-gotcha.html' title='Ruby rescue gotcha'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-6888722825029331308</id><published>2011-03-03T17:48:00.000-08:00</published><updated>2011-03-03T17:54:21.452-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='versioning git levitation mediawiki'/><title type='text'>Levitation-perl and deleted files</title><content type='html'>Ever since I &lt;a href="http://jkingdon2000.blogspot.com/2011/01/tracking-mediawiki-wiki-in-git.html" &gt;started using levitation-perl&lt;/a&gt;, I was curious about how it handles deleted files. Well, I found out (the hard way). The symptom was that I did a routine merge and a lot of files showed up as conflicts (including ones which hadn't been edited, on the wiki or on the git side, for a long time). This is usually a symptom of git not being able to find a reasonable common ancestor.&lt;br /&gt;&lt;br /&gt;Turned out that any files which had been deleted in mediawiki caused what amounts to a rewrite of the git history (as if those files had never been created). 
This is not specific to mediawiki/levitation; the same kind of thing would happen in pure git if you ran git filter-branch or similar mechanisms to delete a file from git and make sure it was erased from the history (see for example &lt;a href="http://progit.org/book/ch6-4.html" &gt;Rewriting History&lt;/a&gt; or &lt;a href="http://lwn.net/Articles/421680/" &gt;The trouble with firmware&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;The consequence of the rewritten history for my merge was that the common ancestor was very old (prior to when the deleted file was first created, about two years ago in my case), rather than a few days old as would be the case without the history rewrite.&lt;br /&gt;&lt;br /&gt;How did I get my merge done? In my specific case, the content of the deleted files was nothing sensitive, so I was fine with having them remain in the git history. If they needed to stay gone from the history, I would probably have needed to follow RECOVERING FROM UPSTREAM REBASE in the git-rebase manpage.&lt;br /&gt;&lt;br /&gt;My solution was to create a merge commit whose parents are the two corresponding commits in the rewritten and non-rewritten history. Since the git repository I was using is public, you can &lt;a href="http://github.com/jkingdon/wikiproofs" &gt;follow along&lt;/a&gt;. The commands here are somewhat edited (I've snipped out my dead ends and multiple invocations of gitk to see what I had at each step).&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[browse gitk history to find that 681b7936 is one of the commits&lt;br /&gt;with message "add domain and Domain"]&lt;br /&gt;$ git checkout 681b793629d88729b919f40d0862884147db0d8d&lt;br /&gt;Note: checking out '681b793629d88729b919f40d0862884147db0d8d'.&lt;br /&gt;&lt;br /&gt;You are in 'detached HEAD' state. . . .&lt;br /&gt;HEAD is now at 681b793... 
add domain and Domain&lt;br /&gt;$ git checkout -b withdeletedfiles&lt;br /&gt;Switched to a new branch 'withdeletedfiles'&lt;br /&gt;$ gitk levitation/master&amp;&lt;br /&gt;[browse history to find that a6d0885... is the commit with message&lt;br /&gt;"add domain and Domain"]&lt;br /&gt;$ git checkout a6d0885&lt;br /&gt;Note: checking out 'a6d0885'.&lt;br /&gt;&lt;br /&gt;You are in 'detached HEAD' state. . . .&lt;br /&gt;&lt;br /&gt;HEAD is now at a6d0885... add domain and Domain&lt;br /&gt;$ git checkout -b withoutdeletedfiles&lt;br /&gt;Switched to a new branch 'withoutdeletedfiles'&lt;br /&gt;$ git merge -s ours withdeletedfiles&lt;br /&gt;Merge made by ours.&lt;br /&gt;$ git diff --stat -w withdeletedfiles withoutdeletedfiles&lt;br /&gt; Main/W/P/.3a/WP:INDEX          |    2 --&lt;br /&gt; Wikiproofs/S/u/b/Subject index |   30 ------------------------------&lt;br /&gt; 2 files changed, 0 insertions(+), 32 deletions(-)&lt;br /&gt;$ git checkout master&lt;br /&gt;Switched to branch 'master'&lt;br /&gt;$ git merge withoutdeletedfiles&lt;br /&gt;Removing Main/W/P/.3a/WP:INDEX&lt;br /&gt;Removing Wikiproofs/S/u/b/Subject index&lt;br /&gt;Merge made by recursive.&lt;br /&gt; Main/W/P/.3a/WP:INDEX          |    2 --&lt;br /&gt; Wikiproofs/S/u/b/Subject index |   30 ------------------------------&lt;br /&gt; 2 files changed, 0 insertions(+), 32 deletions(-)&lt;br /&gt; delete mode 100644 Main/W/P/.3a/WP:INDEX&lt;br /&gt; delete mode 100644 Wikiproofs/S/u/b/Subject index&lt;br /&gt;[1]+  Done                    gitk&lt;br /&gt;$ gitk&amp;&lt;br /&gt;[1] 3233&lt;br /&gt;$ git show 01a25538177dbe768e130aa94f7d49be11a63733&lt;br /&gt;commit 01a25538177dbe768e130aa94f7d49be11a63733&lt;br /&gt;Merge: 4ba316c e908dc8&lt;br /&gt;Author: Jim Kingdon &lt;jkingdon@localhost.localdomain&gt;&lt;br /&gt;Date:   Tue Mar 1 20:40:57 2011 -0500&lt;br /&gt;&lt;br /&gt;    Merge branch 'withoutdeletedfiles'&lt;br /&gt;&lt;br /&gt;$ git diff --stat -w e908dc8..master&lt;br /&gt; 
Interface/S/e/t/Set theory                         |   13 +-&lt;br /&gt; Main/O/u/t/Out lines                               |  311 +++++++++++&lt;br /&gt;[and the rest of the diffs I'd expect from the wiki to git]&lt;br /&gt; 22 files changed, 4626 insertions(+), 47 deletions(-)&lt;br /&gt;$ git diff --stat -w 4ba316c..master&lt;br /&gt; Main/W/P/.3a/WP:INDEX          |    2 --&lt;br /&gt; Wikiproofs/S/u/b/Subject index |   30 ------------------------------&lt;br /&gt; 2 files changed, 0 insertions(+), 32 deletions(-)&lt;br /&gt;$ git push&lt;br /&gt;$ &lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-6888722825029331308?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/6888722825029331308/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=6888722825029331308' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6888722825029331308'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6888722825029331308'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/03/levitation-perl-and-deleted-files.html' title='Levitation-perl and deleted files'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-4752328156215787797</id><published>2011-02-26T04:35:00.000-08:00</published><updated>2011-02-26T04:45:49.728-08:00</updated><title 
type='text'>Four hour agile subcommittee meeting</title><content type='html'>Don't know whether to file this under "&lt;a href="http://martinfowler.com/bliki/SemanticDiffusion.html" &gt;what happens when something becomes a buzzword&lt;/a&gt;", "people who don't get it", or "baby steps, baby steps", but today at standup (for an organization not traditionally agile, but whose avowed intention is to become agile), someone casually mentioned the words "four hour agile subcommittee meeting". It is one of the most wonderfully oxymoronic phrases I've heard in a while.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-4752328156215787797?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/4752328156215787797/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=4752328156215787797' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4752328156215787797'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4752328156215787797'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/02/four-hour-agile-subcommittee-meeting.html' title='Four hour agile subcommittee meeting'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3023833231243055539</id><published>2011-01-16T07:13:00.000-08:00</published><updated>2011-01-16T07:29:40.520-08:00</updated><title 
type='text'>Tracking a mediawiki wiki in git</title><content type='html'>There has been a lot of interest lately in git-backed wikis, like &lt;a href="https://github.com/github/gollum" &gt;gollum&lt;/a&gt;, &lt;a href="https://github.com/sr/git-wiki" &gt;git-wiki&lt;/a&gt;, or &lt;a href="http://gitit.johnmacfarlane.net/" &gt;gitit&lt;/a&gt;. It is appealing both to make it easy for casual editors to make a change (without cloning anything, installing git software, having to push, etc.) and to provide a git repository for things like browsing and editing the content offline, being able to merge changes, and the like. The appeal strikes me as particularly strong if the content of the wiki is software (as is envisaged for &lt;a href="http://www.wheatfarm.org/" &gt;wheat&lt;/a&gt;) or something software-like, such as mathematical proofs (as in &lt;a href="http://www.wikiproofs.org" &gt;wikiproofs&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;So a wiki which is backed by git is cool, but what if the wiki is using &lt;a href="http://mediawiki.org/" &gt;mediawiki&lt;/a&gt; (perhaps the most popular wiki software, in use at wikipedia and other sites)? One might want the various features and extensions of mediawiki, or one might be looking to git-ize a wiki which has been around since before this whole git-based wiki trend got going. One solution I've been playing with is &lt;a href="http://github.com/sbober/levitation-perl" &gt;levitation-perl&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;First of all, you get an XML dump file from the wiki (most mediawikis produce these once a day and make them available for download). Then you download levitation-perl and install perl and some packages from CPAN (as documented in levitation-perl's README and slightly elaborated at &lt;a href="http://github.com/jkingdon/levitation-perl" &gt;my fork&lt;/a&gt;). 
Then you can run levitation-perl (as described in the README, again) to import all the mediawiki changes into an empty git repository. Then you can push this git repository someplace public, clone it locally, and do all the good git stuff you are used to.&lt;br /&gt;&lt;br /&gt;At least for me, moving changes from git to the wiki is manual, although that is perhaps in part a function of how wikiproofs works–the changes are checked for correctness as you edit the pages on the wiki, and I haven't yet bothered to try to install the correctness-checking software locally.&lt;br /&gt;&lt;br /&gt;When the wiki changes and you want to get all those changes into git, get a new XML dump and run levitation-perl again. You don't need to run it in an empty git repository this time–I just run it in the git repository that I used for the previous levitation-perl run (on the other hand, when I want to check out files, I clone this repository elsewhere).&lt;br /&gt;&lt;br /&gt;Levitation-perl hasn't gotten much attention lately, I think due to technical problems using it for something as large as wikipedia (and perhaps policy questions of what to use it for on wikipedia). 
But none of that really affects my use of it on wikiproofs, because wikiproofs is way smaller than wikipedia, and because my goal is mostly just to have a way to work on wikiproofs when I don't have good internet connectivity.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3023833231243055539?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3023833231243055539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3023833231243055539' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3023833231243055539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3023833231243055539'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2011/01/tracking-mediawiki-wiki-in-git.html' title='Tracking a mediawiki wiki in git'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-1846170658196095907</id><published>2008-10-10T19:20:00.000-07:00</published><updated>2008-10-10T19:24:07.891-07:00</updated><title type='text'>Writing an Eclipse plug-in</title><content type='html'>I recently started playing around with writing an Eclipse plug-in, and I thought I should share some first impressions about how easy it was.&lt;br /&gt;&lt;br /&gt;The motive was to more easily play with &lt;a href="http://www.metamath.org/" &gt;metamath&lt;/a&gt;, a system to 
automatically verify (not write) mathematics proofs.  Unless I've missed something, there is little activity on development environments for math proof systems, but it seems to me the need for good tools (like eclipse) is at least as great for math proofs as for software.&lt;br /&gt;&lt;br /&gt;In all cases this was based on the Eclipse which ships with Fedora 8, although I'm not aware of anything Fedora-specific.  I started with the eclipse help files for "Plug-in development environment".  It was relatively easy to create a project seeded with one of the example plug-ins which ships with eclipse (in my case, first the hello world one, and then com.example.witheditor which seemed like the most relevant to the plug-in which I was trying to write, which at least at the start will be a few simple decorations on top of text editing, similar to an emacs major mode or one of the syntax coloring modules for vi, gedit, etc).  Generally, the help files walked me through all I needed to get started.  I was somewhat puzzled with "how do I get back to the Overview once it is closed" until I figured that opening the plugin.xml file gives you a specialized view, including Overview, with links to click and forms to edit.&lt;br /&gt;&lt;br /&gt;The fact that Fedora ships with the source code to the standard Eclipse classes that a plug-in needs to hook into, combined with the good Eclipse features for navigating Java code, made things really easy.  I was pleasantly surprised the first time I control-clicked on an Eclipse method and got not only the arguments I needed to pass in, but also the commented source code.  Within a few hours, I had turned the example plugin into something which was at least starting to understand metamath syntax.  Not bad considering that this includes a fair bit of experimenting (e.g. playing with foreground and background colors) and learning about the platform.  
I tend to find that Java and Eclipse make it easier to explore a large unfamiliar codebase, compared with a language like Ruby (where exploring large unfamiliar codebases, like Rails, has been a daily activity in my recent job), but I also give credit to the developers of the Eclipse plug-in system.  The example pointed me to the relevant parts of the plug-in libraries, and the well-commented source code of the libraries themselves helped me poke around to figure out what pieces would do what I wanted.  Another huge win is the way that the eclipse plug-in development environment just worked out of the box.  There was no messing around with CLASSPATHs, jars, ant and similar rigamarole: just go to the Overview page and click on "launch" and you are running.&lt;br /&gt;&lt;br /&gt;Error reporting was a problem: one of my first edits to the example passed a bad value to a constructor in a constant (this is the RGB constructor in IXMLColorConstants if anyone is following along).  There was a dialog box referring to an error log, but I have no idea where this error log might live (apparently not in my plug-in project).  
Ideally, the plug-in development environment would somehow have shown an exception with a stack trace, or something of the sort.&lt;br /&gt;&lt;br /&gt;My plug-in so far can be found at &lt;a href="http://github.com/jkingdon/mmclipse" &gt;mmclipse&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-1846170658196095907?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/1846170658196095907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=1846170658196095907' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/1846170658196095907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/1846170658196095907'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/10/writing-eclipse-plug-in.html' title='Writing an Eclipse plug-in'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-6567233926422424470</id><published>2008-08-28T22:07:00.000-07:00</published><updated>2008-08-28T22:08:49.453-07:00</updated><title type='text'>Simple Design and Testing conference, suburban Chicago, Sep 12-14</title><content type='html'>I probably should have mentioned this a while ago, but coming up in about two weeks is the 2008 &lt;a href="http://www.sdtconf.com" &gt;Simple Design and Testing&lt;/a&gt; conference.  
As with other open space conferences, the format is similar to Birds of a Feather (BoF) sessions and ranges from free-wheeling discussion to something slightly more closely approaching a presentation.  Topics typically center around things like agile development and object oriented design. The conference is free and registration is pretty simple - write a wiki page called a "position paper" saying what you want to learn and/or contribute.  Mine was &lt;a href="http://www.sdtconf.com/wiki/tiki-index.php?page=ErectorRubyRenderingLibrary" &gt;ErectorRubyRenderingLibrary&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-6567233926422424470?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/6567233926422424470/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=6567233926422424470' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6567233926422424470'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6567233926422424470'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/08/simple-design-and-testing-conference.html' title='Simple Design and Testing conference, suburban Chicago, Sep 12-14'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-2222868079500580087</id><published>2008-05-22T22:13:00.000-07:00</published><updated>2008-05-22T22:59:49.920-07:00</updated><title type='text'>Parse XML with saxophone</title><content type='html'>Programmers often want to parse an XML document and perform some actions as we do (for example, build up an in-memory data structure, write data to a database, print output to the console, etc).&lt;br /&gt;&lt;br /&gt;For the most part, there have only been two well-known ways of doing this.  The first is to read the XML document into a DOM, which is an in-memory tree representing the document.  Then you walk the DOM tree doing your own processing.  This is usually a pretty convenient way to go.  But it has several downsides, the most obvious and probably biggest of which is that if the XML document is too big to fit in memory it won't work.&lt;br /&gt;&lt;br /&gt;The second approach is a streaming API such as &lt;a href="http://www.saxproject.org" &gt;SAX&lt;/a&gt; or &lt;a href="http://www.xmlpull.org/" &gt;xmlpull&lt;/a&gt;.  SAX calls you whereas you call xmlpull, but either way you are getting a low-level stream of events (start tag, end tag, and text being the most important).  For simple tasks, this isn't so bad.  But when you have to assemble data from a few different parts of the document, you need to set variables to keep track of what you've gotten and what you are waiting for, and it is possible to make yourself a pretty big pile of spaghetti.  My co-workers and I had such a problem recently, and our answer was a small library, which we call &lt;a href="http://pivotalrb.rubyforge.org/svn/saxophone/trunk" &gt;saxophone&lt;/a&gt;.  Saxophone sits on top of SAX, and converts the raw stream of events into calls to handlers, and lets the handlers return data which get passed to other handlers.  
The current implementation is in ruby, although the ideas should port to other languages if anyone has such a need.&lt;br /&gt;&lt;br /&gt;Here's a simple example.  We want to parse a web feed and print out each of the titles.  That is, there will be, buried somewhere in the XML, something like &lt;tt&gt;&amp;lt;title&amp;gt;My Dog has Fleas&amp;lt;/title&amp;gt;&lt;/tt&gt; and we want to print out "My Dog has Fleas" and likewise for every other occurrence of a title tag.&lt;br /&gt;&lt;br /&gt;The full example can be found in the examples directory of saxophone, but the key part is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;parser = Saxophone.new(&lt;br /&gt;  :title =&gt; lambda { | element | puts element.all_text() }&lt;br /&gt;)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is saying that any time you see a "title" tag, call this handler, and provide it all the text directly under title via the all_text() method.&lt;br /&gt;&lt;br /&gt;This example doesn't demonstrate everything.  Handlers can return values, which are then available to higher handlers.  Handlers also have access to attributes directly.&lt;br /&gt;&lt;br /&gt;If people want to hear more about it, I can write some more documentation and examples for some of these other features.  But the key thing here is that we have a fairly concise way to (a) match the parts of the XML that we care about, and (b) synthesize the results of those matches into larger data structures (but only as far as we want - after we have gotten everything we need to, for example, write a database row, we can return that memory to the system).  This is all done in a streaming fashion.  That is, we don't need to store the whole document, or the whole result, in memory.&lt;br /&gt;&lt;br /&gt;The whole thing kind of reminds me of XSLT or XQuery.  I know that in XSLT for sure, and probably XQuery, there are some tasks that turn out to be really awkward, and that's probably true of saxophone as well.  
But saxophone seems like a good match for a few of the problems we've had.  And of course, having a library within your regular programming language (in this case Ruby) can also be a plus over a special-purpose language.&lt;br /&gt;&lt;br /&gt;Saxophone is free software, available &lt;a href="http://pivotalrb.rubyforge.org/svn/saxophone/trunk" &gt;here&lt;/a&gt; as part of the &lt;a href="http://pivotalrb.rubyforge.org/" &gt;pivotalrb&lt;/a&gt; project on rubyforge, which contains a variety of open source code released by the ruby/java consulting firm &lt;a href="http://pivotallabs.com" &gt;Pivotal&lt;/a&gt;.  My pair and I wrote saxophone as part of a Pivotal project.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-2222868079500580087?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/2222868079500580087/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=2222868079500580087' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2222868079500580087'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2222868079500580087'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/05/parse-xml-with-saxophone.html' title='Parse XML with saxophone'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3450514672347878196</id><published>2008-04-03T07:25:00.000-07:00</published><updated>2008-04-03T07:59:26.620-07:00</updated><title type='text'>Draft of Erector talk</title><content type='html'>I'm giving a talk on &lt;a href="http://erector.rubyforge.org/" &gt;erector&lt;/a&gt; in about a week at the &lt;a href="http://www.dcrug.org/" &gt;Washington, DC Ruby User's Group&lt;/a&gt;.  Here's a draft of my talk; please provide feedback so I can improve the talk (for example, as comments at the bottom of this page).&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Ruby's flexible and minimal syntax makes it well-suited to writing libraries for various tasks&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Erector applies that to HTML (or XML) rendering&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;(start up erector in script/console and go through the following examples)&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;class Foo &lt; Erector::Widget&lt;br /&gt;  def render&lt;br /&gt;    html&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;Foo.new().to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  html&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  p "foo"&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  text "hello"&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  p do&lt;br /&gt;    text "hello"&lt;br /&gt;    b "world"&lt;br /&gt;  end&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  a :href =&gt; "a.html" do&lt;br /&gt;    text "world"&lt;br /&gt;  end&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() 
do&lt;br /&gt;  a "world", :href =&gt; "a.html"&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Erector::Widget.new() do&lt;br /&gt;  text '&lt;&gt;&amp;'&lt;br /&gt;end.to_s&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Hooking erector to rails:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;class WelcomeController &lt;&lt;br /&gt;  ApplicationController&lt;br /&gt;&lt;br /&gt;  def index&lt;br /&gt;    render :text =&gt;&lt;br /&gt;      Views::Welcome::Show.new().to_s&lt;br /&gt;  end&lt;br /&gt;&lt;br /&gt;end&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Structuring views with the usual Ruby techniques: especially inheritance and methods&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Responsibilities of a view mechanism:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;quote output&lt;/li&gt;&lt;br /&gt;&lt;br /&gt;&lt;li&gt;balance start/end tags&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Comparison with ERB and Markaby, quoting:&lt;br /&gt;&lt;br /&gt;ERB: no, must call h&lt;br /&gt;&lt;br /&gt;Markaby: for strings, not blocks&lt;br /&gt;&lt;br /&gt;Erector: yes*&lt;br /&gt;&lt;br /&gt;* except when you call raw&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Comparison with ERB and Markaby, tag balancing:&lt;br /&gt;&lt;br /&gt;ERB: no&lt;br /&gt;&lt;br /&gt;Markaby: yes&lt;br /&gt;&lt;br /&gt;Erector: yes*&lt;br /&gt;&lt;br /&gt;* except when you call raw&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Comparison with ERB and Markaby, subpages:&lt;br /&gt;&lt;br /&gt;ERB: partials, helpers&lt;br /&gt;&lt;br /&gt;Markaby: partials, helpers&lt;br /&gt;&lt;br /&gt;Erector: inheritance, methods&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;Similar libraries in other languages:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Java: &lt;a href="http://webiyo.sourceforge.net" &gt;webiyo.sf.net&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;If there is time:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Calling helpers from 
erector&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Writing helpers in erector&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3450514672347878196?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3450514672347878196/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3450514672347878196' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3450514672347878196'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3450514672347878196'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/04/draft-of-erector-talk.html' title='Draft of Erector talk'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-9221053214492987108</id><published>2008-02-20T18:06:00.000-08:00</published><updated>2008-02-20T18:20:22.377-08:00</updated><title type='text'>Hello world type example for jruby and mayfly</title><content type='html'>Given all the interest in the ruby programming language, it is natural to ask whether rails applications (or ruby applications more generally) could write their tests with mayfly.&lt;br /&gt;&lt;br /&gt;As a first step, I figured out how to run mayfly under &lt;a href="http://jruby.org" &gt;jruby&lt;/a&gt;.  Here's what I did.&lt;br /&gt;&lt;br /&gt;First, I installed JRuby 1.1 RC 2 as directed.  
Then, I put the mayfly 0.3 jars in $JRUBY_HOME/lib (there are other ways to tell jruby where to look for them, but this seemed like the simplest).&lt;br /&gt;&lt;br /&gt;Then, I put the following in hellodb.rb:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;include Java&lt;br /&gt;import 'net.sourceforge.mayfly.Database'&lt;br /&gt;&lt;br /&gt;d = Database.new()&lt;br /&gt;d.execute("create table foo(x integer)")&lt;br /&gt;d.execute("insert into foo(x) values(5)")&lt;br /&gt;puts d.rowCount("foo")&lt;br /&gt;d.execute("insert into foo(x) values(6)")&lt;br /&gt;puts d.rowCount("foo")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Then running jruby will invoke mayfly:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ jruby hellodb.rb&lt;br /&gt;1&lt;br /&gt;2&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The next step would be to figure out how to get Active Record talking to Mayfly.&lt;br /&gt;Haven't tried that yet.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-9221053214492987108?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/9221053214492987108/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=9221053214492987108' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/9221053214492987108'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/9221053214492987108'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/02/hello-world-type-example-for-jruby-and.html' title='Hello world type example for jruby and mayfly'/><author><name>Jim 
Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-4303692474029222169</id><published>2008-02-11T21:01:00.001-08:00</published><updated>2008-02-11T21:03:35.228-08:00</updated><title type='text'>Profiling with gprof (success on a short test program)</title><content type='html'>When last we discussed &lt;a href="http://jkingdon2000.blogspot.com/2007/07/profiling-mayfly.html" &gt;profiling Mayfly&lt;/a&gt;, I was profiling with &lt;a href="http://jiprof.sourceforge.net/"&gt;JIP&lt;/a&gt;. Brian Slesinsky, in a comment to that article, told me that he has found that JIP has a per-method cost which shows up in the profiling data (so that method calls appear to be more expensive than they really are).  He suggested the NetBeans profiler.&lt;br /&gt;&lt;br /&gt;Well, I was looking into how to install and use NetBeans (NetBeans, unlike Eclipse, does not ship with Fedora), and hadn't gotten much of anywhere until I had a lot of time (on a flight), and was left with seeing whether I could get anywhere with the tools which I had already installed.  That means &lt;a href="http://gcc.gnu.org/java/" &gt;gcj&lt;/a&gt; and &lt;a href="http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html" &gt;gprof&lt;/a&gt;. 
I got gprof working fine on a short test program (where it correctly identified the bottleneck), but didn't (yet) succeed in running it on mayfly.&lt;br /&gt;&lt;br /&gt;I suppose if people would find them helpful, I could upload my test programs and build scripts, but basically they boiled down to:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;gcj --main=Profilee -pg Profilee.java&lt;br /&gt;./a.out&lt;br /&gt;gprof &gt;profiler&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Getting something like this invoked from ant is largely a boring but more or less straightforward matter (although it wasn't clear to me how the classpath relates to the class and java files specified on the command line).  But at least one of my invocations led to a runaway linker which made my machine swap for quite a while before I finally gave up.  So although it is premature to declare victory on this just yet, I did want to report on my success with the test program.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-4303692474029222169?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/4303692474029222169/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=4303692474029222169' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4303692474029222169'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4303692474029222169'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2008/02/profiling-with-gprof-success-on-short.html' title='Profiling with gprof (success on a short test program)'/><author><name>Jim 
Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-2135270927250942102</id><published>2007-12-13T13:38:00.000-08:00</published><updated>2007-12-13T14:03:09.982-08:00</updated><title type='text'>Avoid needless generality</title><content type='html'>I've been asking people what Mayfly features would be most useful to them.  I suppose I shouldn't be surprised that one of the answers I got was roughly "we use the syntax CONCAT(a, b, c) for string concatenation in MySQL and it would be nice if Mayfly had it instead of making us write it as a || b || c".  This fits right into the theory that often the most important features are the silliest (as I already discussed regarding &lt;a href="http://jkingdon2000.blogspot.com/2007/02/mayfly-starts-to-get-options-sooner.html" &gt;case sensitive table names&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Anyway, having decided to work on CONCAT, my first instinct was to provide hypersonic-compatible stored procedures, which allow the user to define their own CONCAT (well, the two argument version anyway; I'm not sure hypersonic has stored procedures with variable numbers of arguments).  I started implementing stored procedures, and it became clear that there were plenty of corner cases (picking which overloaded method to call, error handling, type coercion, and of course the variable number of arguments issue).&lt;br /&gt;&lt;br /&gt;So not only did it seem easier to just add a CONCAT built-in, it is closer to what the user really wants.  Although defining a concatenate method and registering it with a CREATE ALIAS command isn't hard, there's no particularly compelling reason to make people do this.  
Figuring out where to put the CREATE ALIAS invocation may also be more of a pain than it sounds, in a system like Mayfly with no static state shared between objects.&lt;br /&gt;&lt;br /&gt;There's a good chance I (or some other Mayfly contributor) will eventually get around to implementing the general stored procedures (this feature can be a handy way to make SQL code more portable between Mayfly and non-Mayfly databases, in cases where Mayfly doesn't yet have all the built-in functions that the other database has).  But jumping right to the general feature runs a risk of either (a) designing a feature which is supposed to be general, but which isn't really suited for any use case other than the original specific one, which could have been implemented in a way which is much more straightforward for the programmer and the user, and/or (b) designing a general feature which is too complex, too slow, takes too much time to implement, etc, relative to what is really needed.  So there's a lot to be said for special-purpose solutions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-2135270927250942102?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/2135270927250942102/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=2135270927250942102' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2135270927250942102'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2135270927250942102'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/12/avoid-needless-generality.html' title='Avoid needless generality'/><author><name>Jim 
Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-366371760187542783</id><published>2007-12-03T17:59:00.000-08:00</published><updated>2007-12-03T18:34:16.537-08:00</updated><title type='text'>Simple Design and Testing Conference</title><content type='html'>Had the chance to talk about Mayfly at the 2007 &lt;a href="http://sdtconf.com/" &gt;Simple Design and Testing Conference&lt;/a&gt; last weekend.  As with other open space conferences I've been to, you don't know quite how a session is going to turn out, but my &lt;a href="http://mayfly.sourceforge.net/sdtconf2007/" &gt;notes&lt;/a&gt; include the sample code which I showed people.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-366371760187542783?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/366371760187542783/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=366371760187542783' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/366371760187542783'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/366371760187542783'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/12/simple-design-and-testing-conference.html' title='Simple Design and Testing Conference'/><author><name>Jim 
Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3438919448692648890</id><published>2007-09-18T15:11:00.000-07:00</published><updated>2007-09-18T12:24:14.400-07:00</updated><title type='text'>Hibernate Annotations and You Ain't Gonna Need It</title><content type='html'>In the last post I showed a Hibernate &lt;a href="http://mayfly.sourceforge.net/hibernate-demo/" &gt;demo&lt;/a&gt;.  I wrote it to show how to hook Mayfly to Hibernate, but it also was my first experience with Hibernate Annotations.  I have long been hoping that Hibernate Annotations would ease the pain of hibernate, specifically by eliminating the need to keep looking back and forth between XML mapping files, Java code, and database schemas.&lt;br /&gt;&lt;br /&gt;Basically, my experience to date is quite positive.  The key thing to note about the demo is all the things which aren't there.  &lt;br /&gt;&lt;br /&gt;This starts with configuration.  There is no hibernate.cfg.xml file, no database username and password, no JDBC URL and driver (I guess some of those would come back if we were talking to a database like MySQL as opposed to Mayfly, but the hibernate.cfg.xml wouldn't).&lt;br /&gt;&lt;br /&gt;Next, and more importantly, look at Foo.  Here we just need an instance variable for each database column, an annotation @Entity at the start, and an annotation @Id on the primary key.  Since the instance variables are named the same as the columns, we don't need to specify a mapping between the two.    There's no XML mapping file.  I also don't bother with getters and setters. 
I can always add them later (that's why refactoring browsers have "encapsulate field"), and using them only where they are providing some value avoids a huge amount of clutter.  (Or to put it more provocatively, "You Ain't Gonna Need Getters and Setters").&lt;br /&gt;&lt;br /&gt;Just getting everything into Foo.java, rather than splitting it between Foo.java and Foo.hbm.xml, is a huge win.  You can't just control-click over to the XML file (in a browser like Eclipse), and even if you could you'd be looking back and forth rather than just having everything in one place as with annotations.&lt;br /&gt;&lt;br /&gt;So positive impressions confirmed.  Definitely plan to give annotations a try on a larger project next time I have a choice.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3438919448692648890?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3438919448692648890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3438919448692648890' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3438919448692648890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3438919448692648890'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/09/hibernate-annotations-and-you-aint.html' title='Hibernate Annotations and You Ain&apos;t Gonna Need It'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-7144882884961446002</id><published>2007-09-18T14:55:00.000-07:00</published><updated>2007-09-18T11:57:52.394-07:00</updated><title type='text'>Demo of Mayfly and Hibernate</title><content type='html'>In April I &lt;a href="http://jkingdon2000.blogspot.com/2007/04/mayfly-and-hibernate.html" &gt;wrote&lt;/a&gt; a bit about hooking Mayfly to Hibernate.  Well, I finally got around to writing a &lt;a href="http://mayfly.sourceforge.net/hibernate-demo/" &gt;demo&lt;/a&gt; showing how to hook it all up in a small self-contained example.&lt;br /&gt;&lt;br /&gt;The demo is only a few dozen lines of code, but instead of repeating the whole thing here, I'll point out some interesting bits.  Each test creates its own Database object, so there is no shared state to clear out between tests.  The test then calls openConnection on Database to get a JDBC connection, which the FooPersistence class then passes to Hibernate's openSession method.&lt;br /&gt;&lt;br /&gt;Here I configure Hibernate by instantiating an AnnotationConfiguration, telling it the dialect (Hibernate can't figure this out from the connection, because it needs the dialect before the openSession call), and registering Foo as an entity.  If you configure Hibernate via XML mapping files, that would work too.&lt;br /&gt;&lt;br /&gt;That's pretty much all that is needed.  
The dialect file is currently part of the demo, so copy it from there.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-7144882884961446002?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/7144882884961446002/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=7144882884961446002' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7144882884961446002'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7144882884961446002'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/09/demo-of-mayfly-and-hibernate.html' title='Demo of Mayfly and Hibernate'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-7519486802256334253</id><published>2007-07-30T07:27:00.000-07:00</published><updated>2007-07-30T08:30:21.520-07:00</updated><title type='text'>Profiling mayfly</title><content type='html'>One of the key goals for an in-memory database for unit testing is that it should be fast.  And mayfly is faster than the other databases I've tried for some things, like creating or altering tables.  However, it is some 3x slower than &lt;a href="http://www.hsqldb.org/" &gt;hypersonic&lt;/a&gt; for inserting rows ("insert into foo(a,b) values(3,4)" kinds of commands).  
I've been meaning to profile it for a long time now, but I finally got around to it.&lt;br /&gt;&lt;br /&gt;First I added the &lt;a href="http://jiprof.sourceforge.net" &gt;JIP&lt;/a&gt; jar to the project and added a profile-test target to build.xml.  Because JIP doesn't work with &lt;a href="http://gcc.gnu.org/java/" &gt;gcj&lt;/a&gt;, I ran it under Sun java.  JIP writes a text file report, and I worked off that.&lt;br /&gt;&lt;br /&gt;The results surprised me.  They showed that 70% of the time was being consumed in the lexer and parser.  Conventional wisdom is that lexing and parsing just isn't your big bottleneck in a compiler these days, but perhaps that is more for an optimizing compiler which spends more time in code generation and optimization passes.  I inlined a few short methods which were bottlenecks (at the cost of a small amount of code duplication, but not so much as to be really shocking, and remember I was careful to only do this for the bottlenecks).  This was also a surprise: that method invocation seemed to be such a cost.  Given my limited knowledge of java internals, that sort of makes sense, but it kind of seems like a step backwards, in the sense that in the C/Pascal/etc days we made so much effort to make method invocation fast, and tell people they didn't need to make their code ugly to avoid method calls.  (At the risk of belaboring the obvious, even if method calls are expensive you still don't need to make your code (very) ugly to avoid them: it is only a handful of invocations which are actually going to make a difference in your run-time, and the profile shows you which ones).&lt;br /&gt;&lt;br /&gt;I was also able to streamline the non-parsing part of the code, mostly by taking out some extra steps (for example, transforming a column name to a Column and back to a name more times than needed).  Some of that had built up through a series of changes which had left in vestiges of previous ways of doing things.  
So cleaning this up left the code simpler and clearer, as well as faster.&lt;br /&gt;&lt;br /&gt;Other changes, like changing Row to be a HashMap rather than a List, didn't seem to help at all (or even hurt slightly).  It has been conceptually a map for some time now, but apparently those linear searches were not particularly more expensive than the many calls to hashCode you get with the map.  I guess the fact that we don't expect more than a few dozen columns in a table is responsible.&lt;br /&gt;&lt;br /&gt;So what is next when I look at this again?  For the lexer, I may have run out of obvious ideas (given that it has duties like tracking the line and column numbers of every token, and I don't see giving up that feature, which provides good error messages).  For the parser, there is a lot of expression handling machinery that is involved in parsing the "3" in "insert into foo(a) values(3)".  Unless I think of a better way, having the top-level expression parser look for a literal followed by something like "," or ")", and going into a fast-path special case might be worth it.  I know that looks like a kluged-up wannabe bottom up parser, but I've been happy enough with &lt;a href="http://jkingdon2000.blogspot.com/2007/02/recursive-descent-parsers.html" &gt;recursive descent&lt;/a&gt; in other ways, that I have trouble seeing switching back to a parser generator.  As for the execution (building up rows, modifying the tables, etc), I'd have to look at the profile more.  
Although I've seen some hot spots and fixed them (and perhaps created others), I don't have as much of an intuitive feel for what is slow here as I do for the lexing and parsing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-7519486802256334253?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/7519486802256334253/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=7519486802256334253' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7519486802256334253'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7519486802256334253'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/07/profiling-mayfly.html' title='Profiling mayfly'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-7885845151727954253</id><published>2007-05-29T12:02:00.000-07:00</published><updated>2007-05-29T13:42:33.350-07:00</updated><title type='text'>Database upgrades: SQL versus code</title><content type='html'>When last I &lt;a href="http://jkingdon2000.blogspot.com/2006/12/database-upgrades.html" &gt;wrote&lt;/a&gt; about database upgrades, one of the design decisions was whether to have a database upgrade be a hunk of code (as in rails and other systems I've worked with), or an SQL script (as in the non-automated upgrade scripts that were checked in to MIFOS prior to 
December's work).&lt;br /&gt;&lt;br /&gt;For the first cut (last December), I went with SQL, on the theory that (1) it might be easier for people to understand, especially sysadmins and others who wouldn't necessarily read much of the Java code, (2) many of the cases which came to mind, such as adding a column or adding a table, could be done this way, and (3) if the automated upgrade runs into trouble, it may be easier to run some scripts one at a time, perhaps with changes, rather than trying to mess around with Java (again, for some people).  But I did have in mind, even then, that I might need to add code upgrades at some point.&lt;br /&gt;&lt;br /&gt;Well, we found a &lt;a href="http://wiki.java.net/bin/view/Javatools/LookupValueOverwriting" &gt;case&lt;/a&gt; where the SQL scripts don't work.  The MIFOS database contains tables which store things like strings which are displayed to the user.  MIFOS ships with a whole bunch of these ("loan", "client", etc), but such strings can also be added by the microfinance institution.  In adding them (at least the way our database is currently set up), one needs to assign at least one ID which is not referenced directly from the Java code, but which also is referenced elsewhere in the database.  Although there are variants of SQL which have variables and the like (PL/SQL, PL/pgSQL, etc), I don't think MySQL has those kinds of extensions (and trying to turn SQL into a procedural language is somewhat awkward anyway).  So the solution will be to implement Java upgrades.  I have in mind keeping the ability to do SQL upgrades (that is, each upgrade is either a Java upgrade or an SQL one).  That is largely to ease the transition (we have about 19 upgrades already, and won't need to convert them over all at once).  
We'll also see whether writing upgrades in SQL, in those cases where it is possible, ends up being appealing or just a source of confusion.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-7885845151727954253?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/7885845151727954253/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=7885845151727954253' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7885845151727954253'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7885845151727954253'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/05/database-upgrades-sql-versus-code.html' title='Database upgrades: SQL versus code'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3840019594432942609</id><published>2007-04-26T17:00:00.000-07:00</published><updated>2007-04-26T14:09:13.385-07:00</updated><title type='text'>Test speedups</title><content type='html'>The MIFOS tests (ApplicationTestSuite, or pretty much all the junit tests) run in 2600 seconds on my laptop.  When you are checking in several times a day, that adds up to a lot of time spent waiting for tests.  
It also discourages good habits like running the tests frequently to find problems early, checking in frequently, certain kinds of experiments (for example "if I clean up this ugly code, will anything break?") and the like.&lt;br /&gt;&lt;br /&gt;There are a few reasons for this.  First of all, I'll agree with something that Eric Du or Li Mo (I forget which) said a month or so ago, which was that too many of our tests are integration tests (test many classes) as opposed to unit tests (test a small number of classes).  This is very much a case-by-case thing, so I guess I'll just mention the case which came up today: I was testing generateId in AccountBO.  And it turns out there is no need to access the database to test this method (see AccountBOTest).&lt;br /&gt;&lt;br /&gt;I've also known for a long time that many of the tests get caught in a familiar trap of writing many objects to the database (I need a client, and a client needs a group, and a group needs a center....).  Or of creating objects via the database when creating them in-memory would work just fine, be faster, and avoid problems with getting rid of them at the end of the test.&lt;br /&gt;&lt;br /&gt;But I was surprised at the speedups around getUserContext().  Someone (sorry, I tried finding out who in the archives but I didn't find it) posted some numbers to the MIFOS mailing list saying that replacing TestObjectFactory#getUserContext (which involves several database calls) with something faster (TestUtils#makeUser() is the usual choice) cut the run-time of a certain test in half (or something - the number varies from one test to the next and once I convinced myself that the speedup is significant I haven't really been measuring things further).&lt;br /&gt;&lt;br /&gt;Unfortunately, globally changing getUserContext to makeUser doesn't quite work - some tests fail that way.  
But one of my projects lately has been going through tests and changing all those that can be changed.&lt;br /&gt;&lt;br /&gt;Getting tests to run fast can take work (especially if you don't have the luxury of writing fast tests from the start of a project).  But tests that run slowly don't tend to get run.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3840019594432942609?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3840019594432942609/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3840019594432942609' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3840019594432942609'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3840019594432942609'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/04/test-speedups.html' title='Test speedups'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-650994499726590731</id><published>2007-04-03T18:35:00.000-07:00</published><updated>2007-04-03T15:36:01.753-07:00</updated><title type='text'>Mayfly and Hibernate</title><content type='html'>On MIFOS, we are successfully using Mayfly and Hibernate together, but there are some catches (and some future work - for me and/or other volunteers - in terms of making this work better).&lt;br /&gt;&lt;br /&gt;First of all, MIFOS is currently on Hibernate 
3.0beta4.  I suspect later versions of Hibernate work too, but it would be nice to download Mayfly and Hibernate and try it out on some kind of "hello world" situation.&lt;br /&gt;&lt;br /&gt;Next, there's the Hibernate dialect.  The one we are using in MIFOS is checked in to MIFOS as &lt;a href="http://fisheye4.cenqua.com/browse/mifos/trunk/mifos/test/org/mifos/framework/util/helpers/MayflyDialect.java?r=HEAD" &gt;MayflyDialect&lt;/a&gt;.  It would be nice to submit this to Hibernate as a patch.  I have been meaning to do this, but just haven't gotten around to it.  For a while, I thought it might be changing frequently, but that hasn't been true lately.&lt;br /&gt;&lt;br /&gt;Anyway, enough of the boring stuff.  The interesting part is whether we can hook up the features of Mayfly which distinguish it from Hypersonic and the rest.  For example, let's take the feature of wanting to give each test a fresh database.  Suppose we have a static DataStore which contains all the tables and perhaps some data which all tests should start with (in MIFOS, getStandardStore() in &lt;a href="http://fisheye4.cenqua.com/browse/mifos/trunk/mifos/test/org/mifos/framework/util/helpers/DatabaseSetup.java?r=HEAD" &gt;DatabaseSetup&lt;/a&gt; ).  Now, in Hibernate one typically creates a SessionFactory once (not on every test, as it is expensive to make one), and the SessionFactory has the JDBC URL built in.  So how do we give a new Database for each test while being able to re-use the SessionFactory?  Well, what I've done so far is open my Session with the SessionFactory#openSession(Connection connection) method.  I'm probably best off just pointing to an example: testGetAllParameters in &lt;a href="http://fisheye4.cenqua.com/browse/mifos/trunk/mifos/test/org/mifos/application/reports/persistence/ReportsPersistenceTest.java?r=HEAD" &gt;ReportsPersistenceTest&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;So, anyone found better techniques?  
This is a good subject for collaboration, not just because it is a way to share the work, but also because everyone's way of setting these things up tends to be slightly different.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-650994499726590731?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/650994499726590731/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=650994499726590731' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/650994499726590731'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/650994499726590731'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/04/mayfly-and-hibernate.html' title='Mayfly and Hibernate'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-644244202814622765</id><published>2007-03-15T17:49:00.000-07:00</published><updated>2007-03-15T14:49:16.086-07:00</updated><title type='text'>Web security: avoiding HTML injection</title><content type='html'>A shockingly high percentage of web applications have various kinds of security holes (or just bugs), and one of the biggest causes is failing to quote strings before putting them into an HTML page.  
See for example &lt;a href="http://lwn.net/Articles/179569/" &gt;LWN: Cross-site scripting attacks&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Most people have now figured out this issue to the point of realizing that your application needs to quote text as it outputs it and providing a function to escape HTML, such as the PHP &lt;a href="http://www.php.net/htmlspecialchars" &gt;htmlspecialchars&lt;/a&gt; or any number of locally written versions such as the MIFOS xmlEscape in &lt;a href="https://mifos.dev.java.net/source/browse/*checkout*/mifos/trunk/mifos/src/org/mifos/framework/struts/tags/MifosTagUtils.java" &gt;MifosTagUtils&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;However, requiring people to remember to call it is error prone.  A better approach is to treat HTML as one thing and strings as another, so that inserting a string into an HTML document quotes it automatically.  A few template systems do this (&lt;a href="http://tinytemplate.org/" &gt;tinytemplate&lt;/a&gt;, &lt;a href="http://amrita.sourceforge.jp/" &gt;Amrita&lt;/a&gt;, probably a few others).  Most HTML-generating libraries (&lt;a href="http://builder.rubyforge.org/" &gt;builder&lt;/a&gt;, DOM, &lt;a href="http://java.sun.com/webservices/docs/1.5/api/javax/xml/stream/XMLStreamWriter.html" &gt;XMLStreamWriter&lt;/a&gt;, etc) do it.  If you are using an older template system, like JSP, ASP, PHP, etc, start thinking about how to migrate to something which is secure by default.  
Here's an &lt;a href="http://slesinsky.org/brian/code/hello_rails.html" &gt;article&lt;/a&gt; about these issues in Rails (recommending builder instead of rhtml).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-644244202814622765?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/644244202814622765/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=644244202814622765' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/644244202814622765'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/644244202814622765'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/03/web-security-avoiding-html-injection.html' title='Web security: avoiding HTML injection'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-5222059108631995166</id><published>2007-02-14T10:41:00.000-08:00</published><updated>2007-02-14T11:40:08.417-08:00</updated><title type='text'>Recursive descent parsers</title><content type='html'>Many of us have occasional, rather than frequent, need for parsing a complicated text format (such as a programming language - I'm talking about things more complicated than, say, XML or comma separated values).  
So often the first step is trying to remember how parser generators work (a parser generator being a tool which takes a grammar and produces a parser, such as &lt;a href="http://dickey.his.com/byacc/byacc.html" &gt;byacc&lt;/a&gt;, &lt;a href="http://www.antlr.org/" &gt;ANTLR&lt;/a&gt;, or &lt;a href="http://www.sablecc.org/" &gt;SableCC&lt;/a&gt;).  The latest one through this process is &lt;a href="http://martinfowler.com/bliki/HelloSablecc.html" &gt;Martin Fowler&lt;/a&gt;.  Through most of my career I've occasionally struggled with ANTLR and yacc, and only recently have come to the conclusion that I'm best off where many of us started: with a hand-written recursive descent parser.&lt;br /&gt;&lt;br /&gt;If I lose you with any of the parser jargon, you might need to look at the &lt;a href="http://en.wikipedia.org/wiki/Compilers:_Principles%2C_Techniques%2C_and_Tools" &gt;Dragon Book&lt;/a&gt;, but I'm trying to keep it to a minimum.&lt;br /&gt;&lt;br /&gt;Anyway, recursive descent is good because:&lt;br /&gt;&lt;br /&gt;* You can accomplish everything in your Integrated Development Environment, or pre-existing ant build file.  Parser generators, as with other kinds of generated code, impose extra steps every time you modify the grammar (extra manual steps and/or extra build scripting).&lt;br /&gt;&lt;br /&gt;* Can accept a wide range of grammars.  The only catch is that you need to eliminate &lt;a href="http://en.wikipedia.org/wiki/Left_recursion" &gt;left recursion&lt;/a&gt;, but in my experience the solution of a while or do-while loop ends up being even more elegant than the left-recursive grammar you started with.&lt;br /&gt;&lt;br /&gt;* Very nice to unit test.  Your unit test can call into any production of the grammar.  
With most parser generators, you can only easily call the top-level production, and then you need to dig around in the tree it returns to find what you are supposed to assert on.&lt;br /&gt;&lt;br /&gt;* Easy to understand and debug.&lt;br /&gt;&lt;br /&gt;* Gives you complete flexibility to decide how you want to handle trees/actions/etc (often, building up &lt;a href="http://martinfowler.com/eaaCatalog/domainModel.html" &gt;domain&lt;/a&gt;/&lt;a href="http://en.wikipedia.org/wiki/Command_pattern" &gt;command&lt;/a&gt; objects from within the grammar will work well, with no extra tree layer.  Depends on the problem I suspect).&lt;br /&gt;&lt;br /&gt;* Provides refactoring and navigation tools for the grammar for free.  What productions refer to this rule?  Just hit "find usages".  How do I make a separate rule from part of this rule?  Just hit &lt;a href="http://www.refactoring.com/catalog/extractMethod.html" &gt;extract method&lt;/a&gt; (and you know it will still work - the ANTLR/yacc transformation which looks like extract method is not correctness-preserving in my bitter experience).&lt;br /&gt;&lt;br /&gt;Anyway, if you are still with me and want to see how this works, the next step is probably to look at the SQL parser that I wrote this way for Mayfly: the tests (some of them) are at &lt;a href="http://mayfly.cvs.sourceforge.net/mayfly/mayfly/test/net/sourceforge/mayfly/parser/ParserTest.java?revision=HEAD&amp;view=markup" &gt;ParserTest&lt;/a&gt; and the parser is in  &lt;a href="http://mayfly.cvs.sourceforge.net/mayfly/mayfly/src/net/sourceforge/mayfly/parser/Parser.java?revision=HEAD&amp;view=markup" &gt;Parser&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-5222059108631995166?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://jkingdon2000.blogspot.com/feeds/5222059108631995166/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=5222059108631995166' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5222059108631995166'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5222059108631995166'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/02/recursive-descent-parsers.html' title='Recursive descent parsers'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-6773972740267188102</id><published>2007-02-10T19:15:00.000-08:00</published><updated>2007-01-29T13:12:56.415-08:00</updated><title type='text'>Mayfly starts to get options, sooner than I thought it would</title><content type='html'>OK, a few weeks ago I &lt;a href="http://jkingdon2000.blogspot.com/2007/01/is-sql-strongly-typed-language.html" &gt;mused&lt;/a&gt; about whether Mayfly was going to need some options (in that case, to set how it handles data types).  Well, most of the reasons I imagine for options still remain in the future: "some day we may need/want this".  But I did start adding an option to Mayfly.  What was the first option?  Some profound difference of philosophy about the data model which SQL should present?  Maybe the ever-popular "should SQL be more relational?" 
or the subtle and deep issues around handling of SQL NULL?&lt;br /&gt;&lt;br /&gt;No, nothing like that.&lt;br /&gt;&lt;br /&gt;It is for case sensitive table names.&lt;br /&gt;&lt;br /&gt;The SQL standard, and all databases I know except one, say that table names are case insensitive.&lt;br /&gt;&lt;br /&gt;I say, "all databases except one".  MIFOS of course is using that one (&lt;a href="http://dev.mysql.com/doc/refman/5.1/en/identifier-case-sensitivity.html" &gt;MySQL&lt;/a&gt;).  And it is worse than "MySQL makes table names case sensitive".  MySQL makes table names case sensitive &lt;i&gt;only if file names are case sensitive&lt;/i&gt; (typically Unix).  The practical effect of this is that half our team has case insensitive filenames and half has case sensitive ones, and the first group is often accidentally breaking the build (but only for the second group).&lt;br /&gt;&lt;br /&gt;I thought a bit about various solutions, and there's a lot to be said about just having Mayfly run in case sensitive mode (on all platforms).  So yeah, the CVS version of Mayfly has a method in Database called tableNamesCaseSensitive.  
Give me a few more days and most of Mayfly should honor it (large parts already do).&lt;br /&gt;&lt;br /&gt;Yet another example of how it is hard to anticipate what will really be important (along with familiar cases like prioritizing software features only once you see what users look for and miss when they try to use a prototype, or only doing performance tuning once you have measured where the bottlenecks are).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-6773972740267188102?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/6773972740267188102/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=6773972740267188102' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6773972740267188102'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6773972740267188102'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/02/mayfly-starts-to-get-options-sooner.html' title='Mayfly starts to get options, sooner than I thought it would'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-2791868860257082198</id><published>2007-01-29T12:15:00.000-08:00</published><updated>2007-02-28T17:18:29.191-08:00</updated><title type='text'>Follow your nose</title><content type='html'>Today's tale concerns what happens when we see code smells, and the twisty path we 
sometimes follow between getting a whiff of something and reaching the real problem and/or solution.&lt;br /&gt;&lt;br /&gt;It started innocently enough.  I was going through the compiler warnings from Eclipse, cleaning them up.  Most of these fixes are improvements but not especially big or difficult.  For example, getting rid of an unused variable, running the "organize imports" tool, adding missing @Override annotations, etc.  But then I saw one which was clearly pointing to something bigger.  A simplified version of the code with the warning is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;int FLAT_INTEREST = 1;&lt;br /&gt;int DECLINING_INTEREST = 2;&lt;br /&gt;createLoan(FLAT_INTEREST);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and the warning was because DECLINING_INTEREST was never used.  Now, even without going through &lt;a href="http://martinfowler.com/bliki/HistoryIsNotBunk.html" &gt;code archeology&lt;/a&gt; I can pretty much guess the sequence of events:&lt;br /&gt;&lt;br /&gt;* Developer one creates a method &lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;createLoan(int interestType)&lt;br /&gt;// 1=flat, 2=declining&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and various calls to it of the form&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;createLoan(1);&lt;br /&gt;createLoan(2);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;* Developer two sees some of these calls, and in the process of trying to read this code, or write something similar, wants to make the code say more clearly what the parameter means:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;int FLAT_INTEREST = 1;&lt;br /&gt;createLoan(FLAT_INTEREST);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;She (or the next developer) also figures out that "flat" is as opposed to "declining" and that two means declining.  Hence the:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;int DECLINING_INTEREST = 2;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;All of this is well and good.  
Sure, it is easy to find fault with the idea that local variables are a good place for these constants, but the local variable is better than what we had before - &lt;tt&gt;createLoan(1)&lt;/tt&gt; - and we should be willing to improve code clarity one step at a time, rather than trying to solve everything at once.&lt;br /&gt;&lt;br /&gt;A next small step could be to turn these local variables into constants in some central place, but I instead looked at the implementation of createLoan.  It was something along the lines of:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;createLoan(int interestType) {&lt;br /&gt;  return new Loan(&lt;br /&gt;    InterestType.fromInt(&lt;br /&gt;      interestType));&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That is, this int was getting turned into an enum anyway.  Well, once we see that there is already an enum, we realize most of the job is already done, and we end up with something like the following:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;createLoan(InterestType type) { . . . 
}&lt;br /&gt;&lt;br /&gt;createLoan(InterestType.FLAT);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Well, actually there were too many callers to convert everything in one step, but I converted some, and kept the createLoan(int) method as a transitional aid, as I've &lt;a href="http://jkingdon2000.blogspot.com/2006/11/enums-are-good-thing.html" &gt;described previously&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;(For those people wanting to see the actual code in MIFOS, the code which had had the warnings was getLoanAccount in CollectionSheetHelperTest, the method which I called createLoan above really is createLoanOffering in TestObjectFactory and the enum is InterestType).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-2791868860257082198?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/2791868860257082198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=2791868860257082198' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2791868860257082198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/2791868860257082198'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/01/follow-your-nose.html' title='Follow your nose'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-5992782015074471692</id><published>2007-01-22T08:41:00.000-08:00</published><updated>2007-01-22T15:07:27.828-08:00</updated><title type='text'>Reviewing code: in-person versus written</title><content type='html'>It has become a fairly widespread belief (although somewhat less often practiced) that software benefits from having other developers look at it.  Such &lt;a href="http://en.wikipedia.org/wiki/Code_review" &gt;&lt;b&gt;code review&lt;/b&gt;&lt;/a&gt; is designed to find flaws, seek out forgotten corner cases, improve code style, and make any number of other improvements.&lt;br /&gt;&lt;br /&gt;One of the key questions is whether to do this in writing or face to face.  Most open source projects rely on written feedback, with the two most common forms being (a) send a patch to a mailing list, and get a description of what could be improved (the &lt;a href="http://www.mozilla.org/hacking/code-review-faq.html" &gt;mozilla code review&lt;/a&gt; system is particularly formalized), or (b) check the code in, and other developers look at it there (if not as a deliberate effort to review it, then in the course of their own work).  By contrast, a face to face review can be anything from a formal event which sits a bunch of developers in a conference room with the code on a projector, to the back-and-forth of &lt;a href="http://martinfowler.com/bliki/PairProgrammingMisconceptions.html" &gt;pair programming&lt;/a&gt;, or anything in between ("hey, could you look at this?").&lt;br /&gt;&lt;br /&gt;As it turns out, this morning I got two unrelated examples of how the written method can be frustrating.&lt;br /&gt;&lt;br /&gt;The first story starts a week and a half ago.  A new MIFOS developer had checked in some code.  Not surprisingly for a new developer, it had a variety of problems both large and small.  
I sent an email describing some of the ones which seemed most important and/or easiest to find (oh, and reverted the checkin because it broke the build, but fortunately that's much easier with subversion than it would have been with CVS).  Today the developer checked in a fixed version.  It certainly is improved (I'm pleased to say the unit tests pass).  I noticed a variety of smaller things.  Now I have to figure out which ones to fix myself and which ones to complain about.  In the second case am I going to wait another week and a half to get a reaction? And how do I try to calibrate how good is "good enough" and what we should worry about later?  Now, those are valid questions no matter how the suggestions are delivered.  And a one-week feedback loop is better than waiting many months, until the developers thought they had something which was "finished".  But it is certainly harder to come to an understanding, or build up a team set of practices, if each little point is going to require a number of back-and-forth emails or phone calls.&lt;br /&gt;&lt;br /&gt;My second example is a much smaller one, which actually makes it easier to see the point without getting hung up in larger issues of how you manage a project.  A well-known columnist wrote an online article which had a typo in one of the examples.  I wrote in and said "you have an extra curly brace on the last line; perhaps you copy-pasted it from a few lines up".  He wrote back and said "no, the curly should be there" (thinking of the instance a few lines up, which was similar in syntax).  This caused me to pause a bit.  "Well, I'm pretty sure I'm right, but is it worth worrying about?" and even "how can I express this most clearly, because obviously what I said at first needed too much re-reading for a busy author to do", and so on.  
Well, I did write the second email ("no, the one at the end" or however I said it) and the author wrote back that I was right and fixed it (no doubt feeling at least a touch sheepish).&lt;br /&gt;&lt;br /&gt;Not an especially big deal.  But I contrast it with what would have happened if this had happened while pair programming.  I would have said "oh, there's an extra curly".  Perhaps I would have just taken the keyboard and removed it.  But the worst case would be closer to "no, it matches the one up here" "not that curly, this curly (pointing with finger/mouse/etc)" "oh yeah, you're right (reaches for keyboard and deletes it)".  Not only would it have gotten fixed faster, but with a minimum of tension (not that pair programming is immune from tension, but that's a subject for another article).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-5992782015074471692?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/5992782015074471692/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=5992782015074471692' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5992782015074471692'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5992782015074471692'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/01/reviewing-code-in-person-versus-written.html' title='Reviewing code: in-person versus written'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-7412134672392969913</id><published>2007-01-10T11:22:00.000-08:00</published><updated>2007-01-10T08:20:26.755-08:00</updated><title type='text'>Is SQL a strongly typed language?</title><content type='html'>The term &lt;a href="http://en.wikipedia.org/wiki/Strongly-typed_programming_language" &gt;strongly typed language&lt;/a&gt; can mean different things, but here I look at how SQL stacks up against some of the definitions of strong typing (and to keep it practical, what different SQL implementations do and do not let you do).&lt;br /&gt;&lt;br /&gt;* Static typing as opposed to dynamic typing: SQL is statically (strongly) typed in this sense; each column has a type.&lt;br /&gt;&lt;br /&gt;* Compile-time checks for type constraint violations: Here we need to define the difference between compile-time and run-time in SQL.  Basically I would take "compile-time" to be the call to Connection#prepareStatement and "run-time" to be the call to PreparedStatement#executeUpdate.  An alternate (probably mostly equivalent) definition is that a "compile-time" check happens even if there are no rows to operate on.  A browse through the Mayfly test suite for "rowsPresent" flags will show cases in which SQL implementations differ on whether a particular check is compile-time or run-time, although the popularity of query optimizers tends to mean that checks happen at compile time (in those cases where I've checked; most of the Mayfly acceptance tests don't distinguish the two cases).&lt;br /&gt;&lt;br /&gt;* Complex, fine-grained type system: SQL is more fine-grained than systems of the "everything is a string" variety (there are different syntaxes for '5', 5, 5.0, and x'5'), but only recent versions of SQL try to add things like structure/record types.&lt;br /&gt;&lt;br /&gt;* Omit implicit type conversion.  
The databases I've tested (Mayfly, MySQL, Postgres, Derby, and Hypersonic) all refuse to read 'foo' as zero (if looking for an integer).  All tested databases allow storing an integer into a DECIMAL column (that is, you can say INSERT INTO FOO(x) VALUES(5) and you don't need to say 5.0 even if x is of type DECIMAL).  There are also plenty of cases in between (for example, INSERT INTO FOO(x) VALUES(9) into a string column, which works in Hypersonic, MySQL and Postgres, but not Derby or Mayfly).&lt;br /&gt;&lt;br /&gt;Anyway, I could go on, either with gory details of what does and does not work (for which I'm better off just having you look at the Mayfly acceptance tests), or with more general philosophy on types (which dictates what kinds of cases to look for), but I'll cut to the chase: What will be most useful for Mayfly users?&lt;br /&gt;&lt;br /&gt;For now, I am generally leaning in the direction of making Mayfly picky - it seems better to catch any errors early (when writing/running tests), rather than later (when trying to deploy on different databases, for example).  It is my experience so far that MIFOS doesn't seem to play loose with the type system (which is probably mainly a reflection on Hibernate), so I feel somewhat vindicated in this judgment.  As with many things for Mayfly, I realize there are other situations (most notably, if you want to Mayfly-ize an existing application without having to modify it).  So the question is whether Mayfly should be &lt;a href="http://www.loudthinking.com/arc/000496.html" &gt;opinionated software&lt;/a&gt;.  I'm generally of the mind that software works best with a clear idea of what it is supposed to do, and that software which tries to accommodate every possible answer to "how should X work?" tends to just get a bunch of poorly-thought-out configuration choices, none of which end up being quite right (for &lt;i&gt;any&lt;/i&gt; given situation, or opinion).  
On the other hand, I'm assuming that Mayfly sooner or later will have some kind of "opinion manager" where you can pick, say, "please enforce the practices considered best by the mayfly developers", or "maximize MySQL compatibility" or even define your own (much like the code formatting configuration in IDEA or Eclipse).&lt;br /&gt;&lt;br /&gt;Whether strong typing is a case for options, I don't know, however.  Fear says that of course things would break and we need an escape hatch.  But I am beginning to wonder whether that breakage is small enough to just wait and see whether this becomes a problem.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-7412134672392969913?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/7412134672392969913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=7412134672392969913' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7412134672392969913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/7412134672392969913'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/01/is-sql-strongly-typed-language.html' title='Is SQL a strongly typed language?'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-8899640531087562835</id><published>2007-01-04T11:40:00.000-08:00</published><updated>2007-01-04T08:40:28.264-08:00</updated><title type='text'>new Integer(5) versus Integer.valueOf(5)</title><content type='html'>Seems that &lt;a href="http://findbugs.sourceforge.net/" &gt;findbugs&lt;/a&gt; warns you if you call &lt;a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Integer.html#Integer(int)" &gt;new Integer(5)&lt;/a&gt; instead of the (new with Java 1.5) &lt;a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Integer.html#valueOf(int)" &gt;Integer.valueOf(5)&lt;/a&gt;.  The point of the latter is that it might return you an existing object rather than creating a new instance.&lt;br /&gt;&lt;br /&gt;I'll get back to the Integer.valueOf case, but on the general topic of trying to avoid object creation, there has been a long and largely unhappy history of this in Java.  For example, see &lt;a href="http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html" &gt;Allocation is faster than you think, and getting faster&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;To summarize the possible problems with object caching and pooling:&lt;br /&gt;&lt;br /&gt;* One can accidentally end up sharing a mutable object where the simple design calls for an unshared object.  One way to avoid this problem is just to use immutable objects, for example &lt;a href="http://joda-time.sourceforge.net/" &gt;joda-time&lt;/a&gt; objects instead of &lt;a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Date.html" &gt;java.util.Date&lt;/a&gt; objects.  In the Integer.valueOf example, Integer is immutable, so we don't have this problem.&lt;br /&gt;&lt;br /&gt;* Pooling almost always complicates the code.  
Not so much of an issue for Integer.valueOf, in the sense that the standard library has the extra code; we just need to figure out whether to call it.&lt;br /&gt;&lt;br /&gt;* Object pools can cause synchronization bottlenecks.  There are of course complicated solutions, like separate pools for each thread.  In the Integer case, this is Integer.valueOf's problem (and the Sun J2SE 1.5 implementation solves it by just allocating a fixed-size pool on startup).&lt;br /&gt;&lt;br /&gt;* Object pools tend to increase memory consumption.  Often the performance hit of chewing up extra memory (for a long time) will exceed the allocation/deallocation overhead (which may involve a short-lived object, the cheapest kind).  Again, in the Integer.valueOf case that's someone else's problem, not ours (and in the Sun J2SE 1.5 implementation anyway, the size of the object pool is fixed at JVM startup and won't change based on anything we do).&lt;br /&gt;&lt;br /&gt;So having exhausted the usual arguments against object pools, I conclude that it is in fact a good thing to call Integer.valueOf instead of new Integer.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-8899640531087562835?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/8899640531087562835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=8899640531087562835' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/8899640531087562835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/8899640531087562835'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/01/new-integer5-versus-integervalueof5.html' 
title='new Integer(5) versus Integer.valueOf(5)'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-6480222215254721727</id><published>2007-01-03T12:53:00.000-08:00</published><updated>2007-01-03T12:53:07.587-08:00</updated><title type='text'>Open Source, Free or Libre?</title><content type='html'>It might be just flamebait to even mention it, but those of us who are writing software which conforms to the &lt;a href="http://opensource.org/docs/definition.php" &gt;Open Source Definition&lt;/a&gt; or the &lt;a href="http://www.gnu.org/philosophy/free-sw.html" &gt;Four Freedoms&lt;/a&gt; (most often both) sooner or later need to decide whether to call it &lt;b&gt;open source software&lt;/b&gt;, &lt;b&gt;free software&lt;/b&gt;, or &lt;b&gt;libre software&lt;/b&gt;, with the most controversy tending to surround whether people are trying to water down the meaning of open source (for example see &lt;a href="http://lwn.net/Articles/211800/" &gt;What is open source?&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;The only reason I bother to write about this is that Martin Fowler, in his &lt;a href="http://martinfowler.com/bliki/SemanticDiffusion.html" &gt;Semantic Diffusion&lt;/a&gt; article, points out that this is completely par for the course when a concept is getting popular.  He has some great examples of this (I am old enough to remember how for a time everything was described as "object-oriented").&lt;br /&gt;&lt;br /&gt;So if open source is like the other terms Fowler discusses, there isn't a need to start a big panic.  
As long as we have a critical mass of people using the word open source to refer to software meeting the open source definition, the odds are good that the meaning won't drift too far and too permanently.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-6480222215254721727?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/6480222215254721727/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=6480222215254721727' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6480222215254721727'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/6480222215254721727'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2007/01/open-source-free-or-libre.html' title='Open Source, Free or Libre?'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3892978001348705722</id><published>2006-12-19T16:31:00.000-08:00</published><updated>2006-12-19T13:31:44.394-08:00</updated><title type='text'>Database upgrades</title><content type='html'>Well, my project for the last few weeks has been database upgrades.  
This concept has been popularized by &lt;a href="http://railsmanual.com/class/ActiveRecord%3A%3AMigration/1.1.6" &gt;rails&lt;/a&gt; and to a certain extent the &lt;a href="http://databaserefactoring.com/" &gt;Database refactoring&lt;/a&gt; book, but it is something that just about any project could do, without a whole lot of mechanism.  It only took us (myself and &lt;a href="http://www.stelligent.com/" &gt;Stelligent&lt;/a&gt;) a few days to implement an automated upgrade scheme.&lt;br /&gt;&lt;br /&gt;I guess the best way to describe the design is just to point to the &lt;a href="http://wiki.java.net/bin/view/Javatools/DatabaseStandards" &gt;Wiki page&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Some of the key decisions were/are:&lt;br /&gt;&lt;br /&gt;(1) SQL vs. Java.  Is an upgrade script just a .sql file, or is it written in a real programming language like Java, Ruby, Perl, etc.?  We went with the .sql approach because it seems like the Simplest Thing Which Could Possibly Work, and it seems like introducing a programming language could get needlessly complicated and/or hard to understand.  Having said that, I do realize that we may someday hit a situation where it is really awkward to process data via SQL.  But I'd rather cross that bridge when we come to it, given how many upgrades so far have been very simple (insert default value into new column, create new table with no rows, etc).&lt;br /&gt;&lt;br /&gt;(2) Whether to provide upgrade scripts for every checkin which changes the database schema, or on coarser boundaries like releases.  The latter is sufficient if the goal of this exercise is merely to allow production users to upgrade.  Requiring developers to blow away their test data can be done, but the real question I have there is how a developer is supposed to know when they need to do this.  I've been on projects where a significant amount of the chatter in the team room is "is anyone else seeing a failure x?" "rebuild your database".  
This doesn't really seem great even if everyone is in the same room, and on a distributed project like MIFOS, it seems to me quite important to at least detect an out-of-date database on developer machines (and once you've done that, you might as well just do the upgrades).&lt;br /&gt;&lt;br /&gt;(3) Whether to provide downgrade scripts.  The purpose, I guess, is to let someone try out a new version of the software knowing they can always go back if it didn't work out.  Given all we are trying to do in terms of having tests, continuous integration, etc., to make upgrades go smoothly without needing to roll back, I'm not sure how much effort to put into downgrades.  On the other hand, it may be hubris to think downgrades will never be needed.&lt;br /&gt;&lt;br /&gt;(4) What tests to build.  The interesting tests in the MIFOS case are in &lt;a href="https://mifos.dev.java.net/source/browse/mifos/trunk/mifos/test/org/mifos/framework/persistence/LatestTest.java" &gt;LatestTest&lt;/a&gt;.  There are tests of the upgrade scripts (checking against latest-*.sql) and of the downgrade scripts (checking that each one undoes the corresponding upgrade script).  We're also thinking about some tests which would check that the upgrade scripts properly upgrade existing user-supplied data (as opposed to just schemas and data supplied with MIFOS).  
But this post is getting long already, so I'll leave that for another time (or for people to ask about).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3892978001348705722?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3892978001348705722/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3892978001348705722' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3892978001348705722'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3892978001348705722'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/12/database-upgrades.html' title='Database upgrades'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-4010127502351011990</id><published>2006-12-14T17:38:00.000-08:00</published><updated>2006-12-15T09:23:25.220-08:00</updated><title type='text'>MIFOS programmer intern position in Washington, DC</title><content type='html'>If you are a programmer looking for an internship in Washington, DC, check out &lt;a href="http://www.grameenfoundation.org/get_involved/career_opportunities/employment_opportunities/software_dev_intern/" &gt;Software Development Intern, Mifos&lt;/a&gt;.  
Much of the time would be spent pairing with me and/or working on some of the things I write about here.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-4010127502351011990?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/4010127502351011990/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=4010127502351011990' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4010127502351011990'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4010127502351011990'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/12/mifos-programmer-intern-position-in.html' title='MIFOS programmer intern position in Washington, DC'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-3527869113962758226</id><published>2006-12-14T13:52:00.000-08:00</published><updated>2006-12-14T14:54:43.236-08:00</updated><title type='text'>SQL dumps and topological sorts</title><content type='html'>A few weeks ago I wrote a SQL dump utility for Mayfly.  That is, take a database and dump it out as SQL statements, capable of re-creating the database.  
I played around with it for a while, thought I had it in a fairly complete state, and didn't think about it much until I was ready to use it (for some database upgrade work which I should write about too, but not now).&lt;br /&gt;&lt;br /&gt;Turns out I forgot a key step: try out the dumper on some reasonably large and/or real world data set.  In this case, the data set is the MIFOS SQL schema (and master data checked in, which has various records which the application itself needs). And the test is a fairly simple, automated one: dump out the data and try to reload it.  This test worked fine with some small SQL scripts I wrote for the Mayfly test suite, but when I dumped the MIFOS schema/data, it failed because the foreign keys were out of order.  That is, in the schema, a foreign key has to refer to a table which already exists (constraining the order in which we dump CREATE TABLE statements), and a row has to refer to a row which already exists (constraining the order in which we dump INSERT statements).&lt;br /&gt;&lt;br /&gt;Sounds simple, right?  Just sort the tables, using a java.util.Comparator which returns -1 or 1 based on whether there is a foreign key between the two tables, right?  I'm slightly embarrassed to admit that I actually implemented this, found that it worked on a few small test cases, noticed it failed on MIFOS ("gee, that's funny, I'll need to look into that"), and proceeded to other problems (my immediate need did not demand that the SQL file be reloadable into a database, just that it represent what is in the database).&lt;br /&gt;&lt;br /&gt;So here's an exercise for the reader: what was wrong with my implementation?  
(feel free to answer in the comments section if you want).&lt;br /&gt;&lt;br /&gt;Anyway, after some research on Wikipedia, I realized that what I needed was a topological sort (and, fortunately, that I didn't also need something like the Floyd-Warshall algorithm).&lt;br /&gt;&lt;br /&gt;Are there libre Java topological sort libraries out there?  I didn't look very hard, but didn't see any (on apache commons, I saw something in the "commons sandbox", but it wasn't clear that it was ever finished, or that it even is still there).&lt;br /&gt;&lt;br /&gt;I took an hour or two to understand the topological sort algorithm in Wikipedia, and another two hours or so to implement it, and it will probably be another hour or two to have the SQL dump utility call it, so I don't really have a pressing need for a library any more.  That is, unless I find maintaining my implementation ends up being a lot of work, which could happen although I suspect I'm over the hard part in just getting to this point.  It is one of those algorithms which seems scary until you understand it, but pretty simple thereafter.  It's only a hundred or so lines of code and about twice that for tests (depending of course on what you count as the topological sort code and what you count just as setting up various things to send into it).&lt;br /&gt;&lt;br /&gt;I was a little bit surprised that a topological sort was so unfamiliar to me.  I guess this just means that we don't need it all that often, although when we do need it, we need it bad.  By "need it bad", I mean my suspicion is that no amount of ad hoc algorithm building of the sort of "well, these two tables are out of order, let's try switching them" is likely to converge on a working, fast-enough algorithm.  Not that I tried the ad hoc approach (unless you count my Comparator misstep).  
I wouldn't say this example undermines my faith in agile practices like &lt;a href="http://www.artima.com/intv/evolution.html" &gt;evolutionary design&lt;/a&gt; or "running code speaks louder than words" (see for example the line "Too much talk, not enough code. Type!" in the &lt;a href="http://www.objectmentor.com/resources/articles/xpepisode.htm" &gt;bowling score&lt;/a&gt; pair programming example).  But it does indicate that there are times when it pays off to stop coding long enough to go read up on how others have solved similar problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-3527869113962758226?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/3527869113962758226/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=3527869113962758226' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3527869113962758226'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/3527869113962758226'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/12/sql-dumps-and-topological-sorts.html' title='SQL dumps and topological sorts'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-983251729864925572</id><published>2006-11-29T12:01:00.000-08:00</published><updated>2006-11-29T13:58:27.789-08:00</updated><title type='text'>Needles, haystacks, and 
log4j</title><content type='html'>There was an &lt;a href="http://sourceforge.net/mailarchive/message.php?msg_id=37533955" &gt;interesting bug report&lt;/a&gt; that came in to the MIFOS mailing list recently.&lt;br /&gt;&lt;br /&gt;Someone was posting that they couldn't start MIFOS (that's not the interesting part).  The interesting part was: "then I am Getting so many errors, List of the Errors is as follows." and lots of &lt;a href="http://logging.apache.org/log4j/" &gt;log4j&lt;/a&gt; output (in fact, too much for the mailing list archive program to show it all, so you'll need to take my word for what was there).&lt;br /&gt;&lt;br /&gt;Most of the log4j output was INFO messages, which did not indicate that anything at all was wrong.  The trouble started with a WARN which started "org.hibernate.cfg.SettingsFactory  -&lt;br /&gt;Could not obtain connection metadata"&lt;br /&gt;and proceeded with a stacktrace.  Then another 19 or so INFO messages (not related to the error, as far as I can tell).  Then another WARN, this one even more cryptic than the last: "org.hibernate.util.JDBCExceptionReporter  - SQL Error:&lt;br /&gt;1045, SQLState: 28000".  
Then finally an ERROR which fairly directly said what was wrong: "org.hibernate.util.JDBCExceptionReporter  - Access&lt;br /&gt;denied for user 'root'@'localhost' (using password:&lt;br /&gt;YES)".&lt;br /&gt;&lt;br /&gt;In other words, this was a simple problem (the database user and password that had been supplied to MIFOS were not set up in MySQL) but the actual error message was buried in some 1500 lines of red herrings.&lt;br /&gt;&lt;br /&gt;It's no wonder that software gets a reputation for being hard to install/configure/run, when tracking down the simple problems involves this level of looking for a needle in a haystack.&lt;br /&gt;&lt;br /&gt;For MIFOS, the low-hanging fruit seems pretty clear: make sure the default log4j logging level is set to WARN (in fact, I would have changed this already, except I couldn't find where it is being set - which is another good log4j rant but one for another time).  Then all those INFO messages wouldn't be there.  Bonus points would be given for: (1) reporting the real error once instead of 3 times (probably best done within Hibernate), and (2) making it so that one can go to localhost:8080/mifos (that is, the URL which would have had the application, had it started) and see an error message (or at least a hint - like "application failed to start - see xxx for detail").&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-983251729864925572?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/983251729864925572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=983251729864925572' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/983251729864925572'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/983251729864925572'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/needles-haystacks-and-log4j.html' title='Needles, haystacks, and log4j'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-4965545760542165778</id><published>2006-11-16T15:32:00.000-08:00</published><updated>2006-11-16T17:02:00.395-08:00</updated><title type='text'>SQL DELETE of all rows not as easy as you'd think</title><content type='html'>So clearing out the contents of a table from an SQL database is a relatively common operation.  Tests might do it to start from nothing, or MIFOS's own testdbinsertionscript.sql does it so that the tests can have some sample data which is a bit different than what we supply for production.&lt;br /&gt;&lt;br /&gt;Sounds simple, right?  Just execute:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;DELETE FROM TABLENAME&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;And in fact that works most of the time.&lt;br /&gt;&lt;br /&gt;But there is a fairly common case in which things&lt;br /&gt;might not be quite that simple.  Suppose that each row of the table points to a parent.  
For example:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;create table foo(id integer primary key,&lt;br /&gt;  name varchar(255),&lt;br /&gt;  parent integer,&lt;br /&gt;  foreign key(parent) references foo(id)&lt;br /&gt;);&lt;br /&gt;insert into foo values(1, 'Eve', null);&lt;br /&gt;insert into foo values(10, 'Seth', 1);&lt;br /&gt;insert into foo values(101, 'Enos', 10);&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;(For the non-SQL-aware, the FOREIGN KEY stuff just means what I said in words - that the parent points to another record in the table.)&lt;br /&gt;&lt;br /&gt;Now in this case suppose we try to delete a row:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;delete from foo where id = 1&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;This should fail, and does, because deleting the record for Eve would leave the record for Seth pointing to nothing.&lt;br /&gt;&lt;br /&gt;But now try:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;delete from foo&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;If the database deleted the records one at a time, and applied all the usual rules, then it might fail (depending on the order in which the database processes the records).  In fact that is what you see in MySQL, and the developers of MySQL have offered a way around this by adding an ORDER BY to their DELETE statement.&lt;br /&gt;&lt;br /&gt;Hypersonic is much like MySQL, except it seems not to honor any ORDER BY.&lt;br /&gt;&lt;br /&gt;Postgres and Derby, on the other hand, are smarter: they will just delete all the rows (I don't know whether they look at foreign keys as a group rather than row-by-row, or what, but the observation is that the delete Just Works).&lt;br /&gt;&lt;br /&gt;Right now Mayfly is like Hypersonic/MySQL, without the chance to specify ORDER BY.  
I guess the Postgres/Derby behavior is the right one (although I'll have to think about how to implement it - if it were a simple change I would have just done it, rather than all this whining).  Somehow ORDER BY doesn't feel right to me.  It seems to be based too much on a model of &lt;i&gt;how&lt;/i&gt; delete is to operate, and not enough on &lt;i&gt;what result&lt;/i&gt; delete is supposed to produce.&lt;br /&gt;&lt;br /&gt;For now, I worked around this in MIFOS by first clearing the parent pointers and then deleting the rows:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;update foo set parent = null&lt;br /&gt;delete from foo&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;That could get complicated if one were not allowing NULL in this column.  But for this situation, it seems like a pretty painless workaround (this particular test data setup isn't a performance bottleneck, so there is no need to worry about that).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-4965545760542165778?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/4965545760542165778/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=4965545760542165778' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4965545760542165778'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/4965545760542165778'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/sql-delete-of-all-rows-not-as-easy-as.html' title='SQL DELETE of all rows not as easy as you&apos;d think'/><author><name>Jim 
Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-5495074585973904406</id><published>2006-11-14T08:27:00.000-08:00</published><updated>2006-11-14T08:47:24.547-08:00</updated><title type='text'>Press mention in newsforge</title><content type='html'>I make no attempt here to log all the mentions of Grameen or even MIFOS (especially since the Nobel prize), but here's one in newsforge: &lt;a href="http://trends.newsforge.com/trends/06/11/10/1933211.shtml?tid=138" &gt;Microfinance and open source: natural partners&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Newsforge is one of the better open source news sites.  I mean, no one can match &lt;a href="http://lwn.net" &gt;LWN&lt;/a&gt;'s Weekly Edition for relevance and good writing, but most of the time that I click on a newsforge article, I end up informed.  Just to pick another example from today, their article about the &lt;a href="http://programming.newsforge.com/programming/06/11/14/1525244.shtml?tid=54&amp;tid=138" &gt;reaction to Sun's Java plans&lt;/a&gt; is spot-on.  
It points to some relevant mailing list threads and avoids getting caught up in the hype.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-5495074585973904406?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/5495074585973904406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=5495074585973904406' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5495074585973904406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/5495074585973904406'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/press-mention-in-newsforge.html' title='Press mention in newsforge'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-1885304535741426730</id><published>2006-11-10T15:20:00.000-08:00</published><updated>2006-11-10T15:45:56.592-08:00</updated><title type='text'>Testing equals and hashCode</title><content type='html'>This isn't a post about whether it is a good idea to implement equals and hashCode in all your classes, and if so how (check all fields, check some kind of identifier, check fields except the boring ones, etc).&lt;br /&gt;&lt;br /&gt;No, I'm assuming that you have decided to implement equals and hashCode, either because you like working that way, or because you are using a package like Hibernate which encourages/requires it.&lt;br 
/&gt;&lt;br /&gt;So now the question is: being the good test-driven developers that we are, how do we write the tests for our equals and hashCode methods?  Many of us have probably read the javadoc for Object#equals (the so-called equals contract), and started out writing things like:&lt;br /&gt;&lt;br /&gt;assertTrue(one.equals(two));&lt;br /&gt;assertTrue(two.equals(one));&lt;br /&gt;assertFalse(one.equals(null));&lt;br /&gt;&lt;br /&gt;etc.&lt;br /&gt;&lt;br /&gt;And that's about right.  But it seems like this is a framework waiting to happen (well, framework is probably too grandiose a word for something which probably doesn't need to be more than a hundred or so lines of code and just affects tests for equals and hashCode, but hey, people have called things frameworks for less).&lt;br /&gt;&lt;br /&gt;Are there any good ones out there in Apache commons or the other usual places?  I've seen some really bad ones, but generally have just ended up writing them myself.  I'm enclosing the one I'm currently using in both Mayfly and MIFOS.&lt;br /&gt;&lt;br /&gt;The one thing it doesn't do super-well is test transitivity.  You can give it a bunch of things which should all be equals to each other, and it tests that they all are, but it doesn't do any transitivity tests for not-equals.  I think it is pretty clear how to fix that: instead of just passing in a bunch A of things equals to each other, pass in several bunches: A, B, and C.  Each object within A should be equals to the others in A, but to none of the ones in B and C.  Likewise for B and C (the mathematically experienced among you will recognize these "bunches" as equivalence classes).  In fact, I started to implement this today, and I got a bit hung up on whether it reads as nicely as what I have now.  Somehow, passing in Object[][] { new Object[] { a1,a2}} just seemed like too many levels of [] and {} and such.  
I don't know if my concern is justified.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;public static void assertAllEqual(Object[] objects) {&lt;br /&gt;    /**&lt;br /&gt;     * The point of checking each pair is to make sure that equals is&lt;br /&gt;     * transitive per the contract of {@link Object#equals(java.lang.Object)}.&lt;br /&gt;     */&lt;br /&gt;    for (int i = 0; i &lt; objects.length; i++) {&lt;br /&gt;        Assert.assertFalse(objects[i].equals(null));&lt;br /&gt;        for (int j = 0; j &lt; objects.length; j++) {&lt;br /&gt;            assertIsEqual(objects[i], objects[j]);&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static void assertIsEqual(Object one, Object two) {&lt;br /&gt;    Assert.assertTrue(one.equals(two));&lt;br /&gt;    Assert.assertTrue(two.equals(one));&lt;br /&gt;    Assert.assertEquals(one.hashCode(), two.hashCode());&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static void assertIsNotEqual(Object one, Object two) {&lt;br /&gt;    assertReflexiveAndNull(one);&lt;br /&gt;    assertReflexiveAndNull(two);&lt;br /&gt;    Assert.assertFalse(one.equals(two));&lt;br /&gt;    Assert.assertFalse(two.equals(one));&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static void assertReflexiveAndNull(Object object) {&lt;br /&gt;    Assert.assertTrue(object.equals(object));&lt;br /&gt;    Assert.assertFalse(object.equals(null));&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-1885304535741426730?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/1885304535741426730/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=1885304535741426730' title='2 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/37120426/posts/default/1885304535741426730'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/1885304535741426730'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/testing-equals-and-hashcode.html' title='Testing equals and hashCode'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-116303107183673060</id><published>2006-11-08T15:55:00.000-08:00</published><updated>2006-11-10T15:18:53.262-08:00</updated><title type='text'>Mayfly SQL Dump produces SQL that Mayfly can read</title><content type='html'>My project of the last few days has been to write a dump utility so that Mayfly can output a database in SQL (similar to mysqldump and similar tools provided with most databases).&lt;br /&gt;&lt;br /&gt;Mayfly's dumper can now output CREATE TABLE statements with all of Mayfly's current data types, and likewise INSERT statements for the rows.&lt;br /&gt;&lt;br /&gt;So the milestone is that I can now take the standard MIFOS data from the unit tests (DatabaseSetup#getStandardStore()), give it to the dumper, load that dump file back into Mayfly, dump it again, and the first and second dumps will have identical contents.&lt;br /&gt;&lt;br /&gt;Now, if the dump just leaves out parts of the data/metadata (as it currently does with constraints, auto-increment values, and binary columns), then this test won't complain (the first dump will omit something, and the reload will just load something different).  
But it still seems like the dumper might not be too far from finished: this test at least implies that the dumper doesn't blow up on anything in the MIFOS data/metadata, and doesn't generate any invalid SQL.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-116303107183673060?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/116303107183673060/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=116303107183673060' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116303107183673060'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116303107183673060'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/mayfly-sql-dump-produces-sql-that.html' title='Mayfly SQL Dump produces SQL that Mayfly can read'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-116283213776836600</id><published>2006-11-06T08:55:00.000-08:00</published><updated>2006-11-10T15:18:53.195-08:00</updated><title type='text'>First MIFOS unit tests pass with Mayfly</title><content type='html'>(This was actually from 2 Nov 2006)&lt;br /&gt;&lt;br /&gt;So, one of my main projects lately (last 2 months or so) has been getting the MIFOS unit tests to work with an &lt;a href="http://wiki.java.net/bin/view/Javatools/InMemoryDatabase"&gt;in-memory database&lt;/a&gt;.  
For a while the task was just to get &lt;a href="http://mayfly.sourceforge.net/"&gt;Mayfly&lt;/a&gt; to read the MIFOS SQL files (mifosdbcreationscript.sql and mifosmasterdata.sql) - I could measure progress by how many lines into the script before Mayfly gave an error.&lt;br /&gt;&lt;br /&gt;After that, the task was to get Hibernate to talk to Mayfly.  This was considered successful when a simple Hibernate call could get an object from data which had been in the database (I later found out that there were other corners of Hibernate I needed to worry about).&lt;br /&gt;&lt;br /&gt;Then there was running a MIFOS test (one of the existing unit tests, which have been running with MySQL until now).  I started with FeePersistenceTest (chosen more or less at random).&lt;br /&gt;&lt;br /&gt;First step was making it through the initialization code in TestCaseInitializer.  This mostly just worked, but there was one interesting surprise.  There was a join of 80 rows by 500 rows by 500 rows (written with implicit joins and WHERE, not INNER JOIN and ON), and that was too much for the naive "build the cartesian product first and then start applying WHERE conditions" algorithm that Mayfly had.  Now, one can argue that a unit test should be whittling down its dataset, and that might be how we end up going, but one of my ideas for MIFOS and Mayfly is to see how far we can get while avoiding some of those familiar unit testing slimmings.  (As another example, if I run into a piece of MySQL-specific SQL, I tend to rewrite the SQL to be portable, or add the feature to Mayfly, rather than build an abstraction layer which lets MIFOS generate different flavours of SQL).  Anyway, back to joins.  
I built a simple query optimizer which got me past this.&lt;br /&gt;&lt;br /&gt;Oh, yeah, and there was all the ALTER TABLE work I did so Mayfly could execute some/all of the Iteration*.sql files (as it turns out, I'm not sure I needed this quite yet, but I should soon).&lt;br /&gt;&lt;br /&gt;So next various things failed as FeePersistenceTest created its test objects and such.  I've been fixing those one at a time.  In fact, I've been beginning to worry about how much work might be left, given that I don't have any particularly good way to estimate how many of these features remain.  Well, this morning I saw an odd symptom - instead of the usual 6 failing tests, I only saw 4.  That's right, 2 had passed. Looking at what had failed, I saw 2 easy features to implement on my laptop at lunch, and once I checked that in, all 6 were passing!&lt;br /&gt;&lt;br /&gt;Now, when I tried running CenterBOTest (second test picked at random), there was a whole new set of failures. 
But still, to be over the FeePersistenceTest hump is quite exciting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-116283213776836600?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/116283213776836600/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=116283213776836600' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116283213776836600'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116283213776836600'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/first-mifos-unit-tests-pass-with.html' title='First MIFOS unit tests pass with Mayfly'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37120426.post-116264359146821753</id><published>2006-11-04T04:10:00.000-08:00</published><updated>2006-11-10T15:18:53.120-08:00</updated><title type='text'>Enums are a good thing</title><content type='html'>Yesterday I dove into the tests looking for something to clean up.  
I started with the NonUniqueObjectException we're getting in one test (and swallowing), but in the process of trying to look around to see which two objects might be violating uniqueness, I found other code smells.&lt;br /&gt;&lt;br /&gt;So I was looking at code which (simplified) looked something like:&lt;br /&gt;&lt;br /&gt;createClient(Short.valueOf("3"), "A test client")&lt;br /&gt;&lt;br /&gt;The pain involved in using short instead of int is the first glaring thing, but actually what that really should have been was an enum:&lt;br /&gt;&lt;br /&gt;createClient(CustomerStatus.CLIENT_ACTIVE, "A test client")&lt;br /&gt;&lt;br /&gt;If those two things look basically the same to you, I'd suggest thinking a little harder about where you are spending your brain power while reading/maintaining this code.  Sure, once you've come up to speed you can probably remember that "3" here means active, but shouldn't you have the computer keep track of that?  And if you are just learning this code, or forgot that detail, then "3" is totally mystifying - in fact what got me onto this tangent is that I was wondering whether it was an ID which, duplicated, had something to do with the Hibernate non-unique exception.&lt;br /&gt;&lt;br /&gt;One more detail: how did I fix this?  The createClient method had about 150 callers (fortunately with good test coverage).  So I didn't want to fix them all at once.  I created my new createClient:&lt;br /&gt;&lt;br /&gt;createClient(CustomerStatus status, String name)&lt;br /&gt;&lt;br /&gt;and had it call the old one (or maybe vice-versa, the point is having one call the other rather than a copy-paste, since it is so easy to look up the enum from the short, or vice-versa):&lt;br /&gt;&lt;br /&gt;createClient(short status, String name)&lt;br /&gt;&lt;br /&gt;I then started fixing up callers.  I think I got to about 100 before I got bored.  
So I checked in the 100, and I can get to the other 50 some other day.&lt;br /&gt;&lt;br /&gt;I suppose I could also turn this into a rant about how helpful Java's strong typing is, because with the enum I know (as I'm typing, thanks to Eclipse, not just at run-time) what that first argument to createClient is.  But that's a debate which goes back at least to the 1960's.  I'll just say that since we are paying the price (extra syntax, mainly) for compile-time types, we should get the payoff.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37120426-116264359146821753?l=jkingdon2000.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://jkingdon2000.blogspot.com/feeds/116264359146821753/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37120426&amp;postID=116264359146821753' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116264359146821753'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37120426/posts/default/116264359146821753'/><link rel='alternate' type='text/html' href='http://jkingdon2000.blogspot.com/2006/11/enums-are-good-thing.html' title='Enums are a good thing'/><author><name>Jim Kingdon</name><uri>http://www.blogger.com/profile/01857308320156877253</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
