Friday, October 10, 2008

Writing an Eclipse plug-in

I recently started playing around with writing an Eclipse plug-in, and I thought I should share some first impressions about how easy it was.

The motive was to more easily play with metamath, a system to automatically verify (not write) mathematical proofs. Unless I've missed something, there is little activity on development environments for math proof systems, but it seems to me the need for good tools (like Eclipse) is at least as great for math proofs as for software.

Everything here was based on the Eclipse which ships with Fedora 8, although I'm not aware of anything Fedora-specific. I started with the Eclipse help files for "Plug-in development environment". It was relatively easy to create a project seeded with one of the example plug-ins which ship with Eclipse. In my case that was first the hello world one, and then com.example.witheditor, which seemed the most relevant to the plug-in I was trying to write (at least at the start, that will be a few simple decorations on top of text editing, similar to an emacs major mode or one of the syntax coloring modules for vi, gedit, etc). Generally, the help files walked me through all I needed to get started. I was somewhat puzzled by "how do I get back to the Overview once it is closed" until I figured out that opening the plugin.xml file gives you a specialized view, including Overview, with links to click and forms to edit.

The fact that Fedora ships with the source code to the standard Eclipse classes that a plug-in needs to hook into, combined with the good Eclipse features for navigating Java code, made things really easy. I was pleasantly surprised the first time I control-clicked on an Eclipse method and got not only the arguments I needed to pass in, but also the commented source code. Within a few hours, I had turned the example plug-in into something which was at least starting to understand metamath syntax. Not bad considering that this includes a fair bit of experimenting (e.g. playing with foreground and background colors) and learning about the platform. I tend to find that Java and Eclipse make it easier to explore a large unfamiliar codebase, compared with a language like Ruby (where exploring large unfamiliar codebases, like Rails, has been a daily activity in my recent job), but I also give credit to the developers of the Eclipse plug-in system. The example pointed me to the relevant parts of the plug-in libraries, and the well-commented source code of the libraries themselves helped me poke around to figure out which pieces would do what I want. Another huge win is the way that the Eclipse plug-in development environment just worked out of the box. There was no messing around with CLASSPATHs, jars, ant and similar rigamarole: just go to the Overview page and click on "launch" and you are running.

Error reporting was a problem: one of my first edits to the example passed a bad value to a constructor in a constant (this is the RGB constructor in IXMLColorConstants if anyone is following along). There was a dialog box referring to an error log, but I have no idea where this error log might live (apparently not in my plug-in project). Ideally, the plug-in development environment would have somehow shown an exception with a stack trace, or something of the sort.

My plug-in so far can be found at mmclipse.

Thursday, August 28, 2008

Simple Design and Testing conference, suburban Chicago, Sep 12-14

I probably should have mentioned this a while ago, but coming up in about two weeks is the 2008 Simple Design and Testing conference. As with other open space conferences, the format is similar to Birds of a Feather (BoF) sessions and ranges from free-wheeling discussion to something closer to a presentation. Topics typically center around things like agile development and object-oriented design. The conference is free and registration is pretty simple: write a wiki page called a "position paper" saying what you want to learn and/or contribute. Mine was ErectorRubyRenderingLibrary.

Thursday, May 22, 2008

Parse XML with saxophone

Programmers often want to parse an XML document and perform some actions along the way (for example, build up an in-memory data structure, write data to a database, print output to the console, etc).

For the most part, there have only been two well-known ways of doing this. The first is to read the XML document into a DOM, which is an in-memory tree representing the document. Then you walk the DOM tree doing your own processing. This is usually a pretty convenient way to go. But it has several downsides, the most obvious and probably biggest of which is that if the XML document is too big to fit in memory it won't work.
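To make the DOM approach concrete, here is a minimal sketch using REXML, which comes with Ruby. The feed contents and the titles_via_dom name are made up for illustration; the point is only that the whole document is parsed into a tree before any of our own processing starts.

```ruby
require 'rexml/document'

# Made-up sample feed, small enough to fit in memory comfortably.
DOM_FEED = <<~XML
  <feed>
    <entry><title>My Dog has Fleas</title></entry>
    <entry><title>Second Post</title></entry>
  </feed>
XML

# Hypothetical helper: build the full in-memory tree, then walk it.
def titles_via_dom(xml)
  doc = REXML::Document.new(xml)   # the entire document is now in memory
  titles = []
  doc.elements.each('//title') { |element| titles << element.text }
  titles
end

puts titles_via_dom(DOM_FEED)
```

Convenient, but the memory cost grows with the size of the document rather than with the size of the part you care about.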

The second approach is a streaming API such as SAX or xmlpull. SAX calls you whereas you call xmlpull, but either way you are getting a low-level stream of events (start tag, end tag, and text being the most important). For simple tasks, this isn't so bad. But when you have to assemble data from a few different parts of the document, you need to set variables to keep track of what you've gotten and what you are waiting for, and it is possible to make yourself a pretty big pile of spaghetti. My co-workers and I had such a problem recently, and our answer was a small library, which we call saxophone. Saxophone sits on top of SAX, converts the raw stream of events into calls to handlers, and lets the handlers return data which gets passed to other handlers. The current implementation is in Ruby, although the ideas should port to other languages if anyone has such a need.
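For a feel of the bookkeeping that the raw event style needs, here is a sketch against REXML's stream listener (plain REXML, not saxophone). The EntryCollector class, the collect_entries helper and the feed are all invented for this illustration. Notice the two instance variables whose only job is to remember where we are in the document.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Made-up sample feed; the feed-level <title> should be ignored.
STREAM_FEED = <<~XML
  <feed>
    <title>Example Feed</title>
    <entry><title>My Dog has Fleas</title><link>http://example.com/1</link></entry>
    <entry><title>Second Post</title><link>http://example.com/2</link></entry>
  </feed>
XML

# Hypothetical listener: assembles one hash per <entry> by hand.
class EntryCollector
  include REXML::StreamListener

  attr_reader :entries

  def initialize
    @entries = []
    @current = nil   # entry being assembled, or nil outside <entry>
    @field = nil     # :title or :link while inside that tag
  end

  def tag_start(name, _attrs)
    case name
    when 'entry'
      @current = {}
    when 'title', 'link'
      @field = name.to_sym if @current
    end
  end

  def text(data)
    return unless @current && @field
    @current[@field] = (@current[@field] || '') + data
  end

  def tag_end(name)
    case name
    when 'entry'
      @entries << @current
      @current = nil
    when 'title', 'link'
      @field = nil
    end
  end
end

def collect_entries(xml)
  collector = EntryCollector.new
  REXML::Document.parse_stream(xml, collector)
  collector.entries
end

collect_entries(STREAM_FEED).each { |entry| puts "#{entry[:title]}: #{entry[:link]}" }
```

Two fields and one level of nesting are still manageable. Each additional piece of data you need to correlate tends to add another state variable and another branch in every callback.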

Here's a simple example. We want to parse a web feed and print out each of the titles. That is, there will be, buried somewhere in the XML, something like <title>My Dog has Fleas</title> and we want to print out "My Dog has Fleas" and likewise for every other occurrence of a title tag.

The full example can be found in the examples directory of saxophone, but the key part is:

parser =
:title => lambda { | element | puts element.all_text() }

This is saying that any time you see a "title" tag, call this handler, and provide it all the text directly under title via the all_text() method.

This example doesn't demonstrate everything. Handlers can return values, which are then available to higher handlers. Handlers also have access to attributes directly.

If people want to hear more about it, I can write some more documentation and examples for some of these other features. But the key thing here is that we have a fairly concise way to (a) match the parts of the XML that we care about, and (b) synthesize the results of those matches into larger data structures (but only as far as we want - after we have gotten everything we need to, for example, write a database row, we can return that memory to the system). This is all done in a streaming fashion. That is, we don't need to store the whole document, or the whole result, in memory.

The whole thing kind of reminds me of XSLT or XQuery. I know that in XSLT for sure, and probably XQuery, there are some tasks that turn out to be really awkward, and that's probably true of saxophone as well. But saxophone seems like a good match for a few of the problems we've had. And of course, having a library within your regular programming language (in this case Ruby) can also be a plus over a special-purpose language.

Saxophone is free software, available here as part of the pivotalrb project on rubyforge, which contains a variety of open source code released by the Ruby/Java consulting firm Pivotal. My pair and I wrote saxophone as part of a Pivotal project.

Thursday, April 03, 2008

Draft of Erector talk

I'm giving a talk on erector in about a week at the Washington, DC Ruby User's Group. Here's a draft of my talk; please provide feedback so I can improve the talk (for example, as comments at the bottom of this page).

Ruby's flexible and minimal syntax makes it well-suited to writing libraries for various tasks

Erector applies that to HTML (or XML) rendering

(start up erector in script/console and go through the following examples)

class Foo < Erector::Widget
def render
end do
end.to_s do
p "foo"
end.to_s do
text "hello"
end.to_s do
p do
text "hello"
b "world"
end.to_s do
a :href => "a.html" do
text "world"
end.to_s do
a "world", :href => "a.html" do
end.to_s do
text '<>&'

Hooking erector to rails:

class WelcomeController <

def index
render :text =>


Structuring views with the usual Ruby techniques: especially inheritance and methods

Responsibilities of a view mechanism:

  • quote output

  • balance start/end tags
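To show what those two responsibilities mean in practice, here is a toy builder in plain Ruby. It is a hypothetical sketch for the talk, not Erector's implementation: TinyBuilder, escape_html and their methods are names I made up. Text is escaped on the way out, and a closing tag is always emitted after the block runs, so the caller cannot forget either one.

```ruby
# Hypothetical toy builder, NOT Erector's implementation.
class TinyBuilder
  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  def initialize
    @out = +''
  end

  # Responsibility 1: quote output. Every character with special
  # meaning in HTML is replaced by its entity.
  def escape_html(value)
    value.to_s.gsub(/[&<>"]/, ESCAPES)
  end

  def text(string)
    @out << escape_html(string)
    self
  end

  # Responsibility 2: balance tags. The end tag is written here,
  # right after the block, so it always matches the start tag.
  def tag(name, attrs = {})
    attr_text = attrs.map { |key, value| %( #{key}="#{escape_html(value)}") }.join
    @out << "<#{name}#{attr_text}>"
    yield if block_given?
    @out << "</#{name}>"
    self
  end

  def to_s
    @out
  end
end

builder = TinyBuilder.new
builder.tag(:p) do
  builder.text 'hello '
  builder.tag(:a, href: 'a.html') { builder.text '<>&' }
end
puts builder.to_s
```

An escape hatch along the lines of raw would simply append to the output without calling escape_html, which is why both guarantees stop at that point.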

Comparison with ERB and Markaby, quoting:

ERB: no, must call h

Markaby: for strings, not blocks

Erector: yes*

* except when you call raw

Comparison with ERB and Markaby, tag balancing:

ERB: no

Markaby: yes

Erector: yes*

* except when you call raw

Comparison with ERB and Markaby, subpages:

ERB: partials, helpers

Markaby: partials, helpers

Erector: inheritance, methods

Similar libraries in other languages:

If there is time:

  • Calling helpers from erector

  • Writing helpers in erector

Wednesday, February 20, 2008

Hello world type example for jruby and mayfly

Given all the interest in the ruby programming language, it is natural to ask whether rails applications (or ruby applications more generally) could write their tests with mayfly.

As a first step, I figured out how to run mayfly under jruby. Here's what I did.

First, I installed JRuby 1.1 RC 2 as directed. Then, I put the mayfly 0.3 jars in $JRUBY_HOME/lib (there are other ways to tell jruby where to look for them, but this seemed like the simplest).

Then, I put the following in hellodb.rb:

include Java
import 'net.sourceforge.mayfly.Database'

d = Database.new
d.execute("create table foo(x integer)")
d.execute("insert into foo(x) values(5)")
puts d.rowCount("foo")
d.execute("insert into foo(x) values(6)")
puts d.rowCount("foo")

Then running jruby will invoke mayfly:

$ jruby hellodb.rb

The next step would be to figure out how to get Active Record talking to Mayfly.
Haven't tried that yet.

Monday, February 11, 2008

Profiling with gprof (success on a short test program)

When last we discussed profiling Mayfly, I was profiling with JIP. Brian Slesinsky, in a comment to that article, told me that he has found that JIP has a per-method cost which shows up in the profiling data (so that method calls appear to be more expensive than they really are). He suggested the NetBeans profiler.

Well, I was looking into how to install and use NetBeans (NetBeans, unlike Eclipse, does not ship with Fedora), and hadn't gotten much of anywhere until I had a lot of time (on a flight), and was left with seeing whether I could get anywhere with the tools which I already had installed. That means gcj and gprof. I got gprof working fine on a short test program (where it correctly identified the bottleneck), but didn't (yet) succeed in running it on mayfly.

I suppose if people would find them helpful, I could upload my test programs and build scripts, but basically they boiled down to:

gcj --main=Profilee -pg
gprof >profiler

Getting something like this invoked from ant is largely a boring but more or less straightforward matter (although it wasn't clear to me how the classpath relates to the class and java files specified on the command line). But at least one of my invocations led to a runaway linker which made my machine swap for quite a while before I finally gave up. So although it is premature to declare victory on this just yet, I did want to report on my success with the test program.