Wednesday, January 10, 2007

Is SQL a strongly typed language?

A strongly typed language can mean different things, but here I look at how SQL stacks up against some of the definitions of strong typing (and to keep it practical, what different SQL implementations do and do not let you do).

* Static typing as opposed to dynamic typing: SQL is statically (strongly) typed in this sense; each column has a type.

* compile-time checks for type constraint violations: Here we need to define the difference between compile-time and run-time in SQL. Basically I would call "compile-time" to be the call to Connection#prepareStatement and "run-time" to be the call to PreparedStatement#executeUpdate. An alternate (probably mostly equivalent) definition is that a "compile-time" check happens even if there are no rows to operate on. A browse through the Mayfly testsuite for "rowsPresent" flags will show cases in which SQL implementations differ on whether a particular check is compile-time or run-time, although the popularity of query optimizers tends to mean that checks happen at compile time (in those cases where I've checked; most of the Mayfly acceptance tests don't distinguish the two cases).

* complex, fine-grained type system: SQL is more fine-grained than systems of the "everything is a string" variety (there are different syntaxes for '5', 5, 5.0, and x'5'), but only recent versions of SQL try to add things like structure/record types.

* omit implicit type conversion. The databases I've tested (Mayfly,MySQL,Postgres,Derby, and Hypersonic) all refuse to read 'foo' as zero (if looking for an integer). All tested databases allow storing an integer into a DECIMAL column (that is, you can say INSERT INTO FOO(x) VALUES(5) and you don't need to say 5.0 even if x is of type DECIMAL). There are also plenty of cases in between (for example, INSERT INTO FOO(x) VALUES(9) into a string column, which works in Hypersonic, MySQL and Postgres, but not Derby or Mayfly).

Anyway, I could go on, either with gory details of what does and does not work (for which I'm better off just having you look at the the Mayfly acceptance tests), or with more general philosophy on types (which dictates what kinds of cases to look for), but I'll cut to the chase: What will be most useful for Mayfly users?

For now, I am generally leaning in the direction of making Mayfly picky - it seems better to catch any errors early (when writing/running tests), rather than later (when trying to deploy on different databases, for example). It is my experience so far that MIFOS doesn't seem to play loose with the type system (which is probably mainly a reflection on Hibernate), so I feel somewhat vindicated in this judgment. As with many things for Mayfly, I realize there are other situations (most notably, if you want to Mayfly-ize an existing application without having to modify it). So the question is whether Mayfly should be opinionated software? I'm generally of the mind that software works best with a clear idea of what it is supposed to do, and that software which tries to accomodate every possible answer to "how should X work?" tends to just get a bunch of poorly-thought-out configuration choices, none of which end up being quite right (for any given situation, or opinion). On the other hand, I'm assuming that Mayfly sooner or later will have some kind of "opinion manager" where you can pick, say, "please enforce the practices considered best by the mayfly developers", or "maximize MySQL compatibility" or even define your own (much like the code formatting configuration in IDEA or Eclipse).

Whether strong typing is a case for options, I don't know, however. Fear says that of course things would break and we need an escape hatch. But I am beginning to wonder whether that breakage is small enough to just wait and see whether this becomes a problem.

No comments: