Typing system vs. Java

Thu Aug 2 12:18:22 EDT 2001

On Thu, 2 Aug 2001, Donn Cave wrote:

> | [ snipped 10 lines of Python ]
> | [ snipped 30 - 40 lines of type-checking OCaml ]
> |
> | c.l.ocaml?). A veteran Pythonista won't have too many chances for causing
> | that bug because the Python version will be so short and simple.
>
> A Perl version could be shorter yet, maybe half the lines.  So even
> fewer bugs?

Two points: (1) yes, in general there is a correlation between lines of
code and number of bugs. (Don't bother posting a complex one-liner as
counter-evidence - this is a general and not an all-encompassing rule).
(2) I said short *and* simple.

Your strict type-checking snippet of code was the perfect example of this
- there were a ton of places where I, the developer, could have screwed up
in writing that mess. The complexity of the Python one was so much lower
there's just not as much that I could do wrong. Yes, that type checking
probably did prevent those type-related bugs, but the added complexity
opens the door for a slew of different bugs.

> | - If the function later needs to be modified to be more flexible with
> | different types or even just a little extra functionality, the Python one
> | is gonna be a breeze. I'd feel confident about a quick change followed by
> | some quick testing. I'd feel much more nervous about modifying the other
> | one.
>
> Fine!  You'll overcome your nervousness, sit down and start coding
> in those new types or whatever.  Maybe later, you'd do it with more
> confidence.  But confident or not, OCaml won't let you go until you
> have accounted for those new types, wherever necessary in the program.
> OCaml doesn't let your confidence become a liability.

?? The point is that not only is the initial cost of writing the
type-checking one dramatically higher, the cost of maintaining the code is
also much, much higher. When you come back to the function after not looking at
it for a few months, that's a lot of code to make sense of. Fiddling with
it has a high risk of introducing new bugs (not necessarily type-related
bugs) because of the much higher complexity.

This shouldn't strike you as odd. If you have an assembly function and a C
function that do the same thing and you need to change them, which one
will you most likely break? If you have a complex implementation of a
function and a simple, straightforward one, and you change them both,
which one will most likely break? 99 times out of 100 (or more), the short
and simple functions are harder to break. It's just a fact of life that
the more balls you have to keep in the air at one time, the more likely
you are to drop one.

> | - After a little experience using Python, the cost of finding that bug
> | will drop dramatically (how long would it take you, Donn, to figure it
> | out? 30 seconds maybe?) as will the likelihood of that bug occurring, but
> | the cost of creating the OCaml type-checking version will remain more or
> | less constant - you'll always have to write about that much code. So
> | you'll fight your way through the learning curve only to find that by
> | doing so what you learned just went down in value. ;-)
>
> Perversely missing the point.  For sure, once the bug came to my
> attention, it took much less than 30 seconds to find it.  But should
> we count the time it took my client to get me a traceback?

This is a bit of an exaggeration, but I'll play along. First of all, look
at this specific bug. How in the world did you manage to ship that? Did
you test at all? This wasn't exactly hard to reproduce or find. Now look
at it in the general sense. Yes, you should count the time it took your
client to give you a traceback, install a solution, etc, but it should count
for each and every bug, not just this one. I can guarantee one of two
things (and maybe both): either the complex type-checking one has more
bugs in it due to the added complexity OR it took so long to write it
correctly that before you were a third of the way done your evil twin
underbid you, delivered a working Python version, fixed that one bug,
delivered a new version with better features, delivered another version to
suit the changing needs of the business, delivered another version with
new features, and got showered with money.

I mean, just *look* at that strict type-checking version of code. Wow! All
that complexity for such a simple piece of functionality. What in the
world would it look like if you needed to do something complex? I mean,
yes, you may have solved those darn type-related bugs, but how many more
bugs did you introduce in the process?

> it the solution took getting to the client, properly installed?
> If I spent extra hours wrestling with a few lines of code, and the
> reward is a modest improvement in reliability, it might well be worth
> it even if I don't do space shuttle gigs.

Yes, it might be, and you do have to judge for yourself. In general,
though, wrestling with code is a Bad Thing, especially if something so
simple is causing the wrestle. Your code may end up reliable in the
type-checking sense but be less reliable overall, sort of a
complexity-induced flakiness.

> me.  Or I could test the bejeezus out of it, which would surely take
> much longer,

Thorough testing should happen on both versions, and the type-safe version
would definitely take a lot longer to test thoroughly. Every time I look
at it I just go, "wow."

> and the magic inputs to get to that bug might or might
> not come up.

Huh? There's nothing magic about thorough testing.
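
As a purely hypothetical sketch (this is not the snipped code from earlier
in the thread; describe, describe_fixed and branches_ok are made-up names
for illustration), here is the kind of type-related bug under discussion,
sitting on a branch that only some inputs reach, next to a test that simply
exercises every branch:

```python
def describe(count):
    """Hypothetical example: return a short message for a non-negative count."""
    if count == 0:
        return "no items"
    # Type-related bug: str + int raises TypeError,
    # but only when this branch is actually reached.
    return "items: " + count


def describe_fixed(count):
    """Same function with the type bug fixed by an explicit conversion."""
    if count == 0:
        return "no items"
    return "items: " + str(count)


def branches_ok(func):
    """Call func on one input per branch; report whether every call ran cleanly."""
    try:
        func(0)  # exercises the first branch
        func(3)  # exercises the second branch
    except TypeError:
        return False
    return True
```

Running branches_ok(describe) hits the bad branch on the second call, so the
bug shows up as soon as each branch is exercised once. Whether real code is
always this easy to cover is a separate question; the sketch only assumes
that the tests reach every branch.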

> But your criticisms seem a little misguided.  If your point is
> that Python is good for quick hacks, and OCaml isn't, fine.  I'll
> buy that.  But what about type checking and serious programming?

What in the world? Sorry if that's what you thought I was saying, because
that's not it at all. I _am_ saying that, because more complexity and more
lines of code tend to mean more bugs, the type-safe version nearly ruins
my chances of success in a reasonable time frame.

> I guess you don't want any more examples of problems type checking
> could solve?

Why would that be the case? Because the examples so far have reinforced
the notion that today's type-checking systems can fix those problems but
only at a tremendous cost (increased development time/effort, increased
testing time, increased maintenance time)?