Is there a "Large Scale Python Software Design" ?

Thu Oct 21 23:08:05 EDT 2004

Andreas Kostyrka wrote:
> On Tue, Oct 19, 2004 at 07:16:01AM -0700, Jonathan  Ellis wrote:
> > Testing is good; preventing entire classes of errors from ever
> > happening at all is better, particularly when you get large.  Avoiding
> > connectedness helps, but that's not always possible.

> What classes of errors are completely avoided by "static typing" as
> implemented by C++ (Java)? Just out of curiosity, because this is
> usually stated as "true by axiomatic definition" in this kind of
> discussions.

As one example: in this codebase (closer to 700 kloc than 500 by this
time, if it matters) the very oldest code used a Borland wrapper over
JDBC.  At the time, it allowed doing things JDBC version 1 did not; by
the time I got fed up, JDBC version 3 had caught up and far surpassed
Borland's API.  There was also a lot of JDBC code that was suboptimal
-- for the application I worked on, it almost always made sense to use
a PreparedStatement rather than a simple Statement, but because binding
parameters in JDBC is something of a PITA, we often went with the
Statement anyway.  Both the Borland-style and the JDBC code also dealt
with calls to stored procedures, most of them not in CallableStatements
(the "right" way to do this).

I volunteered to write a more friendly wrapper over JDBC than Borland's
that would handle caching of [Prepared|Callable]Statement objects and
parameter binding transparently, nothing fancy (in particular my select
methods returned ResultSets, where Borland had their own class for
this) and rewrite these thousands of calls to use the new API.  Of
course I wrote scripts to do this; 5 or 6, each handling a different
aspect.
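
For readers who want something concrete, here is a minimal sketch of the
kind of wrapper described above. It is not the original code: the class
name StatementCache and its methods are hypothetical, it assumes a
single-threaded caller that owns the Connection, and it uses modern Java
syntax (generics, varargs) for brevity. It only shows the two ideas
mentioned: caching PreparedStatement objects per SQL string and binding
parameters transparently, with select returning a plain ResultSet.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper; names are illustrative, not the API from the post.
class StatementCache {
    private final Connection conn;
    // One cached PreparedStatement per distinct SQL string.
    private final Map<String, PreparedStatement> cache = new HashMap<>();

    StatementCache(Connection conn) {
        this.conn = conn;
    }

    // Look up (or create) the statement for this SQL and bind the parameters.
    PreparedStatement prepare(String sql, Object... params) throws SQLException {
        PreparedStatement ps = cache.get(sql);
        if (ps == null) {
            ps = conn.prepareStatement(sql);
            cache.put(sql, ps);
        }
        ps.clearParameters();
        for (int i = 0; i < params.length; i++) {
            ps.setObject(i + 1, params[i]); // JDBC parameter indexes start at 1
        }
        return ps;
    }

    // Returns a standard ResultSet rather than a custom result class.
    ResultSet select(String sql, Object... params) throws SQLException {
        return prepare(sql, params).executeQuery();
    }

    // Returns the row count reported by the driver.
    int update(String sql, Object... params) throws SQLException {
        return prepare(sql, params).executeUpdate();
    }

    // Number of distinct SQL strings prepared so far.
    int cachedCount() {
        return cache.size();
    }
}
```

A call site would then look something like
cache.select("SELECT name FROM users WHERE id = ?", 42), which is the
sort of uniform shape that makes a scripted rewrite of thousands of
calls practical. A real version would also need to handle
CallableStatements, statement cleanup, and connection lifetime.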

To write unit tests for this by hand would have been obscene.  (As an
aside, writing unit tests for anything that deals with many tables in a
database is a PITA already and usually ends up not really a "unit" test
anymore.)  Even generating unit tests with more scripts would have
required a significantly deeper semantic understanding of the code
being filtered, and hence a lot more work.

As it was, with the compiler letting me know when I screwed up so I
could improve my scripts accordingly, out of the thousands of calls, I
ultimately had to do a few dozen by hand (because that was less work
than getting my scripts able to deal with the very worst examples), and
the compiler let me know what those were.  After the process was
complete, QA turned up (over several weeks) 4 or 5 places where I'd
broken things despite the static checking, which I considered a very
good success ratio.

-Jonathan