[XML-SIG] Developer's Day

Paul Prescod paul@prescod.net
Sun, 19 Dec 1999 12:00:04 -0600


Fredrik Lundh wrote:
> 
> professional software development has always been
> (and will always be) about making the right tradeoffs.

James Clark's incredibly fast parsers for both XML and (the much, much
harder) SGML, written in both C and Java, show that there is no need to
trade off anything.

> "You told us to use Python for this million dollar
> system but halfway through its second day of
> operation, we realized that the production XML
> files were large enough to bring the server
> backbone to its knees.  We now have several gigabytes
> sitting in the input queue, and no way to catch
> up.  The system simply isn't fast enough."

Expat can chew through 17 megabytes in 7 seconds on my laptop working
under the crippling presence of Windows NT. The only trick is getting
the Python binding fast enough. I'm not clear on why, of all the
scripting language communities, we Python people are the only ones with
an antipathy towards expat. I mean, rather than debating the various
tradeoffs, we could get standards conformance and performance, and reduce
our maintenance burden by sharing maintenance with the other users of expat:

 * Mozilla
 * Perl
 * Tcl
 * JavaScript
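
For anyone who wants to check the binding overhead on their own machine,
here is a minimal sketch of a throughput measurement. It assumes a
present-day Python in which the expat binding is available as
xml.parsers.expat; the helper names make_document and count_elements and
the synthetic record layout are my own inventions for illustration, not
part of any library. The numbers it prints will depend entirely on the
hardware and on the document, so treat it as a way to measure, not as a
benchmark result.

```python
import time
import xml.parsers.expat


def make_document(n_records):
    """Build a synthetic XML document as bytes (hypothetical test data)."""
    rows = ['<record id="%d"><name>item %d</name></record>' % (i, i)
            for i in range(n_records)]
    text = '<?xml version="1.0"?><records>' + ''.join(rows) + '</records>'
    return text.encode('utf-8')


def count_elements(data):
    """Parse data with expat; return (element_count, elapsed_seconds)."""
    count = 0

    def start_element(name, attrs):
        # Called from the C parser once per start tag; this callback is
        # where the cost of crossing into Python shows up.
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    started = time.perf_counter()
    parser.Parse(data, True)  # True marks this as the final chunk
    elapsed = time.perf_counter() - started
    return count, elapsed


if __name__ == "__main__":
    doc = make_document(100000)
    elements, seconds = count_elements(doc)
    print("%d bytes, %d elements, %.3f s" % (len(doc), elements, seconds))
```

Running count_elements with no handler installed, and then again with
one, gives a rough idea of how much of the time is spent in expat itself
and how much in the Python callbacks.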

Anyhow, let me ask whether in the *standard library* it is more
important to support the XML specification properly or to be able to
handle the gigabyte documents that most people are unlikely to ever
encounter.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Three things see no end: A loop with exit code done wrong
A semaphore untested, and the change that comes along
http://www.geezjan.org/humor/computers/threes.html