[XML-SIG] Developer's Day position paper

Andrew M. Kuchling akuchlin@mems-exchange.org
Fri, 21 Jan 2000 15:27:58 -0500 (EST)


Paul Prescod writes:
>I don't see where you get that figure. Actual parsing takes up a small
>fraction of xmllib's time. If you reduce that to zero you still don't
>speed up the entire process much. If you double it, you don't slow

<slaps self silly>  Of course, you're right; the vast majority of the
time is spent in calling Python code, not parsing, and the calling
time shouldn't scale as badly as I thought.  

I've been reading through the sgmlop.c code.  It probably wouldn't be
too difficult to add the missing features (there are already FIXMEs in
the code where you would do DTD-related parsing) and tighten up the
parser's strictness when in XML mode, but it would still be a
significant bit of work.  Also, Expat handles multiple encodings,
while sgmlop gains its speed from being able to use code
like this:

            while (ISALNUM(*p) || *p == '.')
                if (++p >= end)
                    goto eol;

I wonder if the added dereferences of a character encoding translation
table would knock sgmlop's speed down to Expat's?
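
To make the question concrete, here is a minimal sketch of the two
styles of inner loop.  It is not taken from sgmlop or Expat: the
function names, the 256-entry table layout, and the use of the
standard isalnum() in place of sgmlop's ISALNUM macro are all
assumptions made for illustration.  The first function scans raw bytes
directly; the second pushes every byte through a translation table
before classifying it, which is the extra dereference in question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: scan a name directly over raw bytes, in the
   style of the loop quoted above.  Returns the number of bytes
   consumed. */
static size_t scan_name_direct(const char *p, const char *end)
{
    const char *start = p;
    assert(p <= end);
    while (p < end && (isalnum((unsigned char)*p) || *p == '.'))
        p++;
    return (size_t)(p - start);
}

/* Hypothetical helper: the same scan, but each byte is first mapped
   through a 256-entry table (byte -> code point) before it is
   classified.  Only ASCII code points count as name characters here,
   which is a simplification. */
static size_t scan_name_table(const char *p, const char *end,
                              const unsigned short table[256])
{
    const char *start = p;
    assert(p <= end);
    while (p < end) {
        /* The extra memory access on every character. */
        unsigned short c = table[(unsigned char)*p];
        if (!(c < 128 && (isalnum(c) || c == '.')))
            break;
        p++;
    }
    return (size_t)(p - start);
}

/* Hypothetical helper: fill an identity table, i.e. a single-byte
   encoding where every byte maps to the code point of the same
   value. */
static void init_identity_table(unsigned short table[256])
{
    int i;
    for (i = 0; i < 256; i++)
        table[i] = (unsigned short)i;
}
```

With an identity table both functions consume the same number of
bytes for ASCII input, so timing the two against each other on a large
document would give a rough idea of what the table lookup alone costs.
A real multi-byte encoding would add more work than this sketch shows.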

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
It would be nice to be unfailingly, perpetually, remorselessly funny, day in
and day out, year in and year out until somebody murdered you, now wouldn't it?
    -- Robertson Davies, _The Diary of Samuel Marchbanks_