[Python-Dev] Performance of various marshallers

Tue, 02 Oct 2001 13:44:11 -0700

Skip Montanaro wrote:
> 
>...
> 
> XML-RPC's relationship to Unicode is ill-defined.  The spec that Dave Winer
> wrote requires all data to be US-ASCII, so XML-RPC isn't really
> XML-compliant.  (You'll have to take up issues of standards compliance with
> Dave.)

Most XML-RPC implementations support Unicode, Dave Winer
notwithstanding. Plus, nothing in the XML-RPC spec says that XML-RPC
documents may not be encoded in either of the two encodings every XML
processor must support, UTF-8 and UTF-16 (even if the data is
restricted to ASCII values).

> Still, Unicode or not, the notion that XML-RPC is a data serialization
> mechanism instead of a compound data markup language means you don't need to
> provide hooks for processing each element, so full-blown XML parsers tend to
> be overkill as py-xmlrpc demonstrates. 

I don't see how that follows. py-xmlrpc needs to handle <struct>
differently from <array>, so it needs a "hook" for each of those
element types. Having a fixed list of hooks or an extensible array of
them should not be much different from a performance point of view.
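
To illustrate the point about hooks, here is a minimal sketch of an
XML-RPC value decoder. It is not py-xmlrpc's or xmlrpclib's code; the
names decode_value, decode_struct, decode_array and DECODERS are made
up for the example, and only a few types are handled. Even with a fixed
vocabulary, the decoder still dispatches on the element type:

```python
import xml.etree.ElementTree as ET

def decode_value(elem):
    """Decode one <value> element (hypothetical helper, partial type support)."""
    if len(elem) == 0:
        # A <value> with no type element is treated as a string.
        return elem.text or ""
    child = elem[0]
    # The "hook" lookup: one handler per element type.
    return DECODERS[child.tag](child)

def decode_struct(elem):
    return {
        member.find("name").text: decode_value(member.find("value"))
        for member in elem.findall("member")
    }

def decode_array(elem):
    return [decode_value(v) for v in elem.find("data").findall("value")]

# A fixed table of handlers. Making it extensible would only mean
# letting callers add entries; the per-element lookup stays the same.
DECODERS = {
    "int": lambda e: int(e.text),
    "i4": lambda e: int(e.text),
    "string": lambda e: e.text or "",
    "struct": decode_struct,
    "array": decode_array,
}
```

Whether DECODERS is frozen or open to new entries, each element costs
one table lookup and one call.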

Yes, an incomplete XML parser could be faster if it ignores Unicode,
ignores character references, and does not do all of the error checking
required by the spec. I'm not sure how much this would really improve
performance, though.
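
As a rough illustration of what "ignores character references" means in
practice, the sketch below compares a deliberately naive extractor with
a real XML parser. naive_string and parsed_string are hypothetical
helpers written for this example; they are not taken from py-xmlrpc or
any other package, and the naive version is only an assumption about
what a shortcut parser might look like:

```python
import re
import xml.etree.ElementTree as ET

DOC = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<value><string>caf&#233; &amp; bar</string></value>"
)

def naive_string(data):
    """Grab the bytes between the tags without interpreting them."""
    match = re.search(rb"<string>(.*?)</string>", data)
    return match.group(1).decode("ascii")

def parsed_string(data):
    """Let a conforming parser resolve references and the encoding."""
    return ET.fromstring(data).find("string").text

print(repr(naive_string(DOC)))   # references are left unresolved
print(repr(parsed_string(DOC)))  # references are resolved to characters
```

The naive version returns the markup as written, so the receiver gets
the literal text "&#233;" instead of the character it stands for.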

py-xmlrpc is probably faster because it doesn't call out to Python code
until the entire message has been parsed. xmlrpclib, on the other hand,
is written entirely in Python. Is there a Python XML-RPC implementation
that uses no Python code but does use a true XML parser?
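
To make the cost of calling out to Python concrete, here is a small
sketch that counts how many Python-level callbacks a true XML parser
(expat, through the standard xml.parsers.expat module) makes for one
message. count_callbacks is a hypothetical helper for this example, not
part of any XML-RPC package:

```python
import xml.parsers.expat

def count_callbacks(data):
    """Count Python handler invocations while expat parses data."""
    calls = 0

    def bump(*args):
        nonlocal calls
        calls += 1

    parser = xml.parsers.expat.ParserCreate()
    # Each of these handlers is a call from C back into Python.
    parser.StartElementHandler = bump
    parser.EndElementHandler = bump
    parser.CharacterDataHandler = bump
    parser.Parse(data, True)
    return calls

small = "<value><int>1</int></value>"
large = (
    "<value><array><data>"
    + "<value><int>1</int></value>" * 100
    + "</data></array></value>"
)
print(count_callbacks(small), count_callbacks(large))
```

The number of callbacks grows with the size of the message, which is
the overhead a decoder written entirely in C avoids by building the
result before returning to Python.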

> ...  No matter how hard Shilad finds it
> to add Unicode support to his package, it's still likely to be miles ahead
> of other XML parsers.

I think you are exaggerating the benefit of having a fixed vocabulary.
There is hardly any performance boost possible based on that one detail.

 Paul Prescod