[ANN] pyxser-0.2r --- Python XML Serialization

Stefan Behnel stefan_ml at behnel.de
Sun Apr 19 15:08:50 EDT 2009


Daniel Molina Wegener wrote:
> Stefan Behnel <stefan_ml at behnel.de>
> on Sunday 19 April 2009 02:25
> wrote in comp.lang.python:
> 
> 
>> Daniel Molina Wegener wrote:
>>>         * Every serialization is made into unicode objects.
>> Hmm, does that mean that when I serialise, I get a unicode object back?
>> What about the XML declaration? How can a user create well-formed XML from
>> your output? Or is that not the intention?
> 
>   Yes, if you serialize an object you get an XML string as
> a unicode object, since unicode objects support UTF-8 and
> some other encodings.

That's not what I meant. I was wondering why you chose to use a unicode
string instead of a byte string (which XML is defined for). If your only
intention is to deserialise the unicode string into a tree, that may be
acceptable. However, as soon as you start writing the data to a file or
through a network pipe, or pass it to an XML parser, you'd better make it
well-formed XML. So you either need to encode it as UTF-8 (for which you do
not need a declaration), or you will need to encode it in a different byte
encoding, and then prepend a declaration yourself. In any case, this is a
lot more overhead (and more cumbersome for users) than writing out a
correctly serialised byte string directly.
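
To illustrate the two options, here is a minimal sketch of the extra step a
user would have to perform on a unicode result. It assumes Python 3, and the
helper name to_wellformed_bytes is hypothetical (it is not part of pyxser);
it also assumes the input string carries no XML declaration of its own.

```python
import xml.etree.ElementTree as ET


def to_wellformed_bytes(xml_text, encoding="utf-8"):
    """Encode a declaration-free unicode XML string into parseable bytes.

    Hypothetical helper for illustration only; not part of pyxser.
    """
    if encoding.lower() in ("utf-8", "utf8"):
        # UTF-8 is a default XML encoding, so no declaration is required.
        return xml_text.encode("utf-8")
    # Any other byte encoding must be announced in an XML declaration.
    declaration = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return (declaration + xml_text).encode(encoding)


doc = '<obj name="caf\xe9"/>'  # unicode string, as a serialiser might return it
utf8_bytes = to_wellformed_bytes(doc)
latin1_bytes = to_wellformed_bytes(doc, "iso-8859-1")

# Both byte strings can be handed to an XML parser as they are.
parsed_utf8 = ET.fromstring(utf8_bytes)
parsed_latin1 = ET.fromstring(latin1_bytes)
```

A serialiser that returned correctly encoded bytes in the first place would
make this helper unnecessary.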

You seemed to be very interested in good performance, so I don't quite
understand why you want to require an additional step with a relatively
high performance impact that only makes it harder for users to use the tool
correctly.

Stefan


