[Python-Dev] XML codec?

M.-A. Lemburg mal at egenix.com
Fri Nov 9 14:22:35 CET 2007


On 2007-11-09 14:10, Walter Dörwald wrote:
> Martin v. Löwis wrote:
>>>> Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc
>>>> codecs to do the encoding.  There's no need to create a magical
>>>> mystery codec to pick out which though.
>>> So the code is good, if it is inside an XML parser, and it's bad if it
>>> is inside a codec?
>> Exactly so. This functionality just *isn't* a codec - there is no
>> encoding. Instead, it is an algorithm for *detecting* an encoding.
> 
> And what do you do once you've detected the encoding? You decode the
> input, so why not combine both into an XML decoder?

FWIW: I'm +1 on adding such a codec.

It makes working with XML data a lot easier: you simply don't have to
bother with the encoding of the XML data anymore and can just let the
codec figure out the details. The XML parser can then work directly
on the Unicode data.
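
As a rough illustration of the idea (this is not the code under discussion
in this thread), detection and decoding can be combined in a few lines of
modern Python. The names detect_xml_encoding and decode_xml are hypothetical,
and the sketch assumes the input either starts with a byte order mark or is
ASCII-compatible up to the end of the XML declaration; BOM-less UTF-16 and
UTF-32 input is not handled.

```python
import codecs
import re

# Byte order marks. UTF-32 must be tested before UTF-16 because the
# UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches the encoding pseudo-attribute of a leading XML declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_xml_encoding(data, default="utf-8"):
    """Guess the encoding of XML bytes (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No BOM and no declared encoding: fall back to the XML default.
    return default


def decode_xml(data):
    """Detect the encoding, then decode the bytes to a Unicode string."""
    return data.decode(detect_xml_encoding(data))
```

With something like this, a caller passes raw bytes to decode_xml() and
hands the resulting string to the parser. The BOM-aware codec names used
above ("utf-8-sig", "utf-16", "utf-32") strip the byte order mark while
decoding, so it does not show up in the Unicode data.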

Whether it needs to be in C or not is another question (I would have
done this in Python, since performance is not really an issue), but since
the code is already written, why not use it?

-- 
Marc-Andre Lemburg
eGenix.com

