[XML-SIG] Character encodings and expat

Lars Marius Garshol larsga@garshol.priv.no
27 Oct 2000 11:05:46 +0200


* Martin v. Loewis
|
| Once xmlproc is capable of producing Unicode, it will certainly
| understand all encodings that the Python 2.0 encoding machinery knows
| of; that includes "latin1".

Yup.  I plan to teach xmlproc the names and aliases in the IANA
character set registry, so this should not be a problem with xmlproc.
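
As a rough illustration of mapping a declared charset label onto a
codec, here is a minimal sketch that leans on the alias table Python's
codecs module already carries.  resolve_charset is a hypothetical
helper, not part of xmlproc, and Python's alias table is only an
approximation of the IANA registry.

```python
import codecs


def resolve_charset(label):
    """Return Python's canonical codec name for a charset label, or None.

    Hypothetical helper for illustration only; not part of xmlproc.
    """
    try:
        # codecs.lookup() normalises case and punctuation and follows
        # the aliases Python knows about (e.g. "latin1", "ISO-8859-1").
        return codecs.lookup(label.strip()).name
    except LookupError:
        # Label is not known to this Python installation.
        return None
```

Two labels name the same encoding when resolve_charset returns the
same value for both; a None result means the parser would have to
report an unsupported encoding.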

However, it is a problem that Python does not yet support any of the
Far Eastern encodings.  Does anyone know of any plans to change that?
 
| We should also strive for teaching expat to use the Python encoding
| machinery, but that may be more difficult. Any volunteers?

I don't think it's really all that difficult.  It should be possible
to use the Python codec system to transcode the input to utf-16, feed
the result to expat, and pin the encoding to "utf-16" in the call to
ParserCreate.

The only possible stumbling block is when expat discovers an XML
declaration that says something other than "utf-16"...
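
A minimal sketch of the transcoding idea, written against the pyexpat
bindings in current Python rather than the 2.0-era ones.
parse_text_via_utf16 is a hypothetical helper, and it assumes the
caller already knows the source encoding.  The pyexpat documentation
states that an encoding passed to ParserCreate overrides the implicit
or explicit encoding of the document, so in this sketch the
declaration in the input is expected to be ignored; it does not settle
how the expat of the time behaved.

```python
from xml.parsers import expat


def parse_text_via_utf16(raw, source_encoding):
    """Transcode raw bytes to UTF-16 and return the character data
    that expat reports.  Hypothetical helper for illustration only."""
    # Step 1: let the Python codec machinery decode the input.
    text = raw.decode(source_encoding)
    # Step 2: re-encode as UTF-16 (Python emits a BOM in native order).
    utf16 = text.encode("utf-16")
    # Step 3: tell expat up front that the input is UTF-16.
    parser = expat.ParserCreate("utf-16")
    chunks = []
    parser.CharacterDataHandler = chunks.append
    parser.Parse(utf16, True)  # True marks this as the final chunk
    return "".join(chunks)
```

For example, a latin1 document can be fed through it like this:

```python
doc = '<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\u00e9</doc>'
text = parse_text_via_utf16(doc.encode("iso-8859-1"), "iso-8859-1")
```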

--Lars M.