[XML-SIG] Changes in pyexpat.c
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Wed, 27 Sep 2000 23:16:46 +0200
> Probably because the checked-in 4DOM is out of date. We've
> hesitated checking in the 4Suite 0.9.x version because of all the
> flux and not wanting to contribute to the confusion (and not being
> sure whether we had much bandwidth to help sort out any resulting
> confusion).
>
> However, it's time to do the right thing, so...
Yes, I was going to ask whether PyXML could get a new copy of 4DOM...
> Do we check in the latest 4DOM and back-port the output encoding stuff to PyXML
> (it's all in ext/Printer.py)?
Sounds like a good plan to me.
> I haven't had a chance to play with Python 2.0, so I'm not sure how
> hard the port would be. Here is the representative snippet from
> ext/Printer.py
It should not be too difficult to have this working on all Python
versions.
> from xml.unicode.iso8859 import wstring
> wstring.install_alias('ISO-8859-1', 'ISO_8859-1:1987')
try:
    import codecs  # will fail on 1.5
    def utf8_to_code(string, encoding):
        encoder = codecs.lookup(encoding)[0]  # encode,decode,reader,writer
        return encoder(unicode(string, "utf-8"))[0]  # result,size
except ImportError:
    def utf8_to_code(string, encoding):
        # raise exception?
        # support some trivial cases, e.g. latin1?
        # try wstrop?
        return string  # silently return utf-8...
> #Note: Pass through to wstrop. This means we don't play nice and
> #Escape characters that are not in the target encoding.
> ws = wstring.from_utf8(new_string)
> new_string = ws.encode(encoding)
> #This version would skip all untranslatable chars: see wstrop.c
> #new_string = ws.encode(encoding, 1)
new_string = utf8_to_code(new_string,encoding)
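
For readers on a current interpreter, here is a rough sketch of the same helper for Python 3, where unicode() no longer exists and the input would be a bytes object. The errors parameter is an assumption added for illustration, not part of the snippet above; it stands in for the "skip all untranslatable chars" variant mentioned in the quoted comments.

```python
import codecs

def utf8_to_code(data, encoding, errors="strict"):
    """Re-encode UTF-8 bytes into the target encoding (illustrative sketch)."""
    # Decode the UTF-8 input into a str first.
    text = data.decode("utf-8")
    # codecs.lookup returns a CodecInfo holding encode, decode, reader and writer.
    encode = codecs.lookup(encoding).encode
    # The encoder returns a (result, size) pair; only the result is needed.
    encoded, _consumed = encode(text, errors)
    return encoded

# Hypothetical usage: convert a UTF-8 byte string to Latin-1.
latin1_bytes = utf8_to_code("caf\u00e9".encode("utf-8"), "iso-8859-1")
```

With errors="strict" an untranslatable character raises UnicodeEncodeError; passing "ignore" drops it, and "xmlcharrefreplace" writes a numeric character reference instead, which may suit XML output better.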
Regards,
Martin