XML can't read Unicode shock horror. News at 11.
Dale Strickland-Clark
dale at riverhall.NOTHANKS.co.uk
Thu Nov 1 05:51:49 EST 2001
Paul Prescod <paulp at ActiveState.com> wrote:
>Dale Strickland-Clark wrote:
>>
>> ...
>>
>> Is there any chance that this might be elevated?
>>
>> Non-unicode XML is a bit restrictive. :-(
>
>I think Martin was trying to make the point that this works okay:
>
>dom = xml.dom.minidom.parseString(u'<node/>'.encode("utf-8"))
>
>I agree with you that minidom should probably do this automatically.
>
> Paul Prescod
That's not much good if my XML document happens to start with:
<?xml version="1.0" encoding="UTF-16"?>
To quote from the O'Reilly book "XML in a Nutshell", p. 71: "An XML
parser is required to handle the UTF-16 and UTF-8 encodings of
Unicode." I expect something similar is stated in the XML DOM spec, if
I had time to look for it.
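
For illustration, here is a minimal sketch of the case in question: a document whose declaration says UTF-16 and whose bytes really are UTF-16. It assumes a current Python 3, where minidom hands the bytes to the expat parser and lets it detect the encoding; it says nothing about how the 2001 release discussed in this thread behaves. The sample document and variable names are made up for the example.

```python
import xml.dom.minidom

# Hypothetical sample document; the declaration matches the real byte encoding.
text = '<?xml version="1.0" encoding="UTF-16"?><node>caf\u00e9</node>'

# str.encode("utf-16") prepends a byte order mark, so the parser can
# tell which UTF-16 byte order was used.
data = text.encode("utf-16")

# Pass bytes, not a decoded string, so the parser sees the actual encoding.
dom = xml.dom.minidom.parseString(data)
root = dom.documentElement

print(root.tagName)
print(root.firstChild.data == "caf\u00e9")
```

The point of passing bytes is that the encoding declaration and the data stay in agreement, which is not the case if the same text is re-encoded as UTF-8 while the declaration still says UTF-16.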
--
Dale Strickland-Clark
Riverhall Systems Ltd