[XML-SIG] Q: minidom and iso-8859-1

J.R. van Ossenbruggen Jacco.van.Ossenbruggen@cwi.nl
Tue, 12 Sep 2000 10:37:00 +0200


On Tue, Sep 12 2000 "J.R. van Ossenbruggen" wrote:
> On Tue, Sep 12 2000 "Martin v. Loewis" wrote:
> > > BTW, I expected that converting to UTF-8 would also print the right
> > > result, but it didn't.  What am I missing here?
> > 
> > Hard to say. Converting to UTF-8 will print the right result, which
> > result did you get?
>
> # Input file test.xml:
> #
> #  <?xml version="1.0" encoding='iso-8859-1'?>
> #  <test>Grønbæck</test>
> 
> import sys
> import xml.dom.minidom
> p=xml.dom.minidom.parse('test.xml')
> 
> print p.documentElement.childNodes[0].nodeValue.encode('UTF-8')
> # wrong; prints: Grønbæck

Oops, I think I am beginning to understand what is going on.  The
UTF-8 conversion does indeed print the right result; it was just not
the result I (an encoding newbie) expected.

I think I simply asked myself the wrong question (how was the original
XML encoded?) when I should have asked which encoding I want the
output to be in.
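
To make the distinction concrete, here is a minimal sketch.  It assumes
a current Python 3 interpreter rather than the Python used in the
snippet above, and it parses the document from an in-memory byte string
instead of test.xml; the variable names are only illustrative.  The
parser hands back the same Unicode text whatever the input encoding
was; only the encoding chosen for output decides which bytes appear.

```python
import xml.dom.minidom

# Same content as test.xml, built in memory as ISO-8859-1 bytes.
# "\xf8" is the letter o-slash and "\xe6" is the ae ligature.
xml_bytes = (
    "<?xml version=\"1.0\" encoding='iso-8859-1'?>\n"
    "<test>Gr\xf8nb\xe6ck</test>"
).encode("iso-8859-1")

doc = xml.dom.minidom.parseString(xml_bytes)

# The parser has already decoded the input; this is a Unicode string.
text = doc.documentElement.childNodes[0].nodeValue

# The output encoding is an independent choice.
utf8_bytes = text.encode("utf-8")          # two bytes per accented letter
latin1_bytes = text.encode("iso-8859-1")   # one byte per accented letter

print(utf8_bytes)
print(latin1_bytes)

# Interpreting the UTF-8 bytes as ISO-8859-1 yields two characters for
# each accented letter, which is what a Latin-1 terminal would display.
misread = utf8_bytes.decode("iso-8859-1")
print(len(text), len(misread))
```

On a terminal that expects ISO-8859-1, writing the latin1_bytes form
shows the name as typed, while the UTF-8 form is correct but looks
garbled there.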

Jacco