[XML-SIG] unicode

Lars Marius Garshol larsga@garshol.priv.no
10 Aug 2001 10:05:41 +0200


* Mark McEahern
|
| xml.dom.minidom.parseString() doesn't accept unicode input--is this
| a bug?

* Martin v. Loewis
| 
| I'd say it is a well-known limitation. In addition, it is questionable
| what the parser should do if you have, say
| 
| u"<?xml version='1.0' encoding='koi8-r'?><foo/>"

Not really. I think the parser should give a warning and just continue
in this case, since it could quite reasonably happen if your own code
handles the conversion to Unicode. This case is really no different
from passing a SAX InputSource with a character stream. (BTW, does
that work? It should.)
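
To make the comparison concrete, here is a minimal sketch of handing a
SAX parser an InputSource with a character stream. It assumes the
xml.sax package in a current Python 3 standard library, not the
parsers available when this was posted; NameCollector is a
hypothetical handler written only for this illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class NameCollector(ContentHandler):
    """Hypothetical handler: records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# The document is already decoded text, so it is supplied as a
# character stream rather than a byte stream.
source = InputSource()
source.setCharacterStream(io.StringIO("<foo><bar/></foo>"))

handler = NameCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.parse(source)

print(handler.names)
```

Since the parser never sees bytes here, there is nothing for an
encoding declaration to describe, which is the same situation as the
unicode string above.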
 
| If you need to parse Unicode strings, I'd recommend encoding them
| first.

I consider that a workaround. I agree that it's the best one, but we
should aim to fix this.
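
For reference, the encode-first workaround might look like the sketch
below. It is written against a current Python 3 standard library;
parse_text is a hypothetical helper name, and UTF-8 is an assumed
choice of encoding. The sample document deliberately has no encoding
declaration, so the parser falls back to UTF-8 and nothing conflicts
with the bytes it is given.

```python
from xml.dom import minidom


def parse_text(text, encoding="utf-8"):
    """Hypothetical helper: encode decoded text, then parse the bytes.

    Assumes the document either has no encoding declaration or
    declares the same encoding that is passed here.
    """
    data = text.encode(encoding)
    return minidom.parseString(data)


doc = parse_text("<foo>caf\u00e9</foo>")
root = doc.documentElement
print(root.tagName, root.firstChild.data)
```

If the text carries a declaration naming some other encoding, the
encoded bytes and the declaration disagree, and the sketch above does
not try to handle that.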

--Lars M.