SAX-Parser entity
Jason Orendorff
jason at jorendorff.com
Sat Mar 2 02:11:50 EST 2002
fabi kreutz wrote:
> where Character 19 in Row 29 is the occurrence of an ü.
Yep, if you are going to use non-ascii characters you have to
specify the encoding in the XML document itself, so the XML parser
knows what's going on.
> After browsing the FAQs I changed the default encoding in site.py to
> iso-8859-1, which had some nice effect, but not on minidom.
What you need is:
<?xml version="1.0" encoding="ISO-8859-1"?>
It must be the very first thing in the XML document; i.e. the two
characters "<?" must be the first two bytes.
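For what it's worth, here is a minimal sketch of the difference the declaration makes. It assumes a current Python 3 and the standard xml.dom.minidom; the element name and the sample text "Müller" are made up for illustration and are not from the original document.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample document, stored as ISO-8859-1 bytes.
# The declaration is the very first thing in the data.
declared = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<name>M\u00fcller</name>'
).encode("iso-8859-1")

doc = minidom.parseString(declared)
# Join all text children in case the parser splits the text into chunks.
text = "".join(node.data for node in doc.documentElement.childNodes)

# The same bytes without the declaration: the parser falls back to
# UTF-8, and the lone byte 0xFC is not valid UTF-8.
undeclared = '<name>M\u00fcller</name>'.encode("iso-8859-1")
try:
    minidom.parseString(undeclared)
    rejected = False
except ExpatError:
    rejected = True
```

With the declaration in place the text comes back as a proper Unicode string; without it the parse fails at the position of the ü.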
(Alternatively, ü can also be written in XML as &#252; .
See http://www.unicode.org/charts/PDF/U0080.pdf for more
codepoints and http://www.unicode.org/charts/ for oodles
more still.)
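A small sketch of that alternative, again assuming Python 3 and xml.dom.minidom, with a made-up element name and sample text. The numeric character reference for ü is &#252; (decimal) or &#xFC; (hex), so the document itself stays pure ASCII and needs no encoding declaration.

```python
from xml.dom import minidom

# Hypothetical sample documents: pure ASCII, with ü spelled as a
# numeric character reference (decimal and hexadecimal forms).
decimal_doc = b"<name>M&#252;ller</name>"
hex_doc = b"<name>M&#xFC;ller</name>"

def element_text(data):
    """Parse the bytes and return the root element's text content."""
    doc = minidom.parseString(data)
    return "".join(node.data for node in doc.documentElement.childNodes)

decimal_text = element_text(decimal_doc)
hex_text = element_text(hex_doc)

# The number in the reference is simply the Unicode code point.
code_point = ord("\u00fc")
```

Both forms should parse to the same string, and the code point is the number you look up in the Unicode charts.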
*** In general, your life as a developer will be much easier
once you grok Unicode.
(I don't know why people can't just read the XML standard and
figure this out for themselves. I mean, come on, guys, it's only
40 pages of incredibly dense gibberish. <wink>)
## Jason Orendorff http://www.jorendorff.com/