Trouble saving unicode text to file

John Machin sjmachin at lexicon.net
Sat May 7 21:30:36 EDT 2005


On Sat, 7 May 2005 17:25:28 -0500, Skip Montanaro <skip at pobox.com>
wrote:

>
>    Svennglenn> Traceback (most recent call last):
>    Svennglenn>   File "D:\Documents and
>    Svennglenn> Settings\Daniel\Desktop\Programmering\aaotest\aaotest2\aaotest2.pyw",
>    Svennglenn> line 5, in ?
>    Svennglenn>     titel = unicode(titel)
>    Svennglenn> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0:
>    Svennglenn> ordinal not in range(128)
>
>Try:
>
>    import codecs
>
>    titel = "åäö"
>    titel = unicode(titel, "iso-8859-1")
>    fil = codecs.open("testfil.txt", "w", "iso-8859-1")
>    fil.write(titel)
>    fil.close()
>

I tried that, with this result:

C:\junk>python skip.py
sys:1: DeprecationWarning: Non-ASCII character '\xe5' in file skip.py
on line 3, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details

1. An explicit PEP 263 declaration (which the OP already had!) should
be used, rather than relying on the default, which doesn't work in
general: it fails if you substitute, say, Polish or Russian for Swedish.

2. My bet is that 'cp1252' is more likely to be appropriate for the OP
than 'iso-8859-1'. The encodings are quite different in range(0x80,
0xA0); they coincidentally give the same result for the OP's limited
sample. If, for example, the OP needs the euro character, which is 0x80
in cp1252, the problem wouldn't show up in the limited scripts we've
been playing with so far, but a 0x80 in the script is sure not going to
look like a euro in Tkinter if it's being decoded via iso-8859-1. Your
rationale for using iso-8859-1 when the OP had already mentioned cp1252
was ... what?

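To illustrate point 2, here is a small sketch, again assuming Python 3,
with variable names chosen only for this illustration. It decodes the
same bytes with both codecs. Python's cp1252 codec leaves a few bytes in
range(0x80, 0xA0) undefined, so errors="replace" is used there to avoid
an exception.

```python
# Byte 0x80: the euro sign in cp1252, a C1 control character in iso-8859-1.
euro_byte = b"\x80"
as_cp1252 = euro_byte.decode("cp1252")
as_latin1 = euro_byte.decode("iso-8859-1")

# The OP's sample bytes decode identically under both codecs.
sample = b"\xe5\xe4\xf6"
sample_agrees = sample.decode("cp1252") == sample.decode("iso-8859-1")

# Collect the byte values in range(0x80, 0xA0) where the codecs disagree.
differing = []
for value in range(0x80, 0xA0):
    raw = bytes([value])
    if raw.decode("cp1252", errors="replace") != raw.decode("iso-8859-1"):
        differing.append(value)

# From 0xA0 upwards the two codecs agree on every byte.
upper_agrees = all(
    bytes([value]).decode("cp1252") == bytes([value]).decode("iso-8859-1")
    for value in range(0xA0, 0x100)
)

print(ascii(as_cp1252), ascii(as_latin1), sample_agrees, len(differing))
```

A test that only uses characters from 0xA0 upwards, like the Swedish
sample, cannot tell the two encodings apart.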




