unicode question...

Alex Martelli aleaxit at yahoo.com
Tue Nov 7 15:46:28 EST 2000


"Bjorn Pettersen" <bjorn at roguewave.com> wrote in message
news:mailman.973625063.1676.python-list at python.org...
> I'm on Win2k and would like to print a couple of unicode characters to a
> file so I can open them in another application to view them. My current
> attempts have not been successful...
    [snip]
> >>> s
> u'\u0141\u0142'
> >>> fp = open('foo.txt','w')
> >>> fp.write(s)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII encoding error: ordinal not in range(128)

The default encoding is 'ascii', which cannot represent
these characters.  You probably want to encode explicitly,
e.g. as 'utf-16':

>>> s=u'\u0141\u0142'

>>> s
u'\u0141\u0142'
>>> print s
Traceback (innermost last):
  File "<pyshell#2>", line 1, in ?
    print s
UnicodeError: ASCII encoding error: ordinal not in range(128)
>>> print s.encode('utf-16')
ÿþAB
>>> s.encode('utf-16')
'\377\376A\001B\001'
>>>
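
Applied to the file case from the question, a minimal sketch could
look like the one below.  It is written for a current Python
interpreter, so some calls (the decode method, for instance) may
differ on the release discussed in this thread.  The temporary
directory is only a stand-in for wherever foo.txt should live.

```python
import os
import tempfile

s = u'\u0141\u0142'

# Placeholder location (an assumption); the question used 'foo.txt'.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

# Encode explicitly, then write the resulting bytes in binary mode so
# that Windows newline translation cannot alter the encoded data.
data = s.encode('utf-16')
fp = open(path, 'wb')
fp.write(data)
fp.close()

# Reading the bytes back and decoding them should give the same text.
fp = open(path, 'rb')
roundtrip = fp.read().decode('utf-16')
fp.close()
```

Binary mode matters here because a utf-16 byte that happens to equal
a newline would otherwise be expanded in text mode on Windows.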

Note that the first 2 bytes of a utf-16 encoding are the
byte-order mark (here '\377\376', i.e. little endian).  It
belongs only at the _start_ of a utf-16 encoded Unicode text
file, so if you write several strings to a file one after the
other, you may need to apply a [2:] slice to every encoded
string except the first (there may be less-kludgy ways, of
course...).
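
One possibly less-kludgy way, sketched here for a current Python
interpreter rather than taken from the thread: write the byte-order
mark once by hand, then encode every string with the explicit-endian
codec, which adds no mark of its own.  The list of strings is made up
for illustration, and the [2:] variant is built alongside only to
show that both approaches produce the same bytes.

```python
import codecs
import sys

# Hypothetical sample data: several strings destined for one file.
strings = [u'\u0141\u0142', u'\u0141', u'\u0142']

# Pick the explicit-endian codec that matches this machine; these
# codecs never emit a byte-order mark on their own.
if sys.byteorder == 'little':
    bom, codec = codecs.BOM_UTF16_LE, 'utf-16-le'
else:
    bom, codec = codecs.BOM_UTF16_BE, 'utf-16-be'

# One mark at the very start, then every string without a mark.
payload = bom + b''.join([s.encode(codec) for s in strings])

# The same bytes via the slicing approach, for comparison: keep the
# mark on the first string and strip it from all the others.
sliced = strings[0].encode('utf-16') + b''.join(
    [s.encode('utf-16')[2:] for s in strings[1:]])
```

Writing payload to a file opened in 'wb' mode then yields a file with
exactly one mark at the front.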


Alex





