Unicode problems, yet again
John Machin
sjmachin at lexicon.net
Sat Apr 23 22:57:53 EDT 2005
On Sun, 24 Apr 2005 03:15:02 +0200, Ivan Voras
<ivoras at something.ortheother> wrote:
>I have a string fetched from database, in iso8859-2, with 8bit
>characters,
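"8bit characters"?? Maybe you did once, or you thought you did, but
what you have now is a Unicode string, and socket.write() is expecting
an ordinary (8-bit) string. As the traceback below shows, it calls
str(data), which tries to encode with the default 'ascii' codec.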
> and I'm trying to send it over the network, via a socket:
>
> File "E:\Python24\lib\socket.py", line 249, in write
> data = str(data) # XXX Should really reject non-string non-buffers
>UnicodeEncodeError: 'ascii' codec can't encode character u'\u0161' in
>position 123: ordinal not in range(128)
Like it says, you have passed it a *UNICODE* string that has u'\u0161'
(the small s with caron) at position 123.
>
>The other end knows it should expect this encoding, so how to send it?
>
If the other end wants an encoding, then you should *encode* it, like
this:
>>> us = u'\u0161'
>>> s = us.encode('iso8859_2')
>>> s
'\xb9'
>>> str(us)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0161' in position 0: ordinal not in range(128)
>>> str(s)
'\xb9'
>>> # looks like socket.write() might be happier with this.
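Putting that together, here is a minimal sketch of "encode first, then send". It is written in Python 3 syntax, where text is str and the encoded result is bytes, rather than the Python 2.4 shown in the session above. The helper name send_text is hypothetical, and socket.socketpair() stands in for your real connection so the example runs without a network.

```python
import socket


def send_text(sock, text, encoding="iso8859_2"):
    """Encode text explicitly, then send the resulting bytes.

    send_text is a hypothetical helper, not part of the socket module.
    """
    payload = text.encode(encoding)  # explicit encode, no implicit 'ascii' step
    sock.sendall(payload)
    return payload


# A connected local socket pair stands in for the real network connection.
left, right = socket.socketpair()
try:
    sent = send_text(left, "\u0161")  # small s with caron
    received = right.recv(16)
finally:
    left.close()
    right.close()

# The receiving end decodes with the encoding both sides agreed on.
decoded = received.decode("iso8859_2")
```

The point is only that the encoding is chosen by you, at the boundary, instead of being left to a default codec.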
>(Does anyone else feel that python's unicode handling is, well...
>suboptimal at least?)
Your posting gives no evidence for such a conclusion.