Unicode problems, yet again

Kent Johnson kent37 at tds.net
Sat Apr 23 22:24:52 EDT 2005


Ivan Voras wrote:
> I have a string fetched from database, in iso8859-2, with 8bit 
> characters, and I'm trying to send it over the network, via a socket:
> 
>   File "E:\Python24\lib\socket.py", line 249, in write
>     data = str(data) # XXX Should really reject non-string non-buffers
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u0161' in 
> position 123: ordinal not in range(128)
> 
> The other end knows it should expect this encoding, so how to send it?

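I think the string from the database may be a unicode string rather than an 8-bit string. The 
traceback shows socket.py calling str(data), which uses the default 'ascii' codec on a unicode 
string and fails on u'\u0161'. What happens if you write data.encode('iso8859-2')?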
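
Here is a minimal sketch of that suggestion, assuming the database really does return a unicode 
string. The helper name send_text and the socketpair() stand-in for the real connection are only 
for illustration, not anything from the original program, and the sketch is written for a current 
Python rather than 2.4.

```python
import socket

def send_text(sock, text, encoding='iso8859-2'):
    """Hypothetical helper: encode text explicitly, then send the bytes."""
    # Encoding here means the socket layer never falls back to 'ascii'.
    payload = text.encode(encoding)
    sock.sendall(payload)
    return payload

# U+0161 is the character named in the traceback.
text = u'\u0161'

# A connected pair of local sockets stands in for the real connection.
sender, receiver = socket.socketpair()
try:
    sent = send_text(sender, text)
    received = receiver.recv(1024)
finally:
    sender.close()
    receiver.close()

# The receiving end decodes with the same codec to get the text back.
assert received.decode('iso8859-2') == text
```

The point is only that the encoding step happens before the data reaches the socket, so the 
implicit ascii conversion never runs.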

> 
> (Does anyone else feel that python's unicode handling is, well... 
> suboptimal at least?)

It can be confusing and surprising, yes. Suboptimal...well, I wouldn't want to say that I could 
do better...

Kent


