Character set woes with binary data

Terry Reedy tjreedy at udel.edu
Sun Apr 1 14:36:45 EDT 2007


"Michael B. Trausch" <fd0man at gmail.com> wrote in message 
news:1175415685.21349.79.camel at pepper.trausch.us...
| The protocol calls for binary data to be transmitted, and I cannot seem
| to be able to do it, because I get this error:

| UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0:
| ordinal not in range(128)

| When putting the MIME segments (listed line-by-line in a Python list)
| together to transmit them.

Python byte strings currently serve double duty: text and binary blobs.
Best not to mix the two uses.  Since most string usage is for text, most 
string methods are text methods and are not appropriate for binary data. 
As you discovered.
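
A minimal sketch of where that message comes from, on the assumption (not confirmed by your post) that a text string ended up among the segments, so that the join tried to decode the binary segment as ASCII. It is written for Python 3, where the decode has to be spelled out explicitly; try_ascii_decode is a hypothetical helper name.

```python
def try_ascii_decode(data):
    """Decode bytes as ASCII text; return the error message if that fails."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)

# A binary blob beginning with 0xff is not valid ASCII text.
message = try_ascii_decode(b"\xff\xd8")
print(message)
```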

In the present case, do you really need to join the mix of text and binary 
data *before* sending it?  Just send the pre-text, the binary data, and 
then the post-text and they will be joined in the transmission stream.  The 
receiving site should not know the difference.
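
A rough sketch of sending the pieces one after another, assuming Python 3 and a connected stream socket. The names send_parts and recv_all are hypothetical, and the local socket pair only stands in for your real connection to the server.

```python
import socket

def send_parts(sock, parts):
    """Send each byte string on its own; the stream joins them in order."""
    for part in parts:
        sock.sendall(part)

def recv_all(sock):
    """Read from the socket until the sender closes its end."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Demonstration on a local socket pair instead of a real server.
sender, receiver = socket.socketpair()
send_parts(sender, [b"pre-text\r\n", b"\xff\xd8\x00", b"\r\npost-text\r\n"])
sender.close()
received = recv_all(receiver)
receiver.close()
```

The receiving end sees one continuous byte stream, with nothing to show that three separate calls produced it.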

| It seems that Python thinks it knows better than I do, though.

Python is doing what you told it to do.  See below.

|   I want to send this binary data straightaway to the server.  :-)

Then do just that, as I suggested above.  You are *not* sending it 
'straightaway'.  If you did, you would have no problem.  Instead, you are 
doing a preliminary mixing, which I suspect is not needed.

| Is there any way to tell Python to ignore the situation and treat the
| entire thing as simply a stream of bytes?

Don't tell Python to treat the byte streams as interpreted text by using a 
text method.  If you really must join before sending, write your own binary 
join function using either '+' or slices into a preallocated array (from 
the array module).
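
A sketch of both variants, assuming Python 3 bytes objects. The function names are hypothetical, and neither function calls a text method on the data.

```python
from array import array

def join_with_plus(parts):
    """Concatenate byte strings with '+', one piece at a time."""
    result = b""
    for part in parts:
        result = result + part
    return result

def join_preallocated(parts):
    """Copy each piece into a slice of an array allocated up front."""
    total = sum(len(part) for part in parts)
    buf = array("B", bytes(total))  # 'total' zero bytes, overwritten below
    pos = 0
    for part in parts:
        buf[pos:pos + len(part)] = array("B", part)
        pos += len(part)
    return buf.tobytes()

segments = [b"pre-text\r\n", b"\xff\xd8\x00", b"\r\npost-text\r\n"]
joined_plus = join_with_plus(segments)
joined_array = join_preallocated(segments)
```

The '+' version copies the accumulated result on every step, so the preallocated version is the one to prefer when there are many segments or large ones.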

Terry Jan Reedy





