Python 3 how to convert a list of bytes objects to a list of strings?

Cameron Simpson cs at cskk.id.au
Fri Aug 28 05:09:22 EDT 2020


On 28Aug2020 08:56, Chris Green <cl at isbd.net> wrote:
>Stefan Ram <ram at zedat.fu-berlin.de> wrote:
>> Chris Angelico <rosuav at gmail.com> writes:
>> >But this is a really good job for a list comprehension:
>> >sss = [str(word) for word in bbb]
>>
>>   Are you all sure that "str" is really what you all want?
>>
>Not absolutely, you no doubt have been following other threads related
>to this one.  :-)

It is almost certainly not what you want. You want some flavour of
bytes.decode. If the BytesParser doesn't cope, you may need to parse the
headers as some kind of text (e.g. ISO8859-1, which maps every byte
value to a character, so decoding cannot fail) until you find a
content-transfer-encoding header (which still applies only to the body,
not the headers).
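
As a rough sketch of the "some flavour of bytes.decode" idea, assuming
bbb is your list of bytes objects: try UTF-8 first and fall back to
ISO8859-1. The sample data and the name decode_line are made up for
illustration, and the order of encodings is a guess, not a rule.

```python
# Hypothetical sample data: one valid UTF-8 line, one that is not.
bbb = [b"Subject: caf\xc3\xa9\r\n", b"X-Odd: \xff\xfe\r\n"]

def decode_line(bs, encodings=("utf-8",)):
    """Decode bs with the first encoding that works, else ISO8859-1."""
    for enc in encodings:
        try:
            return bs.decode(enc)
        except UnicodeDecodeError:
            pass
    # ISO8859-1 maps each byte 0..255 to one code point, so this
    # cannot raise, and encoding it back recovers the original bytes.
    return bs.decode("iso-8859-1")

sss = [decode_line(bs) for bs in bbb]
```

The ISO8859-1 fallback may show the wrong characters, but nothing is
lost: sss[1].encode("iso-8859-1") gives the original bytes back.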

>> |>>> b = b"b"
>> |>>> str( b )
>> |"b'b'"
>>
>>   Maybe try to /decode/ the bytes?
>>
>> |>>> b.decode( "ASCII" )
>> |'b'
>>
>>
>Therein lies the problem, the incoming byte stream *isn't* ASCII, it's
>an E-Mail message which may, for example, have UTF-8 or other encoded
>characters in it.  Hopefully it will have an encoding given in the
>header but that's only if the sender is 'well behaved', one needs to
>be able to handle almost anything and it must be done without 'manual'
>interaction.

POP3 is presumably handing you bytes containing a message. If Python's
email.parser.BytesParser doesn't handle it, stash the raw bytes
_elsewhere_, in a distinct file in some directory.

    with open('evil_msg_bytes', 'wb') as f:
        for bs in bbb:
            f.write(bs)

No interpretation is required, since parsing failed. Then you can start
dealing with these exceptions. _Do not_ write unparsable messages into
an mbox!
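
Putting the parse attempt and the stash together might look something
like the sketch below. The name parse_or_stash, the "unparsable"
directory and the hash-based file name are all made up for
illustration. Note the assumption about what counts as "failed": by
default BytesParser records problems in msg.defects rather than
raising, so this sketch asks for raise_on_defect=True, which may be
stricter than you want for real-world mail.

```python
import email.policy
import hashlib
from email.parser import BytesParser
from pathlib import Path

# Assumption: any parser defect counts as a parse failure.
STRICT = email.policy.default.clone(raise_on_defect=True)

def parse_or_stash(bbb, stash_dir="unparsable"):
    """Return a parsed message, or stash the raw bytes and return None."""
    raw = b"".join(bbb)
    try:
        return BytesParser(policy=STRICT).parsebytes(raw)
    except Exception:
        # Deliberately broad: whatever went wrong, keep the bytes untouched.
        d = Path(stash_dir)
        d.mkdir(parents=True, exist_ok=True)
        # Name the file after a hash of its content so distinct
        # messages never overwrite each other.
        (d / (hashlib.sha256(raw).hexdigest() + ".eml")).write_bytes(raw)
        return None
```

A None result means the message went to the stash directory and should
be kept out of the mbox.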

Cheers,
Cameron Simpson <cs at cskk.id.au>

