Aw: Re: Python 3 how to convert a list of bytes objects to a list of strings?

Chris Angelico rosuav at gmail.com
Fri Aug 28 08:34:48 EDT 2020


On Fri, Aug 28, 2020 at 10:32 PM Richard Damon <Richard at damon-family.org> wrote:
>
> This might be one of the cases where Python 2's lax handling of string
> vs bytes was an advantage.
>
> If he was just scanning the message for specific ASCII strings, then not
> getting the full message decoded right is unlikely to have been causing
> problems.
>
> Python2 handled that sort of case quite easily. Python 3 on the other
> hand, will have issues converting the byte message to a string, since
> there isn't a single encoding that you could use for all of it all the
> time. This being 'fussier' does make sure that the program is handling
> all the text 'properly', and would be helpful if some of the patterns
> being checked for contained 'extended' (non-ASCII) characters.
>
> One possible solution in Python3 is to decode the byte string using an
> encoding that allows all 256 byte values, so it won't raise any encoding
> errors, just give you possibly nonsense characters for non-ASCII text.

Why? If you want to work with bytes, work with bytes. There's no
reason to decode in a meaningless way. Python 3 can handle the job of
searching a bytestring for ASCII text just fine.
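
For instance, here is a minimal sketch of searching raw bytes without
decoding them. The sample message and the variable names are made up
for illustration; the point is only that the `in` operator,
`bytes.find` and the `re` module all accept bytes as long as the
pattern is bytes too.

```python
import re

# Hypothetical raw message; the lone 0xE9 byte is not valid UTF-8.
raw = b"Subject: Invoice\r\nX-Note: caf\xe9\r\n\r\nPlease see attached."

# Substring tests work directly on bytes when the needle is bytes.
has_subject = b"Subject:" in raw

# find() returns the byte offset of the first match, or -1 if absent.
offset = raw.find(b"Invoice")

# Regular expressions work on bytes when the pattern is a bytes literal.
match = re.search(rb"^Subject:\s*(.+?)\r?$", raw, re.MULTILINE)
subject = match.group(1) if match else None
```

The non-ASCII byte in the X-Note line never gets in the way, because
nothing here tries to interpret it as text.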

Also, if you're parsing an email message, you can and should be doing
so with respect to the encoding(s) stipulated in the headers, after
which you will have valid Unicode text.
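
As a rough sketch of what that can look like with the standard
library's email package: the sample message below is invented, and a
real one may be multipart or declare a different charset, so treat
this as an illustration rather than a complete parser.

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical single-part message declaring its charset in the headers.
raw = (
    b"From: someone@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9 au lait\r\n"
)

# policy=default selects the modern EmailMessage API.
msg = message_from_bytes(raw, policy=default)

# get_body() picks the text/plain part (here, the message itself);
# get_content() decodes it using the charset from Content-Type.
body = msg.get_body(preferencelist=("plain",))
text = body.get_content() if body is not None else ""
```

After this, `text` is a str decoded according to the declared charset,
and ordinary str searching applies.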

Please don't spread misinformation like this.

ChrisA