Python 3 how to convert a list of bytes objects to a list of strings?

Chris Green cl at isbd.net
Sat Aug 29 11:50:09 EDT 2020


Chris Green <cl at isbd.net> wrote:
> Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> > On Fri, 28 Aug 2020 12:26:07 +0100, Chris Green <cl at isbd.net> declaimed the
> > following:
> > 
> > 
> > 
> > >Maybe I shouldn't but Python 2 has been managing to do so for several
> > >years without any issues.  I know I *could* put the exceptions in a
> > >bucket somewhere and deal with them separately but I'd really rather
> > >not.
> > >
> > 
> >         In Python2 "string" IS BYTE-STRING. It is never UNICODE, and ignores
> > any encoding.
> > 
> >         So, for Python3, the SAME processing requires NOT USING "string" (which
> > is now Unicode) and ensuring that all literals are b"stuff", and using the
> > methods of the bytes data type.
> > 
> Now I'm beginning to realise that *this* may well be what I need to
> do, after going round in several convoluted circles! :-)
> 
However, the problem appears to be that the Python 3 mailbox class
internally assumes that it's being given 'ascii'.  Here's the error
(and I'm doing no processing of the message at all):-

    Traceback (most recent call last):
      File "/home/chris/.mutt/bin/filter.py", line 102, in <module>
        mailLib.deliverMboxMsg(dest, msg, log)
      File "/home/chris/.mutt/bin/mailLib.py", line 52, in deliverMboxMsg
        mbx.add(msg)
      File "/usr/lib/python3.8/mailbox.py", line 603, in add
        self._toc[self._next_key] = self._append_message(message)
      File "/usr/lib/python3.8/mailbox.py", line 758, in _append_message
        offsets = self._install_message(message)
      File "/usr/lib/python3.8/mailbox.py", line 830, in _install_message
        self._dump_message(message, self._file, self._mangle_from_)
      File "/usr/lib/python3.8/mailbox.py", line 215, in _dump_message
        gen.flatten(message)
      File "/usr/lib/python3.8/email/generator.py", line 116, in flatten
        self._write(msg)
      File "/usr/lib/python3.8/email/generator.py", line 181, in _write
        self._dispatch(msg)
      File "/usr/lib/python3.8/email/generator.py", line 214, in _dispatch
        meth(msg)
      File "/usr/lib/python3.8/email/generator.py", line 432, in _handle_text
        super(BytesGenerator,self)._handle_text(msg)
      File "/usr/lib/python3.8/email/generator.py", line 249, in _handle_text
        self._write_lines(payload)
      File "/usr/lib/python3.8/email/generator.py", line 155, in _write_lines
        self.write(line)
      File "/usr/lib/python3.8/email/generator.py", line 406, in write
        self._fp.write(s.encode('ascii', 'surrogateescape'))
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

Any message containing anything other than ASCII is going to have
bytes above 127 unless it's encoded in some way to make it 7-bit, and
that's not going to happen in the general case.
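
For illustration, here is a minimal sketch of one possible way around
this.  It assumes the message reaches the script as raw bytes (for
example from sys.stdin.buffer) and is parsed with
email.message_from_bytes() rather than from a str.  Parsed that way,
the 8-bit bytes are carried as surrogate escapes, which the
s.encode('ascii', 'surrogateescape') call in the traceback can turn
back into the original bytes.  The deliver() function, the RAW sample
message and the addresses are made up for the example; they are not
the code in mailLib.py.

```python
import email
import mailbox
import os
import tempfile

# Hypothetical sample: an undeclared 8-bit body (UTF-8 bytes for "café").
RAW = (
    b"From: sender@example.com\n"
    b"To: chris@example.com\n"
    b"Subject: test\n"
    b"\n"
    b"caf\xc3\xa9\n"
)


def deliver(mbox_path, raw_bytes):
    """Append a raw message to an mbox, keeping its bytes unchanged."""
    # Parsing from bytes stores non-ASCII bytes as surrogate escapes,
    # so the generator can write them back out when flattening.
    msg = email.message_from_bytes(raw_bytes)
    mbx = mailbox.mbox(mbox_path)
    mbx.lock()
    try:
        mbx.add(msg)
    finally:
        # close() flushes pending changes and releases the lock.
        mbx.close()


# Small demonstration against a throwaway mbox file.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.mbox")
    deliver(demo_path, RAW)
    with open(demo_path, "rb") as f:
        stored = f.read()
    print(b"caf\xc3\xa9" in stored)
```

In a real filter the bytes would presumably come from
sys.stdin.buffer.read() instead of RAW.  mailbox.mbox.add() is also
documented to accept a bytes object directly, which may be simpler if
the message needs no processing at all.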

-- 
Chris Green