[issue39384] Email parser creates a message object that can't be flattened as bytes.
Mark Sapiro
report at bugs.python.org
Wed Feb 5 23:30:03 EST 2020
Mark Sapiro <mark at msapiro.net> added the comment:
I've researched this further, and I now know how this happens. The original message contains a text/html part (in my case, the only part) whose body is base64 or quoted-printable encoded and contains non-ASCII characters once decoded. It is parsed correctly by email.message_from_bytes.
It is then processed by Mailman's content filtering, which retrieves the HTML payload via
part.get_payload(decode=True).decode(ctype, errors='replace')
where part is the text/html part and ctype is 'utf-8' in this case. It then uses elinks, lynx or some other configured command to convert the HTML payload to plain text, and that plain text still contains non-ASCII characters.
It then replaces the payload and sets the content type via
del part['content-transfer-encoding']
part.set_payload(plain_text)
part.set_type('text/plain')
And this results in a message which can't be flattened by as_bytes().
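The sequence above can be reproduced with a small script. This is only a sketch: the sample message, the addresses, and the string replacement standing in for elinks or lynx are made up for illustration and are not Mailman's actual code.

```python
import base64
import email

# Hypothetical sample message: a single text/html part whose base64 body
# decodes to text containing a non-ASCII character.
html_source = "<p>caf\u00e9</p>"
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: demo\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.b64encode(html_source.encode("utf-8"))
    + b"\r\n"
)

# Parsing succeeds, as described in the comment.
part = email.message_from_bytes(raw)
html = part.get_payload(decode=True).decode("utf-8", errors="replace")

# Stand-in for the external HTML-to-text converter (assumption).
plain_text = html.replace("<p>", "").replace("</p>", "")

# The three calls quoted above, with no charset given to set_payload().
del part["content-transfer-encoding"]
part.set_payload(plain_text)
part.set_type("text/plain")

# Flattening fails because the payload is a str with non-ASCII characters
# and nothing tells the generator how to encode it.
try:
    part.as_bytes()
    flatten_error = None
except UnicodeEncodeError as exc:
    flatten_error = exc

print(type(flatten_error).__name__)
```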
The underlying issue is that set_payload() has to encode the payload appropriately, and in fact it does when it is given a suitable charset. The error is therefore ours, for not providing a charset= argument to set_payload().
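A minimal sketch of the corrected call sequence follows. It assumes the plain text should be stored as UTF-8; the sample message and the converter output are again invented for illustration, and the real Mailman change may differ.

```python
import base64
import email

# Hypothetical sample message, as in a typical failing case.
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: demo\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.b64encode("<p>caf\u00e9</p>".encode("utf-8"))
    + b"\r\n"
)
part = email.message_from_bytes(raw)

# Stand-in for the converter output (assumption): still non-ASCII.
plain_text = "caf\u00e9"

# Same steps as before, but set_payload() is told the charset, so it
# encodes the payload and adds a Content-Transfer-Encoding header itself.
del part["content-transfer-encoding"]
part.set_payload(plain_text, charset="utf-8")
part.set_type("text/plain")

# Flattening now works, and the text survives a round trip.
flattened = part.as_bytes()
round_trip = email.message_from_bytes(flattened)
decoded = round_trip.get_payload(decode=True).decode("utf-8")
print(round_trip["Content-Transfer-Encoding"], decoded == plain_text)
```

Deleting the old Content-Transfer-Encoding header first still matters here, because set_payload() only adds a new one when none is present.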
Closing this and the corresponding PR.
----------
stage: patch review -> resolved
status: open -> closed
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39384>
_______________________________________