[issue45066] email parser fails to decode quoted-printable rfc822 message attachment

anarcat report at bugs.python.org
Tue Aug 31 14:23:35 EDT 2021


anarcat <anarcat at debian.org> added the comment:

looking at email.feedparser.FeedParser._parsegen(), it looks like this is going to be really hard to fix, because the parser just happily recurses into the sub-part without ever checking the CTE (Content-Transfer-Encoding). the CTE is typically only decoded in get_payload(decode=True), which is obviously not called there because we're streaming the email in.
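a minimal sketch of the behaviour described above, assuming a made-up two-level message (the boundary "XX", the subject and the body text are illustrative, not taken from the original report). the message/rfc822 part declares quoted-printable, but the parser feeds its still-encoded lines straight to the nested parse:

```python
import email

# Hypothetical sample: a multipart message whose only attachment is a
# message/rfc822 part that declares a quoted-printable CTE.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=XX\n"
    "\n"
    "--XX\n"
    "Content-Type: message/rfc822\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Subject: inner\n"
    "Content-Type: text/plain; charset=3Dutf-8\n"
    "\n"
    "caf=C3=A9\n"
    "--XX--\n"
)

msg = email.message_from_string(raw)
outer_part = msg.get_payload(0)    # the message/rfc822 part
inner = outer_part.get_payload(0)  # the nested message, parsed from undecoded lines

# Both the nested headers and the nested body still carry the
# quoted-printable escapes, because the part's CTE was never applied.
print(repr(inner["Content-Type"]))
print(repr(inner.get_payload()))
# decode=True does not help: the nested message itself has no CTE header.
print(repr(inner.get_payload(decode=True)))
```

the inner Content-Type and body should still show the "=3D" and "=C3=A9" escapes here, since nothing ever applied the CTE declared on the enclosing part.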

in general, support for quoted-printable as a CTE (https://datatracker.ietf.org/doc/html/rfc2045#section-6.7) seems to be spotty at best. multipart/ parts that declare it will be flagged with the (undocumented) defect InvalidMultipartContentTransferEncodingDefect, which is only actually raised under a raise_on_defect policy, for example:

https://github.com/python/cpython/blob/3.9/Lib/email/feedparser.py#L322
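a short sketch of how that defect surfaces, assuming a made-up multipart message that declares quoted-printable on the multipart container itself (the boundary and body are illustrative). under the default compat32 policy the defect is recorded on the message; with raise_on_defect=True it is raised instead:

```python
import email
from email import errors, policy

# Hypothetical sample: a multipart container with a CTE other than
# 7bit, 8bit or binary.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=XX\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XX--\n"
)

# Default policy: the defect is collected in msg.defects, not raised.
msg = email.message_from_string(raw)
defect_types = [type(d) for d in msg.defects]
print(defect_types)

# Strict variant: the same defect becomes an exception.
strict = policy.compat32.clone(raise_on_defect=True)
try:
    email.message_from_string(raw, policy=strict)
    raised = False
except errors.InvalidMultipartContentTransferEncodingDefect:
    raised = True
print(raised)
```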

so I'm not sure how to handle this. it's not clear to me either how to work around this problem at all... is there a way to keep the parser from recursing like this?
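one possible workaround, sketched here as an assumption rather than a confirmed fix: let the parser recurse, then re-serialize the mis-parsed nested message, undo the quoted-printable encoding by hand, and parse the result again. reparse_qp_rfc822 is a hypothetical helper name, and the sample message is made up. this relies on as_bytes() reproducing the original encoded text closely enough, which may not hold for long or folded headers:

```python
import email
import quopri


def reparse_qp_rfc822(part):
    """Hypothetical helper: re-decode a message/rfc822 part sent as QP.

    Returns a freshly parsed message, or None if the part is not a
    quoted-printable message/rfc822 part.
    """
    cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
    if part.get_content_type() != "message/rfc822" or cte != "quoted-printable":
        return None
    inner = part.get_payload(0)                     # parsed from undecoded lines
    decoded = quopri.decodestring(inner.as_bytes())  # undo QP on the raw text
    return email.message_from_bytes(decoded)


# Hypothetical sample message, for illustration only.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=XX\n"
    "\n"
    "--XX\n"
    "Content-Type: message/rfc822\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Subject: inner\n"
    "Content-Type: text/plain; charset=3Dutf-8\n"
    "\n"
    "caf=C3=A9\n"
    "--XX--\n"
)

msg = email.message_from_string(raw)
fixed = reparse_qp_rfc822(msg.get_payload(0))
print(fixed.get_content_charset())
print(fixed.get_payload(decode=True).decode("utf-8"))
```

this does not stop the parser from recursing; it only repairs the result afterwards, so any defects recorded during the first, undecoded parse are simply discarded along with the old nested message.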

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45066>
_______________________________________

