[issue25545] email.message.get_payload returns wrong encoding

R. David Murray report at bugs.python.org
Tue Nov 3 14:59:53 EST 2015


R. David Murray added the comment:

Your problem is that your input email is a unicode string.  A unicode string has no RFC definition as an email, so things do not work right, as you observed.  Whether or not email should throw an error when fed a non-ascii unicode string is an interesting question, but it hasn't in the past and so for backward compatibility reasons we won't change that.

If you add an "encode('utf-8')" to the end of your email string, and then use message_from_bytes, you will get the correct result.  You might also be interested in the newer email API, currently documented in the 'contentmanager' and 'policy' chapters of the documentation.  It says it is provisional, but the changes (other than bug fixes) between the current API and what will be final in 3.6 are trivial.
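
A minimal sketch of that suggestion follows. The message text is a made-up stand-in (the text from the original report is not reproduced here), and the second half assumes that passing policy.default is an acceptable way to try the newer API mentioned above.

```python
from email import message_from_bytes, policy

# Hypothetical message text, used only for illustration.
raw = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "Grüße\n"
)

# Encode the string to bytes first, then parse the bytes.
data = raw.encode("utf-8")
msg = message_from_bytes(data)

# With decode=True the payload comes back as bytes, which can then be
# decoded using the charset declared in the Content-Type header.
charset = msg.get_content_charset()
body = msg.get_payload(decode=True).decode(charset)

# Newer API: parsing with policy.default gives a message object whose
# get_content() returns the decoded text directly.
new_msg = message_from_bytes(data, policy=policy.default)
new_body = new_msg.get_content()
```

Here both body and new_body hold the decoded text of the message body.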

get_content_charset returns None because you don't have any actual headers in your message, just a body.  This is because of the leading newline in your triple-quoted string, which the email package takes as the end of the headers.
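
The effect of the leading newline can be seen with a small sketch. Both strings below are hypothetical examples, not the text from the original report; the only difference between them is the backslash after the opening triple quote, which suppresses the first newline.

```python
from email import message_from_string

# The string starts with a newline, so the parser sees an empty header
# section and treats everything after it as body.
with_leading_newline = """
Content-Type: text/plain; charset=utf-8

body text
"""

# The backslash removes the leading newline, so the first line is
# parsed as a header.
without_leading_newline = """\
Content-Type: text/plain; charset=utf-8

body text
"""

broken = message_from_string(with_leading_newline)
fixed = message_from_string(without_leading_newline)

broken_charset = broken.get_content_charset()  # no headers were parsed
fixed_charset = fixed.get_content_charset()    # charset taken from the header
```

In the first case the Content-Type line ends up inside the payload instead of the headers, which is why no charset is found.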

----------
nosy: +r.david.murray
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25545>
_______________________________________

