[ python-Bugs-1409455 ] email.Message.set_payload followed by bad result get_payload

Sat Jan 21 00:19:51 CET 2006

Bugs item #1409455, was opened at 2006-01-18 14:09
Message generated for change (Comment added) made by msapiro
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1409455&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Sapiro (msapiro)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: email.Message.set_payload followed by bad result get_payload

Initial Comment:
Under certain circumstances, in particular when charset
is 'iso-8859-1', where msg is an email.Message.Message instance,

    msg.set_payload(text, charset)

'apparently' encodes the text as quoted-printable and
adds a

Content-Transfer-Encoding: quoted-printable

header to msg. I say 'apparently' because if one prints
msg or creates a Generator instance and writes msg to a
file, the message is printed/written as a correct,
quoted-printable encoded message, but

    text = msg._payload
or

    text = msg.get_payload()

gives the original text, not quoted-printable encoded, and

    text = msg.get_payload(decode=True)

gives a quoted-printable decoding of the original text,
which is mangled if the original text contains '=' in a
position that reads as a quoted-printable escape (for
example, '=' followed by two hex digits).
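
The mangling can be seen in isolation with the standard quopri
module, without involving the email package at all. This is only
a minimal sketch: the sample text is made up for illustration,
and it shows the general effect of QP-decoding text that was
never QP-encoded, not the exact code path inside get_payload().

```python
import quopri

# Made-up sample text; "=41" happens to read as the QP escape for "A".
original = b"a=41b"

# Decoding text that was never encoded rewrites the "=41" sequence.
mangled = quopri.decodestring(original)

# Encoding first and then decoding round-trips the text unchanged.
encoded = quopri.encodestring(original)
restored = quopri.decodestring(encoded)

print(mangled)
print(encoded)
print(restored)
```

Text without any '=' passes through such a decode untouched,
which is presumably why the problem only shows up for some
payloads.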

This is causing problems in Mailman, which are currently
worked around by flagging payloads that were set by
set_payload() and skipping the subsequent 'decoding' in that
case, but it would be better if
set_payload()/get_payload() worked properly.

A script is attached which illustrates the problem.

----------------------------------------------------------------------

>Comment By: Mark Sapiro (msapiro)
Date: 2006-01-20 15:19

Message:
I've looked at the email library and I see the problem.
msg.set_payload() never QP encodes msg._payload. When the
message is stringified or flattened by a generator, the
generator's _handle_text() method does the encoding and it
is msg._charset that signals the need to do this. Thus, when
the message object is ultimately converted to a suitable
external form, the body is QP encoded, but internally it
never is. As a result, subsequent msg.get_payload() calls
return the unencoded text while the headers claim otherwise.
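
The split between "stored raw" and "encoded only on output" can
be modelled with a toy class. This is a hypothetical stand-in
written for illustration, not code from the email package:
ToyMessage, its headers dict and its flatten() method are all
assumed names, and flatten() only plays the role the comment
above assigns to the generator's _handle_text().

```python
import quopri


class ToyMessage:
    """Hypothetical stand-in for a message object (not the email package)."""

    def __init__(self):
        self._payload = None
        self._charset = None
        self.headers = {}

    def set_payload(self, text, charset=None):
        # Mirrors the reported behaviour: the text is stored untouched;
        # only the charset is remembered and the header is added.
        self._payload = text
        self._charset = charset
        if charset is not None:
            self.headers["Content-Transfer-Encoding"] = "quoted-printable"

    def get_payload(self):
        # Returns whatever was stored, i.e. the unencoded text.
        return self._payload

    def flatten(self):
        # Stand-in for the generator step: encoding happens only here,
        # and only because a charset was recorded.
        body = self._payload
        if self._charset is not None:
            raw = body.encode(self._charset)
            body = quopri.encodestring(raw).decode("ascii")
        head = "".join(f"{k}: {v}\n" for k, v in self.headers.items())
        return head + "\n" + body


msg = ToyMessage()
msg.set_payload("a=b", "iso-8859-1")
flat = msg.flatten()
print(repr(msg.get_payload()))
print(flat)
```

In this model the flattened output is consistent with its
header, while the in-memory payload is not, which is the
mismatch described above.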

It appears (from minimal testing) that when a text message
is parsed into an email.Message.Message instance, _charset
is None even if there is a character set specification in a
Content-Type: header.
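
A quick way to check this observation is sketched below. Note
the assumptions: it is written against the current Python 3
email package, whose module layout differs from the Python 2.4
email.Message module discussed here, and the sample message
text is made up. It uses get_charset() rather than reading
_charset directly.

```python
import email

# Made-up sample message with a charset in the Content-Type header.
raw = (
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "a=3Db\n"
)
parsed = email.message_from_string(raw)

# The charset object is only populated by set_charset() or by
# set_payload(..., charset); parsing alone leaves it unset.
object_charset = parsed.get_charset()

# The charset named in the header is still reachable separately.
header_charset = parsed.get_content_charset()

print(object_charset, header_charset)
```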

I have attached a patch (Message.py.patch.txt) which may fix
the problem. It has only been tested against the already
attached example.py so it is really untested. Also, it only
addresses the quoted-printable case. I haven't even thought
about whether there might be a similar problem involving base64.

----------------------------------------------------------------------
