[Mailman-Developers] Scrubber mungs Quoted Printable
Mark Sapiro
msapiro at value.net
Mon Nov 28 19:01:56 CET 2005
Mark Sapiro wrote:
>
>I think the fix for the current problem is the following patch -
>
>--- mailman-2.1.6/Mailman/Handlers/Scrubber.py
>+++ mailman-mas/Mailman/Handlers/Scrubber.py
>@@ -376,9 +376,8 @@
> # Now join the text and set the payload
> sep = _('-------------- next part --------------\n')
> del msg['content-type']
>- msg.set_payload(sep.join(text), charset)
> del msg['content-transfer-encoding']
>- msg.add_header('Content-Transfer-Encoding', '8bit')
>+ msg.set_payload(sep.join(text), charset)
> return msg
I still think this is the correct fix, but it turns out there are some
tricky issues here that I believe come down to an error in the
set_payload() method.
Under certain circumstances, in particular when charset is 'iso-8859-1',
msg.set_payload(text, charset)
'apparently' encodes the text as quoted-printable and adds a
Content-Transfer-Encoding: quoted-printable
header to msg. I say 'apparently' because if one prints msg or creates
a Generator instance and writes msg to a file, the message is
printed/written as a correct, quoted-printable encoded message, but
text = msg._payload
or
text = msg.get_payload()
gives the original text, not quoted-printable encoded, and
text = msg.get_payload(decode=1)
gives a quoted-printable decoding of the original, unencoded text, which is
munged wherever the original text contains an '=' that the decoder treats
as the start of a quoted-printable escape.
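The munging itself can be reproduced outside the email package. This is a minimal sketch, assuming Python 3's standard quopri module rather than the 2.x interpreters discussed in this post; the variable names are made up for illustration.

```python
import quopri

# Sample text with '=' characters, similar to the test message body.
original = b"some ==== and X=91**2"

# Encoding first and then decoding round-trips cleanly.
encoded = quopri.encodestring(original)
round_trip = quopri.decodestring(encoded)

# Decoding text that was never encoded treats '=' as an escape
# introducer, so the result no longer matches the original.
munged = quopri.decodestring(original)

print(repr(encoded))
print(repr(munged))
```

The point is only that quoted-printable decoding is not safe to apply to a payload that was never actually encoded.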
This is a problem for Mailman because if Scrubber is processing
individual messages, the 'apparently' quoted-printable message gets
passed ultimately to SMTPDirect which calls Decorate, and Decorate
does msg.get_payload(decode=1) when adding the header and/or footer
and can mung the message in the process.
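One way to check whether a given interpreter has this inconsistency is a round-trip test on a message built in the same order as the patched Scrubber code. This is only a sketch, written for the Python 3 email package (not the versions tested in this post); payload_round_trips is a hypothetical helper name, and I make no claim here about what it reports on any particular version.

```python
from email.message import Message


def payload_round_trips(text, charset="iso-8859-1"):
    """Return True if get_payload(decode=True) gives back the original text.

    Hypothetical helper for illustration; not part of Mailman.
    """
    msg = Message()
    msg["Subject"] = "round-trip test"
    # Same order as the patched code: drop the old headers first, then
    # let set_payload() pick the Content-Transfer-Encoding.
    del msg["content-type"]
    del msg["content-transfer-encoding"]
    msg.set_payload(text, charset)
    # This is the same kind of call that Decorate makes.
    return msg.get_payload(decode=True) == text.encode(charset)


if __name__ == "__main__":
    print(payload_round_trips("X=91**2 (x is 91 squared)"))
```

A False result would mean the stored payload and the Content-Transfer-Encoding header disagree, which is the condition that lets Decorate mung the message.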
There is also an issue with archiving when the archiver gets a
multipart message which is subsequently flattened by Scrubber.
The following is a transcript of a Python interactive session that
illustrates the above problems with set_payload() and get_payload().
This session is with Python 2.4.1, but exactly the same behavior
occurs with 2.3.4 and 2.4.2.
Python 2.4.1 (#1, May 27 2005, 18:02:40)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>>
>>> msg = email.message_from_file(open('plain2.eml'))
>>>
>>> print msg
From nobody Mon Nov 28 09:18:41 2005
From: "Mark Sapiro" <msapiro at value.net>
To: list1 at localhost
Subject: HTML - all
Date: Sun, 27 Nov 2005 09:02:33 -0800
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"

How about just a line of stuff with some ==== and a few words.
X=91**2 (x is 91 squared)
>>>
>>> del msg['content-type']
>>> del msg['content-transfer-encoding']
>>> msg.set_payload(str(msg.get_payload()), 'iso-8859-1')
>>>
>>> print msg
From nobody Mon Nov 28 09:18:41 2005
From: "Mark Sapiro" <msapiro at value.net>
To: list1 at localhost
Subject: HTML - all
Date: Sun, 27 Nov 2005 09:02:33 -0800
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

How about just a line of stuff with some =3D=3D=3D=3D and a few words.
X=3D91**2 (x is 91 squared)
>>>
>>> print msg.get_payload()
How about just a line of stuff with some ==== and a few words.
X=91**2 (x is 91 squared)
>>>
>>> print msg.get_payload(decode=1)
How about just a line of stuff with some == and a few words.
X`**2 (x is 91 squared)
--
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan