[Mailman-Developers] About Mailman's unicode-enabled Message subclass

Mark Sapiro mark at msapiro.net
Tue Dec 2 05:43:57 CET 2014


On 12/01/2014 08:45 AM, Aurelien Bompard wrote:
> 
> I'm really interested in any insight on this issue. Thanks for reading all
> that :-)


I just took a quick look, but I think the problem is likely that
mailman.email.message.Message overrides
email.message.Message.__getitem__() as follows:

    def __getitem__(self, key):
        # Ensure that header values are unicodes.
        value = email.message.Message.__getitem__(self, key)
        if isinstance(value, str):
            return unicode(value, 'ascii')
        return value

If we trace back a bit in the code,
email.message.Message.get_param(...) does

        for k, v in self._get_params_preserve(failobj, header):

and _get_params_preserve(...) does

        value = self.get(header, missing)

which, because of our override, decodes the entire
"attachment; filename*=UTF-8''d%C3%A9jeuner.txt" string as ASCII into
u"attachment; filename*=UTF-8''d%C3%A9jeuner.txt". That unicode string
is then parsed and percent-unquoted into the tuple
(u'UTF-8', u'', u'd\xc3\xa9jeuner.txt'), where the two UTF-8 bytes of
the "é" have become two separate characters.
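
To make the garbling concrete, here is a minimal sketch of the mechanism
as I understand it. It is written for Python 3 so it can be run today,
and it is not the email package's own code: unquote(...,
encoding="latin-1") is my stand-in for what Python 2's urllib.unquote()
does when it is handed a unicode string, and the variable names are made
up for illustration.

```python
from urllib.parse import unquote, unquote_to_bytes

# The RFC 2231 value part of the filename* parameter (charset is UTF-8).
encoded = "d%C3%A9jeuner.txt"

# Intended order: percent-decode to raw bytes first, then decode those
# bytes with the charset declared in the parameter.
correct = unquote_to_bytes(encoded).decode("utf-8")

# Assumed stand-in for unquoting an already-decoded unicode string:
# each %XX escape becomes one code point, so the two UTF-8 bytes
# C3 A9 end up as two separate characters.
garbled = unquote(encoded, encoding="latin-1")

print(repr(correct))
print(repr(garbled))
```

The second value is one character longer than the first, which matches
the u'd\xc3\xa9jeuner.txt' in the tuple above.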

It seems to me that any way to fix this is a horrible kludge. In
particular, the value portion of the tuple has already been garbled, and
I don't see a good way to fix it once it's been returned.

I've looked, again briefly because after a short time my head spun out
of control, at trying to fix it by overriding additional methods like
_get_params_preserve(), but I didn't get far.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

