Decoding 'funky' e-mail subjects

Jonas Galvez jg at jonasgalvez.com
Mon Jun 7 11:44:27 EDT 2004


Hi, I need a function to parse badly encoded 'Subject' headers from
e-mails, such as the following:

    =?ISO-8859-1?Q?Murilo_Corr=EAa?=
    =?ISO-8859-1?Q?Marcos_Mendon=E7a?=

I tried using the decode() function from mimetools but that doesn't
appear to be the correct solution. I ended up coding the following:

    import re

    subject = "=?ISO-8859-1?Q?Murilo_Corr=EAa?="
    # Strip the "=?charset?Q?" prefix and the trailing "?=".
    subject = re.search(r"(?:=\?[^\?]*\?Q\?)?(.*)\?=", subject)
    subject = subject.group(1)

    def decodeEntity(match):
        # Turn the two characters after "=" into the byte they name in hex.
        hexdigits = match.group(1)
        try: return eval('"\\x%s"' % hexdigits)
        except: return "?"

    subject = re.sub("=([^=].)", decodeEntity, subject)
    print subject.replace("_", " ").decode("iso-8859-1")

Can anyone recommend a safer method?
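
One candidate is the standard library's email package, which understands
these RFC 2047 encoded words directly and needs neither eval() nor a
hand-written regex. A minimal sketch, assuming Python 3 and its
email.header.decode_header function; the helper name decode_subject and
the ASCII fallback for a missing or unknown charset are illustrative
choices, not part of the code above:

```python
from email.header import decode_header


def decode_subject(raw):
    """Decode an RFC 2047 encoded 'Subject' header into text.

    decode_subject is a hypothetical helper name used for illustration.
    """
    parts = []
    # decode_header returns (chunk, charset) pairs; a chunk is bytes when
    # the header contained encoded words and str when it did not.
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            try:
                text = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset name: fall back to ASCII (an assumption).
                text = chunk.decode("ascii", errors="replace")
            parts.append(text)
        else:
            parts.append(chunk)
    return "".join(parts)


print(decode_subject("=?ISO-8859-1?Q?Murilo_Corr=EAa?="))    # Murilo Corrêa
print(decode_subject("=?ISO-8859-1?Q?Marcos_Mendon=E7a?="))  # Marcos Mendonça
```

Undecodable bytes are replaced rather than raising, so a malformed
header degrades to a readable string instead of an exception.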

Tia,



\\ jonas galvez
// jonasgalvez.com









