Decoding 'funky' e-mail subjects

Jonas Galvez jg at jonasgalvez.com
Mon Jun 7 13:46:43 EDT 2004


Oliver Kurz wrote:
> Have you tried decode_header from email.Header
> in the python email-package?

Thanks, that works. The problem is that my code has to stay compatible
with Python 1.5.2, where the email package isn't available. I improved
my regex-based method, and it has handled all my test cases so far. If
anyone has other suggestions, I'm still interested. Anyway, here's my code:

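import re
import string

def decodeHeader(h):
    # Replacement callback: if a trailing "..." was captured, keep only
    # that; otherwise leave the matched text unchanged.
    def firstGroup(m):
        if m.group(1): return m.group(1)
        return m.group()
    # Strip the "=?charset?Q?" prefix of each quoted-printable encoded-word.
    h = re.compile(r"=\?[^\?]*\?q\?", re.I).sub("", h)
    # A header cut off inside the "=?charset" part and ending in "..."
    # (e.g. "=?windows-125..."): replace the fragment with the "...".
    h = re.compile(
        r"=\?(?:(?:(?:(?:(?:(?:(?:(?:w)?i)?n)?d)?o)?w)?s)?|"
        r"(?:(?:(?:i)?s)?o)?|(?:(?:(?:u)?t)?f)?)"
        r"[^\.]*?(\.\.\.)?$",
        re.I).sub(firstGroup, h)
    # Likewise for an "=X" escape cut off just before a trailing "...".
    h = re.sub(r"=.(\.\.\.)?$", firstGroup, h)
    # Turn each "=XX" hex escape into the corresponding character;
    # anything that is not two hex digits becomes "?".
    def isoEntities(m):
        code = m.group(1)
        if code[0] in string.hexdigits and code[1] in string.hexdigits:
            return chr(string.atoi(code, 16))
        return "?"
    h = re.sub("=([^=].)", isoEntities, h)
    # Drop the closing "?=" of the encoded-word, then map "_" to space.
    if h[-2:] == "?=": h = h[:-2]
    return string.replace(h, "_", " ")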

print decodeHeader("=?ISO-8859-1?Q?Marcos_Mendon=E7a?=")
print decodeHeader("=?ISO-8859-1?Q?Test?=")
print decodeHeader("=?UTF-8?Q?Test?=")
print decodeHeader("Test =?windows-125...")
print decodeHeader("Test =?window-125...")
print decodeHeader("Test =?windo-1...")
print decodeHeader("Test =?wind...")
print decodeHeader("Test =?...")
print decodeHeader("Test =?w...")
print decodeHeader("Test =?iso...")
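
For comparison, on a Python new enough to ship the email package, the
decode_header route Oliver suggested could look roughly like the sketch
below. It assumes Python 3 (where the module is spelled email.header),
and decode_subject is just a name made up for illustration, not
something the email package provides.

```python
from email.header import decode_header

def decode_subject(raw):
    """Decode an RFC 2047 encoded header into one text string."""
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            try:
                # charset is None for unencoded fragments
                data = data.decode(charset or "ascii", "replace")
            except LookupError:
                # Unknown charset label: fall back to Latin-1, which
                # maps every byte to some character.
                data = data.decode("latin-1")
        parts.append(data)
    return "".join(parts)

print(decode_subject("=?UTF-8?Q?Test?="))
```

As far as I can tell, a truncated subject such as "Test =?windows-125..."
contains no complete encoded-word, so decode_header hands it back
unchanged; the regex cleanup above is still what deals with those.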




\\ jonas galvez
// jonasgalvez.com







