Decoding 'funky' e-mail subjects
Jonas Galvez
jg at jonasgalvez.com
Mon Jun 7 13:46:43 EDT 2004
Oliver Kurz wrote:
> Have you tried decode_header from email.Header
> in the python email-package?
Thanks, that works. The problem is that I need to make it compatible
with Python 1.5.2. I improved my regex-based method and it has worked
fine with all my test cases so far. But if anyone has any other
suggestions, I'm still interested. Anyway, here's my code:

import re
from string import *

def decodeHeader(h):
    def firstGroup(s):
        # If the match ended in "...", keep only the ellipsis;
        # otherwise leave the matched text unchanged.
        if s.group(1): return s.group(1)
        return s.group()
    # Strip the "=?charset?Q?" prefix of each encoded word.
    h = re.compile("=\?[^\?]*\?q\?", re.I).sub("", h)
    # Handle a truncated prefix at the end, e.g. "=?windo-1...".
    h = re.compile(
        "=\?(?:(?:(?:(?:(?:(?:(?:(?:w)?i)?n)?d)?o)?w)?s)?|"
        "(?:(?:(?:i)?s)?o)?|(?:(?:(?:u)?t)?f)?)"
        "[^\.]*?(\.\.\.)?$",
        re.I).sub(firstGroup, h)
    # Same for a cut-off "=X" escape at the very end.
    h = re.sub("=.(\.\.\.)?$", firstGroup, h)
    def isoEntities(str):
        # Turn the two hex digits of "=XX" into the matching character.
        str = str.group(1)
        try: return eval('"\\x%s"' % str)
        except: return "?"
    h = re.sub("=([^=].)", isoEntities, h)
    # Drop the closing "?=" and turn underscores back into spaces.
    if h[-2:] == "?=": h = h[:-2]
    return replace(h, "_", " ")

print decodeHeader("=?ISO-8859-1?Q?Marcos_Mendon=E7a?=")
print decodeHeader("=?ISO-8859-1?Q?Test?=")
print decodeHeader("=?UTF-8?Q?Test?=")
print decodeHeader("Test =?windows-125...")
print decodeHeader("Test =?window-125...")
print decodeHeader("Test =?windo-1...")
print decodeHeader("Test =?wind...")
print decodeHeader("Test =?...")
print decodeHeader("Test =?w...")
print decodeHeader("Test =?iso...")

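For comparison, here is a rough sketch of the decode_header route Oliver suggested, for anyone who isn't stuck on 1.5.2. It is written for Python 3, where the module is spelled email.header. The decode_subject helper and its latin-1 fallback are my own assumptions for illustration, not part of the email package.

```python
from email.header import decode_header

def decode_subject(raw, fallback="latin-1"):
    """Hypothetical helper: decode an RFC 2047 header into text."""
    parts = []
    for data, charset in decode_header(raw):
        # decode_header yields bytes for encoded words and str for
        # headers that contain no encoded word at all.
        if isinstance(data, bytes):
            try:
                data = data.decode(charset or fallback, "replace")
            except LookupError:
                # Unknown charset name: fall back (assumption).
                data = data.decode(fallback, "replace")
        parts.append(data)
    return "".join(parts)

print(decode_subject("=?ISO-8859-1?Q?Marcos_Mendon=E7a?="))
print(decode_subject("=?UTF-8?Q?Test?="))
print(decode_subject("Test =?wind..."))
```

Note that a truncated fragment such as "Test =?wind..." contains no complete encoded word, so decode_header hands it back untouched. Cleaning up such cut-off subjects would still need something like the regexes above.
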
\\ jonas galvez
// jonasgalvez.com