[issue10574] email.header.decode_header fails if the string contains multiple directives

Mon Nov 29 06:59:34 CET 2010

Roy Hyunjin Han <starsareblueandfaraway at gmail.com> added the comment:

Improved workaround to handle another degenerate case where the encoded string is in between non-encoded strings.

import re
import email.header

pattern_ecre = re.compile(r'((=\?.*?\?[qb]\?).*\?=)', re.VERBOSE | re.IGNORECASE | re.MULTILINE)

def decodeSafely(x):
    match = pattern_ecre.search(x)
    if not match:
        return x
    string, encoding = match.groups()
    stringBefore, string, stringAfter = x.partition(string)
    return stringBefore + email.header.decode_header('%s%s==?=' % (encoding, string.replace(encoding, '').replace('?', '').replace('=', '')))[0][0] + stringAfter

print decodeSafely('=?UTF-8?B?MjAxMSBBVVRNIENBTEwgZm9yIE5PTUlO?==?UTF-8?B?QVRJT05TIG9mIFZQIGZvciBNZW1iZXJz?==?UTF-8?B?aGlw?=')
print decodeSafely('"=?UTF-8?B?QVVUTSBIZWFkcXVhcnRlcnM=?="<info at autm.net>')

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10574>
_______________________________________