What do these '=?utf-8?' sequences mean in python?

jak nospam at please.ty
Sat May 6 12:27:23 EDT 2023


Peter Pearson wrote:
> On Sat, 6 May 2023 14:50:40 +0100, Chris Green <cl at isbd.net> wrote:
> [snip]
>> So, what do those =?utf-8? and ?= sequences mean?  Are they part of
>> the string or are they wrapped around the string on output as a way to
>> show that it's utf-8 encoded?
> 
> Yes, "=?utf-8?" signals "MIME header encoding".
> 
> I've only blundered about briefly in this area, but I think you
> need to make sure that all header values you work with have been
> converted to UTF-8 before proceeding.
> Here's the code that seemed to work for me:
> 
> import email.header
> 
> def mime_decode_single(pair):
>      """Decode a single (bytestring, charset) pair.
>      """
>      b, charset = pair
>      result = b if isinstance(b, str) else b.decode(
>          charset if charset else "utf-8")
>      return result
> 
> def mime_decode(s):
>      """Decode a MIME-header-encoded character string.
>      """
>      decoded_pairs = email.header.decode_header(s)
>      return "".join(mime_decode_single(d) for d in decoded_pairs)
> 
> 
> 

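Hi,
You could also use make_header, which reassembles the pairs returned by
decode_header into a single Header object. Printing it (or calling str()
on it) gives the decoded text. Here subject is the raw header value:

from email.header import decode_header, make_header

print(make_header(decode_header(subject)))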
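
For anyone who wants to try this without a real message to hand, here is a minimal sketch. The subject value is a made-up sample, not taken from the original poster's mail. It assumes the usual "encoded word" layout, =?charset?encoding?encoded-text?=, where the encoding letter is B (base64) or Q (a quoted-printable variant in which "_" stands for a space).

```python
from email.header import decode_header, make_header

# Hypothetical sample header value (not from the original thread):
# charset = utf-8, encoding = Q, encoded text = Caf=C3=A9_au_lait
subject = "=?utf-8?Q?Caf=C3=A9_au_lait?="

# decode_header() returns a list of (bytes_or_str, charset) pairs.
pairs = decode_header(subject)

# make_header() rebuilds a Header object from those pairs;
# str() on it yields the decoded text as an ordinary Python string.
decoded = str(make_header(pairs))

print(decoded)  # Café au lait
```

So the =?utf-8? and ?= markers are part of the raw header text as it travels in the message, and they disappear once the header has been decoded.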