Problem with accented characters in mailbox.Maildir()

jak nospam at please.ty
Sat May 6 10:27:04 EDT 2023


Chris Green wrote:
> Chris Green <cl at isbd.net> wrote:
>> A bit more information, msg.get("subject", "unknown") does return a
>> string, as follows:-
>>
>>      Subject: =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
>>
>> So it's the 'searchTxt in msg.get("subject", "unknown")' that's
>> failing. I.e. for some reason 'in' isn't working when the searched
>> string has utf-8 characters.
>>
>> Surely there's a way to handle this.
>>
> ... and of course I now see the issue!  The Subject: with utf-8
> characters in it gets spaces changed to underscores.  So searching for
> '(Waterways Continental Europe)' fails.
> 
> I'll either need to test for both versions of the string or I'll need
> to change underscores to spaces in the Subject: returned by msg.get().
> It's a long enough string that I'm searching for that I won't get any
> false positives.
> 
> 
> Sorry for the noise everyone, it's a typical case of explaining the
> problem shows one how to fix it! :-)
> 

This is probably what you need:

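import email.header

raw_subj = '=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?='

# decode_header() returns a list of (bytes, charset) pairs; take the first one
subj = email.header.decode_header(raw_subj)[0]

# decode the bytes using the charset named in the header
print(subj[0].decode(subj[1]))
# prints: aka Marne à la Saône (Waterways Continental Europe)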
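
The underscores are not a bug: they are how RFC 2047 "Q" encoding writes spaces inside an encoded word, so decoding the header is safer than replacing underscores by hand. A Subject can also contain several encoded parts, or none at all, in which case taking only element [0] is not enough. Below is a minimal sketch that should cope with those cases; decode_subject is a hypothetical helper name of mine, not something from the email package, and I am assuming the raw value is what msg.get("subject", "unknown") returned.

```python
import email.header


def decode_subject(raw_subj):
    """Return the header value as a plain str, decoding any RFC 2047 parts.

    decode_subject is a hypothetical helper name used for illustration.
    """
    # decode_header() splits the value into (data, charset) pairs
    parts = email.header.decode_header(raw_subj)
    # make_header() reassembles the pairs into a Header object;
    # str() on that object gives the decoded unicode text
    return str(email.header.make_header(parts))


raw_subj = '=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?='
search_txt = '(Waterways Continental Europe)'

# The 'in' test now works because the underscores have been decoded to spaces
print(search_txt in decode_subject(raw_subj))
```

With this, the original test becomes something like 'searchTxt in decode_subject(msg.get("subject", "unknown"))', and a subject with no encoded words passes through unchanged.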