Is there any way to make sense of these E-Mail subjects?

Barry barry at barrys-emacs.org
Fri Dec 24 13:02:16 EST 2021



> On 24 Dec 2021, at 16:40, Chris Green <cl at isbd.net> wrote:
> 
> I have a Python 3 script which processes E-Mail caught in my hosting
> provider's 'catchall' mailbox.  It looks for things that *might* be
> useful E-Mails, forwards them, and throws the rest away.
> 
> I have a function which, given a header name, extracts the header and
> returns it as a string:-
> 
>    #
>    #
>    # Get a message header as a string
>    #
>    def getHdr(msg, header):
>       return str("\n  " + header + ": " + str(msg.get(header, "empty"))) 
> 
> msg is a mailbox.mboxMessage object.
> 
> 
> This is mostly working as expected, returning the header contents as
> strings so I can output them to my log files as necessary.  However
> some Subject: lines are being returned like the following:-
> 
>      Subject: [SPAM] =?UTF-8?B?8J+TtyBKb2huIEJheHRlci1C?=
>     =?UTF-8?B?cm93biByZWNlbnRseSBw?=
>     =?UTF-8?B?b3N0ZWQgYSBuZXcgcGhv?=
>     =?UTF-8?B?dG8=?=
> 
> It looks like some sort of mis-encoding of UTF-8 strings, can anyone
> suggest what might be going on and/or a way to get some sense out of
> this?

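I think this is correctly encoded. Each =?UTF-8?B?...?= chunk is a MIME
"encoded-word": UTF-8 text, Base64 encoded, so that non-ASCII characters
can be carried in a header.
See https://datatracker.ietf.org/doc/html/rfc1342 (since superseded by RFC 2047).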
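
If it helps, here is a minimal sketch of decoding such a header with the
standard library's email.header module. The names decode_hdr, get_hdr and
raw_subject are made up for illustration, and I am assuming the header
arrives as the folded string shown in your log output; treat it as a
starting point rather than a tested fix for your script.

```python
from email import message_from_string
from email.header import decode_header, make_header


def decode_hdr(raw):
    """Decode any RFC 2047 encoded-words found in a raw header value."""
    # decode_header() splits the value into (bytes_or_str, charset) pairs;
    # make_header() reassembles them into a Header that str() renders as text.
    return str(make_header(decode_header(raw)))


def get_hdr(msg, header):
    """Hypothetical variant of getHdr() that decodes before formatting."""
    return "\n  " + header + ": " + decode_hdr(str(msg.get(header, "empty")))


# The Subject from the original post, folded over four lines.
raw_subject = (
    "[SPAM] =?UTF-8?B?8J+TtyBKb2huIEJheHRlci1C?=\n"
    " =?UTF-8?B?cm93biByZWNlbnRseSBw?=\n"
    " =?UTF-8?B?b3N0ZWQgYSBuZXcgcGhv?=\n"
    " =?UTF-8?B?dG8=?="
)

# Stand-in for a message pulled from the mailbox (assumption: the real
# mailbox.mboxMessage object answers .get() the same way).
msg = message_from_string("Subject: " + raw_subject + "\n\nbody\n")
print(get_hdr(msg, "Subject"))
```

The decoded subject starts with an emoji and ends in "John Baxter-Brown
recently posted a new photo", which is why it looked like noise in the log.
Headers without any encoded-words should pass through decode_hdr unchanged,
so it ought to be safe to apply to every header you log.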

Barry
> 
> FWIW the above example is from "Facebook" <friendupdates at facebookmail.com>
> so while it is probably (as indicated) [SPAM] it shouldn't be so illegible.
> 
> At the moment I can't see an easy way to actually inspect the message
> as it's disappeared off somewhere else.  I guess I could add some code
> to the script to send it to myself as well but if there's something
> obvious in the above it would avoid having to do this.
> 
> 
> -- 
> Chris Green
> -- 
> https://mail.python.org/mailman/listinfo/python-list

