[Tutor] Splitting email headers when using imaplib

spir denis.spir at gmail.com
Wed Feb 5 08:39:27 CET 2014


On 02/04/2014 06:38 PM, Some Developer wrote:
> I'm currently trying to download emails from an IMAP server using Python. I can
> download the emails just fine but I'm having an issue when it comes to splitting
> the relevant headers. Basically I'm using the following to fetch the headers of
> an email message:
>
> typ, msg_header_content = self.m.fetch(msg_id, '(BODY.PEEK[HEADER])')
>
> then I can get a string containing the headers by accessing
> msg_header_content[0][1]. This works fine but I need to split the Subject
> header, the From header and the To header out into separate strings so I can
> save the information in a database.
>
> I thought the following regular expressions would do the trick when using
> re.MULTILINE when matching them to the header string but apparently that appears
> to be wrong.
>
> msg_subject_regex = re.compile(r'^Subject:\.+\r\n')
> msg_from_regex = re.compile(r'^From:\.+\r\n')
> msg_to_regex = re.compile(r'^To:\.+\r\n')
>
> Can anyone point me in the right direction for this please? I'm at a loss here.

I don't know the pattern or structure of email headers. Could you post an
example of 'msg_header_content[0][1]'?

In the meantime, try removing \r from the regex patterns. (It may not belong
there: when reading text from files, Python converts newlines into \n. To be
sure, test "'\r' in s" or "'\r\n' in s" on the header string s.)
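
Here is a minimal sketch of that check, plus a pattern that tolerates both
kinds of line ending. The sample header block and the get_header helper are
made up for illustration; the real content of msg_header_content[0][1] may
look different, and under Python 3 imaplib returns bytes, so it would have to
be decoded first. Note also that '\.+' in the quoted patterns matches literal
dots only, so the sketch uses an unescaped '.' and passes re.MULTILINE
explicitly.

```python
import email
import re

# Hypothetical sample header block, standing in for
# msg_header_content[0][1] (already decoded to str).
sample_headers = (
    "From: Alice <alice@example.com>\r\n"
    "To: Bob <bob@example.com>\r\n"
    "Subject: Hello there\r\n"
    "\r\n"
)

# Check which line endings are actually present in the string.
has_cr = "\r" in sample_headers
has_crlf = "\r\n" in sample_headers


def get_header(name, headers):
    """Return the value of the first single-line header `name`, or None."""
    # '^' and '$' match at line boundaries because of re.MULTILINE;
    # '\r?' lets the same pattern work with '\n' and '\r\n' endings.
    pattern = re.compile(
        r"^%s:[ \t]*(.*?)\r?$" % re.escape(name),
        re.MULTILINE | re.IGNORECASE,
    )
    match = pattern.search(headers)
    return match.group(1) if match else None


subject = get_header("Subject", sample_headers)
sender = get_header("From", sample_headers)
recipient = get_header("To", sample_headers)

# Alternative: let the standard library's email package do the parsing.
parsed = email.message_from_string(sample_headers)
subject_from_parser = parsed["Subject"]
```

The regex version does not handle headers folded over several lines; the
email package does, so it is probably the safer choice for real messages.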

d


