[Tutor] Splitting email headers when using imaplib

Peter Otten __peter__ at web.de
Wed Feb 5 10:12:47 CET 2014


Some Developer wrote:

> I'm currently trying to download emails from an IMAP server using
> Python. I can download the emails just fine but I'm having an issue when
> it comes to splitting the relevant headers. Basically I'm using the
> following to fetch the headers of an email message:
> 
> typ, msg_header_content = self.m.fetch(msg_id, '(BODY.PEEK[HEADER])')
> 
> then I can get a string containing the headers by accessing
> msg_header_content[0][1]. This works fine but I need to split the
> Subject header, the From header and the To header out into separate
> strings so I can save the information in a database.
> 
> I thought the following regular expressions would do the trick when
> using re.MULTILINE when matching them to the header string but
> apparently that appears to be wrong.
> 
> msg_subject_regex = re.compile(r'^Subject:\.+\r\n')
> msg_from_regex = re.compile(r'^From:\.+\r\n')
> msg_to_regex = re.compile(r'^To:\.+\r\n')
> 
> Can anyone point me in the right direction for this please? I'm at a
> loss here.
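
The patterns as posted cannot match: `\.+` means "one or more literal dots", not "any characters". Also note that flags belong in re.compile(); the second argument of a compiled pattern's search() or match() is a start position, not a flag. A minimal sketch of a corrected version follows, assuming the header block has already been decoded to str and that no header is folded across several lines. The sample text and addresses are made up.

```python
import re

# Made-up sample standing in for the decoded fetch result.
headers = (
    "From: Some Developer <dev@example.com>\r\n"
    "To: tutor@example.org\r\n"
    "Subject: Splitting email headers when using imaplib\r\n"
    "\r\n"
)

# '.' (unescaped) matches any character except a newline; the lazy group
# stops before the optional '\r' that precedes each line end.
msg_subject_regex = re.compile(r"^Subject:[ \t]*(.*?)\r?$", re.MULTILINE)
msg_from_regex = re.compile(r"^From:[ \t]*(.*?)\r?$", re.MULTILINE)
msg_to_regex = re.compile(r"^To:[ \t]*(.*?)\r?$", re.MULTILINE)

subject = msg_subject_regex.search(headers).group(1)
sender = msg_from_regex.search(headers).group(1)
recipient = msg_to_regex.search(headers).group(1)
print(subject, sender, recipient, sep="\n")
```

This still breaks on folded (multi-line) headers and on differently capitalised header names, which is one reason to prefer a real parser.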

Maybe you can use the email package?

>>> import email
>>> msg = email.message_from_file(open("tmp.txt"))
>>> msg["From"]
'Some Developer <someukdeveloper at gmail.com>'
>>> msg["Subject"]
'Splitting email headers when using imaplib'
>>> msg.keys()
['Path', 'From', 'Newsgroups', 'Subject', 'Date', 'Lines', 'Approved',
'Message-ID', 'NNTP-Posting-Host', 'Mime-Version', 'Content-Type',
'Content-Transfer-Encoding', 'X-Trace', 'X-Complaints-To',
'NNTP-Posting-Date', 'To', 'Original-X-From', 'Return-path', 'Envelope-to',
'Original-Received', 'Original-Received', 'X-Original-To', 'Delivered-To',
'Original-Received', 'X-Spam-Status', 'X-Spam-Evidence',
'Original-Received', 'Original-Received', 'Original-Received',
'DKIM-Signature', 'X-Received', 'Original-Received', 'User-Agent',
'X-Antivirus', 'X-Antivirus-Status', 'X-BeenThere', 'X-Mailman-Version',
'Precedence', 'List-Id', 'List-Unsubscribe', 'List-Archive', 'List-Post',
'List-Help', 'List-Subscribe', 'Errors-To', 'Original-Sender', 'Xref',
'Archived-At']

There is also a message_from_string() function, which takes the header text
directly, so you don't need a temporary file.
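
Applied to the fetch result from the question, that could look roughly like the sketch below. It assumes Python 3, where imaplib hands back bytes and email.message_from_bytes() is the counterpart to message_from_string(). The sample bytes stand in for msg_header_content[0][1]; the addresses and the extract_fields() helper are made up for illustration.

```python
import email
from email.utils import parseaddr

# Stand-in for msg_header_content[0][1]; on Python 3 imaplib returns bytes.
# The addresses below are invented for the example.
raw_headers = (
    b"From: Some Developer <dev@example.com>\r\n"
    b"To: tutor@example.org\r\n"
    b"Subject: Splitting email headers when using imaplib\r\n"
    b"\r\n"
)

def extract_fields(raw):
    """Hypothetical helper: return the three headers as plain strings."""
    msg = email.message_from_bytes(raw)
    # msg[name] is case-insensitive and returns None if the header is missing.
    return {name: msg[name] for name in ("Subject", "From", "To")}

fields = extract_fields(raw_headers)
# parseaddr() splits one address into its display name and address part.
sender_name, sender_addr = parseaddr(fields["From"])
print(fields["Subject"])
print(sender_name, sender_addr)
```

The resulting strings can go straight into the database; parseaddr() is optional and only useful if the name and the address should be stored in separate columns.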


