Re: Internet Data Handling » mailbox

Adam Jensen hanzer at riseup.net
Sat Oct 22 19:49:29 EDT 2016


On 10/22/2016 03:24 AM, dieter wrote:
> In addition to the previous (excellent) responses:
> 
> A "message" models a MIME (RFC 1521, Multipurpose Internet Mail
> Extensions) message (the international standard for the structure of
> emails). The standard tells you that a message consists essentially of
> two parts, a set of headers and a body, and it describes standard
> headers and their intended meaning (e.g. "To", "From", "Subject", ...).
> It allows a message to contain non-standard headers as well.
> 
> With this knowledge, your "keys" related question can be answered:
> there is a (case insensitive) key for each header actually present
> in your message. If the message contains several headers with
> the same name, subscript access gives you the first one;
> the get_all() method returns all of them.
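
To make the header behaviour concrete, here is a minimal sketch for
Python 3. The sample message and its addresses are made up for
illustration; only the email module calls are real.

```python
import email

# Hypothetical sample message with a repeated "Received" header.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Received: from hop1.example.org\n"
    "Received: from hop2.example.org\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)

msg = email.message_from_string(RAW)

# One key per header line actually present, duplicates included.
print(msg.keys())

# Lookup is case-insensitive; subscript access returns the first match.
first_received = msg["received"]
print(first_received)

# get_all() returns every value for that header name, in order.
all_received = msg.get_all("Received")
print(all_received)

# A header that is absent gives None rather than raising KeyError.
print(msg["X-Not-There"])
```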

Thanks. I needed to search for emails to/from a specific person and
extract them from a [Google mail archive][1].

[1]: https://takeout.google.com/settings/takeout

This is my quick and dirty little one-shot script to get the job done.

search_mbox.py
--------------------------------------------------------------
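#!/usr/bin/env python2.7
import mailbox
import sys

# argv[1] is the mbox archive, argv[2] is the name to search for.
name = sys.argv[2].lower()
for message in mailbox.mbox(sys.argv[1]):
	# Only look at messages that carry both a From and a To header.
	if "From" in message and "To" in message:
		addrs = message.get_all("From")
		addrs.extend(message.get_all("To"))
		for addr in addrs:
			# Substring test, so a match at the very start of the header
			# counts too (find() > 0 would skip a match at index 0).
			if name in addr.lower():
				print message
				break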
--------------------------------------------------------------

Usage: ./search_mbox.py archive.mbox hanzer > hanzer.mbox
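
For anyone on Python 3, the same search could be sketched roughly as
below. This is an untested-in-anger variant, not a drop-in replacement:
the function names matching_messages and copy_matches are my own
(hypothetical) helpers, and it writes the matches with mailbox.mbox.add()
instead of redirecting stdout. It also accepts messages that have only
one of From/To, which is an assumption on my part.

```python
import mailbox


def matching_messages(mbox_path, name):
    """Yield messages whose From or To headers contain name (case-insensitive)."""
    needle = name.lower()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # get_all() returns None when the header is absent.
            headers = (message.get_all("From") or []) + (message.get_all("To") or [])
            # str() also copes with Header objects returned for odd encodings.
            if any(needle in str(value).lower() for value in headers):
                yield message
    finally:
        box.close()


def copy_matches(src_path, dst_path, name):
    """Append every matching message to the mbox at dst_path; return the count."""
    dst = mailbox.mbox(dst_path)  # created if it does not exist yet
    count = 0
    try:
        for message in matching_messages(src_path, name):
            dst.add(message)
            count += 1
        dst.flush()
    finally:
        dst.close()
    return count
```

Something like copy_matches("archive.mbox", "hanzer.mbox", "hanzer")
would then do what the shell redirection does above.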



