Reading Huge UnixMailbox Files

Brandon McGinty brandon.mcginty at gmail.com
Tue Apr 26 15:39:37 EDT 2011


List,
I'm trying to import hundreds of thousands of e-mail messages into a
database with Python.
However, some of these mailboxes are so large that the standard
mailbox module raises errors when reading them.
I created a buffered reader that reads the mailbox in chunks, splits
each chunk using the re.split function with a compiled regexp, and
imports each resulting piece as a message.
Based on my timings, the regular expression work appears to be the
bottleneck.
I'm wondering if there is a faster way to do this, or some other method
that you all would recommend.
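
One alternative that may be worth timing is to drop the regexp and scan
the file line by line, starting a new message at each "From " separator
line. The sketch below is only an illustration under stated assumptions,
not the reader described above: it assumes Python 3, an mbox-style file
in which each separator line begins with "From " and follows a blank
line (or starts the file), and the name iter_mbox_messages is
hypothetical. Only one message is held in memory at a time.

```python
import email
import io


def iter_mbox_messages(fileobj):
    """Yield each message of an mbox-style binary file as raw bytes.

    Hypothetical helper. Assumes a separator is a line starting with
    b"From " that is the first line of the file or follows a blank line.
    The separator line itself is not included in the yielded bytes.
    """
    current = None      # lines of the message being collected
    prev_blank = True   # treat start of file like "after a blank line"
    for line in fileobj:
        if prev_blank and line.startswith(b"From "):
            if current is not None:
                yield b"".join(current)
            current = []
        elif current is not None:
            current.append(line)
        prev_blank = line in (b"\n", b"\r\n")
    if current is not None:
        yield b"".join(current)


# Usage with a small in-memory sample; a real mailbox would be opened
# with open(path, "rb") instead.
sample = (
    b"From a@example.com Tue Apr 26 15:39:37 2011\n"
    b"Subject: one\n\nbody one\n\n"
    b"From b@example.com Tue Apr 26 15:40:00 2011\n"
    b"Subject: two\n\nbody two\n"
)
for raw in iter_mbox_messages(io.BytesIO(sample)):
    msg = email.message_from_bytes(raw)
    print(msg["Subject"])
```

Iterating over the file object lets Python do the buffering, and
bytes.startswith is a much cheaper test than a regexp split over a large
chunk. The trade-off is that a mailbox whose body lines contain an
unescaped "From " after a blank line would be split in the wrong place.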

Brandon McGinty


