Getting data out of Mozilla Thunderbird with Python?

Christian Gollwitzer auriocus at gmx.de
Wed Dec 9 03:03:46 EST 2015


On 08.12.15 at 19:21, Anthony Papillion wrote:
> I have a TON of email (years) stored in my Thunderbird. My backup
> strategy for the last few years has been to periodically dump it all
> in a tar file, encrypt that tar file, and move it up to the cloud.
> That way, if my machine ever crashes, I don't lose years of email.
>
> But I've been thinking about bringing Python into the mix to build a
> bridge between Thunderbird and SQLite or MySQL (probably sqlite) where
> all mail would be backed up to a database where I could run analytics
> against it and search it more effectively.
>
> I'm looking for a way to get at the mail stored in Thunderbird using
> Python and, so far, I can't find anything. I did find the mozmail
> package but it seems to be geared more towards testing and not really
> the kind of use I need.

You have several options.

1) As noted before, Thunderbird usually stores mail in mbox format, 
which you can read and parse. However, it keeps an extra index file 
(.msf) to track deleted messages etc. Until you "compact" the folders, 
deleted messages are still present in the mbox file.

2) You can configure it to use maildir instead. Maildir is a directory 
where each message is stored in its own file. That might be easier to 
parse and much faster to access.

3) Are you sure that you want to solve the problem using Python? 
Thunderbird has excellent filters and global full text search (stored in 
sqlite, btw). You can instruct it to archive mails, which means it 
creates a folder for each year - once created for a past year, that 
folder will never change. This is how I do my mail backup, and these 
folders are backed up by my regular backup (TimeMachine). You could also 
try to open the full text index with sqlite and run some query on it.

4) Yet another option using Thunderbird alone is IMAP. If you can either 
use a commercial IMAP server, have your own server in the cloud or even 
write an IMAP server using Python, then Thunderbird can 
access/manipulate the mail there as a usual folder.

5) There are converters like Hypermail or MHonArc to create HTML 
archives of mbox email files for viewing in a browser.
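
If you do go the Python route with option 1 or 2, the standard library 
already covers both halves: the mailbox module reads mbox files and 
Maildir directories, and sqlite3 writes the database. Below is a minimal 
sketch, not a finished tool. The table layout, the function names and 
the example path are my own assumptions, and it stores only a few 
headers. The X-Mozilla-Status header is kept as-is because, as far as I 
know, Thunderbird records the deleted flag there until the folder is 
compacted; check that against your own files before relying on it.

```python
import mailbox
import sqlite3
from email.header import decode_header, make_header


def decoded(value):
    """Return a header as readable text, decoding RFC 2047 encoded words."""
    if value is None:
        return ""
    try:
        return str(make_header(decode_header(str(value))))
    except (LookupError, ValueError):
        # Unknown or broken charset: fall back to the raw header text.
        return str(value)


def open_mailbox(path, fmt="mbox"):
    """Open an mbox file (option 1) or a Maildir directory (option 2)."""
    if fmt == "maildir":
        return mailbox.Maildir(path, factory=None, create=False)
    return mailbox.mbox(path, create=False)


def import_mail(box, db):
    """Copy a few headers of every message in box into an SQLite table.

    The schema is a hypothetical example, not anything Thunderbird uses.
    Returns the number of messages stored.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS mail ("
        "msgid TEXT, sender TEXT, subject TEXT, date TEXT, status TEXT)"
    )
    count = 0
    for msg in box:
        db.execute(
            "INSERT INTO mail VALUES (?, ?, ?, ?, ?)",
            (
                decoded(msg.get("Message-ID")),
                decoded(msg.get("From")),
                decoded(msg.get("Subject")),
                decoded(msg.get("Date")),
                # Raw Thunderbird status flags, stored without interpretation.
                decoded(msg.get("X-Mozilla-Status")),
            ),
        )
        count += 1
    db.commit()
    return count


# Example use (the path is a placeholder for a folder file in your profile):
#   db = sqlite3.connect("mail.sqlite")
#   box = open_mailbox("/path/to/profile/Mail/Local Folders/Inbox")
#   print(import_mail(box, db), "messages imported")
```

Work on a copy of the folder file rather than the live one, since 
Thunderbird may be writing to it while your script reads.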

	Christian




More information about the Python-list mailing list