Getting data out of Mozilla Thunderbird with Python?

Steven D'Aprano steve at pearwood.info
Wed Dec 9 06:11:20 EST 2015


On Wed, 9 Dec 2015 07:03 pm, Christian Gollwitzer wrote:

> 1) As noted before, Thunderbird usually stores mail in mbox format,
> which you can read and parse. However it keeps an extra index file
> (.msf) to track deleted messages etc. Until you "compact" the folders,
> the messages are not deleted in the mbox file.
> 
> 2) You can configure it to use maildir instead. Maildir is a directory
> where every mail is stored in a single file. That might be easier to
> parse and much faster to access.

Maildir is also *much* safer. With mbox, a single error when writing
email to the mailbox will likely corrupt *all* emails from that point on,
so potentially every email in the mailbox. With maildir, a single error
when writing will, at worst, corrupt one email.

Thanks Mozilla, for picking the *less* efficient and *more* risky format as
the default. Good choice!
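
For what it's worth, Python's standard library can read both formats
through the mailbox module. The following is only a minimal sketch: the
function name list_subjects is made up for illustration, and it assumes a
standard Maildir layout (cur/new/tmp subdirectories), which I have not
checked against what Thunderbird actually writes. For mbox, remember the
point quoted above: until the folder is compacted, messages marked as
deleted are still in the file and will show up here.

```python
import mailbox
import os


def list_subjects(path):
    """Return the Subject header of every message in an mbox file or Maildir.

    `path` is hypothetical: point it at a copy of a Thunderbird folder file
    (mbox) or at a Maildir directory.
    """
    if os.path.isdir(path):
        # Maildir: one file per message inside cur/ and new/.
        box = mailbox.Maildir(path, create=False)
    else:
        # mbox: all messages concatenated in a single file.
        box = mailbox.mbox(path, create=False)
    try:
        # Iterating a mailbox yields one message object per stored message.
        return [msg.get("Subject", "(no subject)") for msg in box]
    finally:
        box.close()
```

Working on a copy of the folder, with Thunderbird closed, avoids reading a
file while it is being written.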


> 3) Are you sure that you want to solve the problem using Python?
> Thunderbird has excellent filters and global full text search (stored in
> sqlite, btw).

SQLite is unsafe on Linux systems if you are using NTFS. I have had no end
of database corruption with Firefox and Thunderbird due to this, although
in fairness I haven't had any problems for a year or so now.
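
If you suspect one of those database files has been damaged, SQLite can
check it for you with PRAGMA integrity_check. A rough sketch using the
standard sqlite3 module follows; the function name check_sqlite is invented
for this example, and the file is opened read-only so the check itself
cannot make things worse. Run it against a copy, with the application
closed.

```python
import sqlite3
from pathlib import Path


def check_sqlite(path):
    """Run SQLite's built-in integrity check on the database at `path`.

    Returns the list of messages reported by the check; a healthy database
    yields the single entry "ok". A file damaged badly enough that SQLite
    cannot read it at all raises sqlite3.DatabaseError instead.
    """
    # Build a file: URI so the database can be opened in read-only mode.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```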



-- 
Steven



