MemoryError on reading mbox file

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Sep 12 23:52:14 EDT 2007


On Wed, 12 Sep 2007 11:39:46 -0300, Istvan Albert
<istvan.albert at gmail.com> wrote:

> On Sep 12, 5:27 am, Christoph Krammer <redtige... at googlemail.com>
> wrote:
>
>>     string = self._file.read(stop - self._file.tell())
>> MemoryError
>
> This line reads an entire message into memory as a string. Is it
> possible that you have a huge email in there (hundreds of MB) with
> some attachment encoded as text?

Printing start, stop and stop - start inside that method would be an easy
way to find out whether that is the case.
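
If patching the library to add prints is not convenient, a similar check
can be done from outside. What follows is only a rough sketch, assuming a
current Python 3 where mailbox.mbox.get_file() returns a file-like object
yielding bytes; the names message_sizes and chunk_size are made up for this
example. It measures each message by reading it in small chunks, so no
message is ever held in memory as a whole:

```python
import mailbox


def message_sizes(path, chunk_size=64 * 1024):
    """Return {key: size in bytes} for every message in the mbox at `path`.

    Hypothetical helper: each message is read in chunks of `chunk_size`
    bytes, so even a huge message is never loaded all at once.
    """
    box = mailbox.mbox(path, create=False)
    try:
        sizes = {}
        for key in box.iterkeys():
            f = box.get_file(key)  # file-like view of one message
            try:
                total = 0
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
            finally:
                f.close()
            sizes[key] = total
        return sizes
    finally:
        box.close()
```

Sorting the returned dictionary by value should show at once whether a
single message of hundreds of MB is hiding in the file.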

The following idea might help to fix it, or at least avoid reading the
whole message at once.
self._message_factory eventually calls the mailbox.Message constructor,
which also accepts a file object (instead of a huge string).
The same module has a utility class, _PartialFile ("A read-only wrapper of
part of a file"), and _mboxMMDF.get_file() returns a _PartialFile object,
so I'd try this code (untested!):

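     def get_message(self, key):
         """Return a Message representation or raise a KeyError."""
         # Pass True as the second argument (from_) so the file object
         # still includes the leading "From " line.
         msg = self._message_factory(self.get_file(key, True))
         # Strip the 5-character "From " prefix before storing it.
         msg.set_from(msg.get_unixfrom()[5:])
         return msg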
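
A variation that stays within the public API, again only as a sketch: it
assumes a current Python 3, where email.parser.BytesFeedParser is
available, and the names parse_in_chunks and chunk_size are invented here.
The message is fed to the parser piece by piece, so the raw text is never
held as one huge string. The parsed message itself still ends up in
memory, and the mbox "From " line is not recorded on the result:

```python
import mailbox
from email.parser import BytesFeedParser


def parse_in_chunks(box, key, chunk_size=64 * 1024):
    """Parse message `key` of an open mailbox.mbox without one big read().

    Hypothetical helper: returns a plain email.message.Message,
    not a mailbox.mboxMessage.
    """
    parser = BytesFeedParser()
    f = box.get_file(key)  # by default the "From " line is skipped
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parser.feed(chunk)
    finally:
        f.close()
    return parser.close()
```

Whether this is enough depends on where the memory really goes; if the
parsed message alone is too large, only the headers or the raw bytes can
realistically be processed.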

-- 
Gabriel Genellina



