MemoryError on reading mbox file
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Wed Sep 12 23:52:14 EDT 2007
On Wed, 12 Sep 2007 11:39:46 -0300, Istvan Albert
<istvan.albert at gmail.com> wrote:
> On Sep 12, 5:27 am, Christoph Krammer <redtige... at googlemail.com>
> wrote:
>
>> string = self._file.read(stop - self._file.tell())
>> MemoryError
>
> This line reads an entire message into memory as a string. Is it
> possible that you have a huge email in there (hundreds of MB) with
> some attachment encoded as text?
Printing start, stop and stop - start inside that method would be an easy
way to find out whether that is the case.
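If editing the library source is inconvenient, the same numbers can be
collected from outside. This is only a sketch: message_extents is a
made-up helper name, and it relies on _lookup(), a private method of the
mailbox module that returns the (start, stop) byte offsets of a message,
so it may not work unchanged on every Python version.

```python
import mailbox


def message_extents(path):
    """Return a list of (key, start, stop, size) tuples for an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        extents = []
        for key in box.keys():
            # _lookup() is private to the mailbox module (assumption: it
            # stays available); it returns byte offsets into the file
            # without reading the message body.
            start, stop = box._lookup(key)
            extents.append((key, start, stop, stop - start))
        return extents
    finally:
        box.close()
```

Something like max(message_extents(path), key=lambda e: e[3]) would then
point at the largest message.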
The following idea might help to fix it, or at least avoid reading the
whole message at once:
self._message_factory will eventually call the mailbox.Message
constructor, which also accepts a file object (instead of a huge string).
In that same module there is a utility class, _PartialFile ("A read-only
wrapper of part of a file"). _mboxMMDF.get_file() returns a _PartialFile
object, so I'd try this code (untested!):
    def get_message(self, key):
        """Return a Message representation or raise a KeyError."""
        msg = self._message_factory(self.get_file(key, True))
        msg.set_from(msg.get_unixfrom()[5:])
        return msg
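One way to try the method above without editing mailbox.py is to put it
in a subclass. The following is only a sketch written against Python 3's
mailbox module: StreamingMbox is a made-up name, and the guard for a
missing "From " line is an addition, not part of the suggestion above.
Note that the parsed Message still holds the whole body in memory; this
only avoids the single large read() call, and whether that is enough to
avoid the MemoryError is not verified.

```python
import mailbox


class StreamingMbox(mailbox.mbox):
    """Hypothetical mbox subclass that parses messages from a file object."""

    def get_message(self, key):
        """Return a Message representation or raise a KeyError."""
        # get_file(key, True) keeps the leading "From " line in the stream,
        # so the email parser records it as the message's unixfrom.
        msg = self._message_factory(self.get_file(key, True))
        unixfrom = msg.get_unixfrom()
        # Guard added for this sketch: only strip the prefix if it exists.
        if unixfrom is not None and unixfrom.startswith("From "):
            msg.set_from(unixfrom[5:])
        return msg
```

Usage would be the same as for mailbox.mbox, for example
"for msg in StreamingMbox(path): ...".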
--
Gabriel Genellina