"Help needed - I don't understand how Python manages memory"

Christian Heimes lists at cheimes.de
Sun Apr 20 16:10:52 EDT 2008


Hank @ITGroup wrote:
> In order to deal with 400 thousand texts consisting of 80 million
> words, and huge sets of corpora, I have to be careful about memory
> usage. I need to track every word's behavior, so there needs to be as
> many word-objects as words.
> I am really suffering from the memory problem; even 4 GB of memory is
> not enough... Only 10,000 texts can kill it in 2 minutes.
> By the way, my program has been optimized to ``del`` the objects after
> traversing, in order not to keep the information in memory all the time.

No ordinary system or programming language can hold that much data in
memory at once. Your design is broken; some might even call it insane.

I highly recommend ZODB for your problem. ZODB lets you work with
several GB of data in a transaction-oriented way without the need for
an external database server like Postgres or MySQL. ZODB even supports
clustering and mounting additional databases from the same file system
or an external server.

Christian


