Tremendous slowdown due to garbage collection

Carl Banks pavlovevidence at gmail.com
Mon Apr 14 23:18:06 EDT 2008


On Apr 14, 4:27 pm, Aaron Watters <aaron.watt... at gmail.com> wrote:
> > A question often asked--and I am not a big fan of these sorts of
> > questions, but it is worth thinking about--of people who are creating
> > very large data structures in Python is "Why are you doing that?"
> > That is, you should consider whether some kind of database solution
> > would be better.  You mention lots of dicts--it sounds like some
> > balanced B-trees with disk loading on demand could be a good choice.
>
> Well, probably because you can get better
> than 100x improved performance
> if you don't involve the disk and use clever in memory indexing.

Are you sure it won't involve disk use?  I'm just throwing this out
there, but if you're creating a structure of hundreds of megabytes in
memory, there's a chance the OS will swap it out to disk, which would
defeat any improvement in latency you would otherwise have gotten.
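
A rough way to gauge that risk is to estimate how many bytes the
structure occupies and compare the figure with the physical RAM on the
machine.  The sketch below is only an illustration: deep_sizeof is a
hypothetical helper (not a standard library function), the sample index
is made up, and sys.getsizeof gives approximate, implementation-specific
numbers, so treat the result as a ballpark.

```python
import sys

def deep_sizeof(obj, seen=None):
    """Roughly estimate the bytes used by obj and the objects it contains.

    Hypothetical helper for illustration; it only descends into the
    built-in containers listed below and counts each object once.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # already counted (shared or repeated object)
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size

# Made-up sample data standing in for a large in-memory index.
index = {i: (i, str(i)) for i in range(100_000)}
estimate_mib = deep_sizeof(index) / (1024 * 1024)
print(f"estimated footprint: {estimate_mib:.1f} MiB")
```

If the estimate approaches the amount of free RAM, swapping becomes a
real possibility and the latency advantage may not hold.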

However, that is for the OP to decide.  The reason I don't like the
sort of question I posed is that it's presumptuous--maybe the OP already
considered and rejected this, and has taken steps to ensure the
in-memory data structure won't be swapped--but a database solution
should at least be considered here.
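
As a minimal sketch of what a database-backed alternative to a big dict
might look like, the example below wraps a SQLite table in a dict-like
class.  SQLite stores tables and indexes as B-trees and reads pages from
disk on demand.  The class name DiskDict, the table name kv, and the
text-only keys and values are assumptions made for illustration; this is
not the OP's code, and no performance claim is implied.

```python
import sqlite3

class DiskDict:
    """Hypothetical dict-like wrapper over a SQLite table (illustration only)."""

    def __init__(self, path=":memory:"):
        # Pass a file path to keep the data on disk; ":memory:" is used
        # here only so the example leaves no file behind.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )

    def __setitem__(self, key, value):
        # INSERT OR REPLACE overwrites an existing key, like dict assignment.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )

    def __getitem__(self, key):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]

    def close(self):
        self.conn.commit()
        self.conn.close()

store = DiskDict()
store["alpha"] = "1"
store["beta"] = "2"
store["alpha"] = "3"  # replaces the earlier value for "alpha"
print(store["alpha"], len(store))
```

Whether something like this beats a plain dict depends on the access
pattern and on how much of the data fits in RAM, so it is worth
measuring both.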


Carl Banks