Program very slow to finish
Tim Peters
tim.one at home.com
Mon Nov 5 18:14:45 EST 2001
[Roeland Rengelink]
> Thanks for your comprehensive answer. I hope you'll bear with me, for
> I'm looking for some clarification
I should have said "I expect you're just measuring your platform C's free()
performance", not "cache". I agree that cache effects should have little
bearing on this (beyond very small dicts that actually fit in cache), and
will give some evidence from my Win98SE box below.
> ...
> Here are my numbers again, just for comparison (and I do think you're
> showing an increase in destruction time too, just not as bad)
Yes, but not nearly as bad.
> The algo:
>
> - create dictionary d with N 7-char keys [all mapping to the same value]
> - execute d=None
>
> All times in micro-seconds per item. For the code see end of this post.
> (Linux 2.2.14, 128M RAM, Cel 333 MHz)
>
> size: 10000, creation: 31.00, destruction: 1.49
> size: 20000, creation: 31.10, destruction: 1.57
> size: 50000, creation: 32.77, destruction: 1.76
> size: 100000, creation: 32.00, destruction: 1.92
> size: 200000, creation: 32.59, destruction: 2.38
> size: 500000, creation: 32.12, destruction: 4.35
> size: 1000000, creation: 32.25, destruction: 10.47
>
> ...
> ["head argument" about cache effects elided]
> Now, this may be a very naive argument. But, hey, I'm an astronomer, not
> a computer scientist.
No, it was a good first approximation, but measurement speaks louder than
argument <wink>. I fiddled Python's dict dealloc routine like so:
for (ep = mp->ma_table; fill > 0; ep++) {
if (ep->me_key) {
--fill;
//Py_DECREF(ep->me_key);
//Py_XDECREF(ep->me_value);
/* New code follows. */
ep->me_key->ob_refcnt -= 1;
if (ep->me_value)
ep->me_value->ob_refcnt -= 1;
}
}
This does everything the DECREF macros did *except* to call the platform
free() on the now-dead string-object keys. So all that memory leaks, and
creation times start to zoom as my box runs out of RAM. "destruction" times
stay very steady, though (using time.clock() here, a much better timer on
Windows than time.time()):
size: 10000, destruction: 0.20 (usec/elem)
size: 20000, destruction: 0.20 (usec/elem)
size: 50000, destruction: 0.23 (usec/elem)
size: 100000, destruction: 0.22 (usec/elem)
size: 200000, destruction: 0.22 (usec/elem)
size: 400000, destruction: 0.23 (usec/elem)
size: 600000, destruction: 0.21 (usec/elem)
size: 800000, destruction: 0.22 (usec/elem)
I can't push it much beyond that, as "the usual" result of provking Win9X
into disk thrashing is a system crash (indeed, I endured 4 reboots before
giving up on letting this run thru a million).
So, on this box, all the nastiness is due to MS's free() implementation.
> ...
> You've convinced/reassured me that it can't be Python. I'm not convinced
> it's the cache. That basically leaves free(). I'm curious if anybody
> else has seen this behaviour. I hope some more people try this
You can be sure that everyone who suggested to exit the program via
os._exit() has seen similar behavior in anger <wink>. Alas, every platform
acts differently -- even different flavors of Windows have radically
different malloc/free performance characteristics. Win9X is usually the
worst of the lot, so I'm amazed that your flavor of Linux's performance made
Win9X look stellar in comparison.
I can't make time for it now, but "someone should" try this test using
Vladimir Marangozov's PyMalloc package (shipped with Python 2.1 but turned
off by default -- see the 2.1 NEWS file).
malloc-works-best-when-it-isn't-used<wink>-ly y'rs - tim
More information about the Python-list
mailing list