List problem
Bengt Richter
bokr at oz.net
Sun Oct 31 21:24:37 EST 2004
On Mon, 1 Nov 2004 00:45:24 +0200, aleaxit at yahoo.com (Alex Martelli) wrote:
>Bengt Richter <bokr at oz.net> wrote:
> ...
>> > a[:] = [x for x in a if x != 2]
^^^^-[1]
>> >
>> >is more concise, idiomatic, and speedy. No reason to do low-level
>> >twiddling with indices and fiddly loops, in cases where a list
>> >comprehension can do the job so simply.
>> >
>> Except maybe if the list is huge, and you don't want to generate a
>duplicate, e.g.,
> ...
>> Should also be faster for large lists, IWT.
>
>'shud' is a 4-letter word -- i prefer to measure. Of course, it does
>depend what you mean by large/huge -- I hope 10 million items is enough;
>and what you're doing exactly -- I've kept 'deleting item 2' as it was
>the example that had been used so far.
>
>This is the module for measurement:
>
>def delsome_br(n=1000000, todel=[2]):
> a = range(n)
> i = 0
> for x in a:
> if x not in todel:
> a[i] = x
> i += 1
> del a[i:]
> return a
>
>def delsome_am(n=1000000, todel=[2]):
> a = range(n)
> a = [x for x in a if x not in todel]
^-- shud be a[:] ;-)
(I think the missing return a is in the noise ;-)
>
>and these are the numbers I see:
>
>kallisti:/tmp alex$ python -mtimeit -s'import br' 'br.delsome_br()'
>10 loops, best of 3: 4.44 sec per loop
>kallisti:/tmp alex$ python -mtimeit -s'import br' 'br.delsome_am()'
>10 loops, best of 3: 3.92 sec per loop
>
I thought you had 'way faster machines than mine. Neither time is an improvement
over delsome_br for 2.3.2 on my old box. I wonder what's up.
>Python 2.4b1, of course -- somebody SO interested in performance, as to
>use those six delicate, fragile low-level lines, instead of the solid,
>no-space-for-bugs list comprehension, surely WANTS the performance
>pluses of 2.4, anyway;-).
>
Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def delsome_br(n=1000000, todel=[2]):
... a = range(n)
... i = 0
... for x in a:
... if x not in todel:
... a[i] = x
... i += 1
... del a[i:]
... return a
...
>>> def delsome_am(n=1000000, todel=[2]):
... a = range(n)
... a = [x for x in a if x not in todel]
...
>>> from time import clock
>>> for f in delsome_br, delsome_am:
... print f.__name__
... for i in range(3):
... t0 = clock()
... ignore = f()
... print clock()-t0
...
delsome_br
3.39048024526
3.63548729364
3.6326486655
delsome_am
6.40547292869
6.32403355062
6.21752171923
>I'm sure platforms and tasks can be devised that make delsome_br faster.
>However, personally, I'm not surprised that, on a task I tried to code
>as impartially as possible, the "should be faster" fussy, low-level
a[:] would improve impartiality a smidge ;-) And change the numbers on my box:
delsome_br
3.36129106876
3.63628348399
3.63185750372
delsome_am
6.77888285274
6.76994708267
6.74697154332
>solution ends up taking _longer_ than the simpler, higher-level one;
>it's not always that way, but it matches my general Python experience...
>"simple is good, simple is fast, high-level is simple" memes;-).
>
I agree, really ;-) Ideally optimization should eventually drive equivalent
functionality towards equivalent compiled code anyway.
>The tasks ARE a bit time-consuming, as you see (30 times 4+ seconds is
>over a couple of minutes per variation...), so I'm not going to dwell
>much deeper, but, of course, that's just why I posted everything -- so
>ingenious people can sweat it out and find a way to show the lower level
>version as faster, with the appropriate combination of platform and
>tasks. Some suggestions...:
>
>The trick would be to hit just the right size of list that will make the
I did mention "huge" and "maybe" ;-)
>working set of the LC exceed available physical memory, while leaving
>the low-level approach with a working set that still fits; the thrashing
>will then kill the performance of one, but not the other. To avoid
>having to aim for many tens of millions of items, a machine with scarce
>memory or very ineffective VM algorithms would be better -- some old PC
>with 64M of RAM and win/98 might be ideal, for example. Unfortunately I
>don't have any machine with less than 640 M any more, even the laptops
>(RAM is so cheap, and SO precious to performance, that I can't justify
>stinting on THAT, of all items!), so I can't really help much there.
>
Ok, point made ;-)
Regards,
Bengt Richter
More information about the Python-list
mailing list