One Python 2.1 idea
Tim Peters
tim.one at home.com
Mon Dec 25 23:18:02 EST 2000
[posted & mailed]
[on speeding listcomps]
[Alex Martelli]
> I might be, given that I think list comprehensions *deserve* wider
> usage and I'd be loath to think of people who *might* use them...
> and won't because of a silly 5%-performance-issue (I guess I might
> 'think of it as evolution in action', but, I'm a softy!-).
Cool!
> If I'm right that the issue is allocation, then I'd need a different
> opcode at the start (a build-list variation) and one at the end to
> say 'ok, listbuilding done now, please ensure it's a normal list
> without too much extra fat'; the former would change the append
> method pointer of the list object to {either reallocate exponentially,
> or, probably better, build the list in disjoint pieces -- as nothing
> BUT 'append' will be used on the list until ok-done-now time},
> the latter would (if needed) recompact it up and restore the normal
> append method for possible future uses.
>
> Am I wildly off-base...? If not, it seems to me that procuring
> an acceptable diff-program is going to be the largest part of
> making a patch to submit...:-).
Easy one first: the Cygwin diff works fine on Windows; or if you grab a
copy of the Python source off of SourceForge, cvs diff works OK too. There
are detailed instructions for setting up cvs and ssh for use with
SourceForge on Windows at:
http://python.sourceforge.net/winssh.txt
Harder one: Unless lists grow very large, all evidence to date has said
that one-at-a-time appending displays linear-time behavior on all platforms
examined. Unfortunately, it's the platforms Guido uses that *never*
display worse behavior than that, so it's never been possible to convince
him it can be a drag in some cases.
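Anyone who wants to check the linear-time claim on their own platform can do a rough wall-clock measurement; a sketch in today's Python 3 spelling (the function name is mine, and the numbers are noisy, so treat it as suggestive rather than proof):

```python
import time

def time_appends(n):
    """Wall-clock seconds to build an n-element list by repeated append."""
    t0 = time.perf_counter()
    result = []
    for i in range(n):
        result.append(i)
    return time.perf_counter() - t0

# On platforms with a well-behaved allocator, doubling n should roughly
# double the time -- i.e., each append is amortized constant-time.
for n in (100_000, 200_000, 400_000):
    print(n, time_appends(n))
```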
That said, the builtin map currently uses PyList_SetItem on a preallocated
list when it can(*), while listcomps use a bound .append method which goes
thru listobject.c's hairier ins1. Even if resizing is free, ins1 is simply
more involved than PyList_SetItem so costs more. And it's unclear to me why
builtin_map doesn't use the cheaper-still PyList_SET_ITEM macro.
(*) (Apologies if you already know this) Nothing that's simple, because it's
OK for sequences to lie to map and listcomps about their lengths. For
example,
>>> class LieLow:
        def __len__(self): return 0
        def __getitem__(self, i): return "abc"[i]
>>> class LieHigh:
        def __len__(self): return 1000000
        def __getitem__(self, i): return "xyz"[i]
>>> x, y = LieLow(), LieHigh()
>>> len(x), len(y)
(0, 1000000)
>>> map(lambda i: i, x) + map(lambda i: i, y)
['a', 'b', 'c', 'x', 'y', 'z']
>>> [i for i in x] + [i for i in y]
['a', 'b', 'c', 'x', 'y', 'z']
>>>
It's actually the raising of IndexError that controls how long the results
are. In the "map" example, the map over x falls back to using one-at-a-time
append because len(x) guessed too low about the true length, while the map
over y grossly overallocates a result list and cuts it back at the end.
In short, len() can't be trusted, so preallocation can't rely on it
(nevertheless, it's *usually* correct, and trusting it as far as is possible
pays off big for map).
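The same "trust len() only as a hint" idea survives in today's CPython: list() can use the reported length to preallocate, but the IndexError from __getitem__ is still what actually ends iteration. A Python 3 sketch (the class name is mine, purely illustrative):

```python
class LiesHigh:
    """Old-protocol sequence whose __len__ wildly overstates its extent."""
    def __len__(self):
        return 1000000
    def __getitem__(self, i):
        return "xyz"[i]  # raises IndexError past index 2

seq = LiesHigh()
# len() claims a million elements, but iteration stops at the IndexError,
# so the materialized list has only three.
print(len(seq), list(seq))  # 1000000 ['x', 'y', 'z']
```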
Curiously, listcomps were implemented without any additions to the PVM; in
effect, [f(x) for x in s] is macro expanded to:
anonymous1 = []
__1__ = anonymous1.append
for anonymous2 in s:
    __1__(f(anonymous2))
# result is now anonymous1
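Since that expansion is ordinary Python, it's easy to check that it agrees with the real listcomp (the function wrapper below is mine, just to make the sketch runnable):

```python
def expanded(f, s):
    # Hand expansion of [f(x) for x in s] as described above: fetch the
    # bound append method once, then call it for each element.
    anonymous1 = []
    __1__ = anonymous1.append
    for anonymous2 in s:
        __1__(f(anonymous2))
    return anonymous1

data = [-1, 2, -3]
assert expanded(abs, data) == [abs(x) for x in data]
print(expanded(abs, data))  # [1, 2, 3]
```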
One way to speed that is map-like (where var names are really invisible
temps):
guess = len(s)
result = [None] * guess
i = 0
try:
    while 1:
        elt = s[i]  # this may raise IndexError
        val = f(elt)
        if i < guess:
            result[i] = val
        else:
            result.append(val)
        i += 1
except IndexError:
    pass
if i < guess:
    del result[i:]
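Wrapped in a function (a name I made up, purely to make the sketch runnable and testable), the preallocating strategy handles a len() that lies in either direction:

```python
def prealloc_listcomp(f, s):
    """Evaluate [f(x) for x in s] using len(s) as a preallocation hint."""
    guess = len(s)
    result = [None] * guess
    i = 0
    try:
        while True:
            elt = s[i]              # may raise IndexError at the true end
            val = f(elt)
            if i < guess:
                result[i] = val     # fill a preallocated slot
            else:
                result.append(val)  # len() guessed too low; grow
            i += 1
    except IndexError:
        pass
    if i < guess:
        del result[i:]              # len() guessed too high; trim
    return result

class LiesLow:
    def __len__(self): return 0
    def __getitem__(self, i): return "abc"[i]

print(prealloc_listcomp(str.upper, LiesLow()))  # ['A', 'B', 'C']
```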
There are of course many ways to introduce opcodes to speed that. It's
curious that map() requires that sequences implement __len__ (although it
tolerates them lying), while listcomps currently don't.
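The asymmetry is easy to demonstrate: a sequence that implements __getitem__ but deliberately omits __len__ works fine in a listcomp, which just indexes until IndexError. (In today's Python 3 the point is moot for map() too, since it became lazy and no longer asks for a length. The class name below is mine.)

```python
class NoLen:
    # Old-protocol sequence: indexable from 0, raises IndexError at the
    # end, but defines no __len__ at all.
    def __getitem__(self, i):
        return "abc"[i]

print([c for c in NoLen()])  # ['a', 'b', 'c']
```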
Another way to speed it would be to leave the listcomp implementation alone
but make one-at-a-time append faster. That may be more attractive!
One-at-a-time append is very common in Python programs. Kevin O'Connor was
interested in this back when we were all too busy finishing 2.0 to play
along; unfortunately, he appears to have vanished. A DejaNews search should
turn up a patch of his that sped listcomps by 30% for him this way. That's
both easier and harder <0.5 wink -- see the DejaNews thread for why>.
More information about the Python-list
mailing list