Need an identity operator because lambda is too slow

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sun Feb 18 02:49:43 EST 2007


On Sat, 17 Feb 2007 21:59:18 -0800, Deron Meranda wrote:

> Consider a much-simplified example of such an iteration where
> sometimes you need to transform an item by some function, but most of
> the time you don't:
> 
>     if some_rare_condition:
>         func = some_transform_function
>     else:
>         func = lambda x: x
>     for item in some_sequence:
>         item2 = func(item)
>         ..... # more stuff

This does the test once, but pays for a function call on every iteration.


> Now the "lambda x:x" acts suitably like an identity operator.  But it
> is very slow, when compared to using more complex-looking code:
> 
>     do_transform = some_rare_condition
>     for item in some_sequence:
>         if do_transform:
>             item2 = transform_function(item)
>         else:
>             item2 = item
>         ..... # more stuff

Despite doing the if..else comparison every loop, this is a little faster
for the case where some_rare_condition is false.

But I have to question whether this is a worthwhile optimization,
especially if some_rare_condition really is rare. Calling the identity
function "lambda x: x" one million times takes about half a second; or to
put it another way, each call to the identity function costs you about
half a microsecond. How much time does the rest of the loop processing
take? Are you sure you're optimizing something which needs to be optimized?

(E.g. if your main loop takes fifteen minutes to run, and you're trying to
shave off half a second, just don't bother.)

Compare the following, where I use an arbitrary small function as a
placeholder for whatever work you really want to do:

import timeit

setup = """import time
identity = lambda x: x
do_work = lambda x: time.time() + x
x = 1
"""

Now use timeit to compare doing the work on its own with doing the work
together with an identity function:

>>> timeit.Timer("do_work(x)", setup).repeat()
[3.1834621429443359, 3.1083459854125977, 3.1382210254669189]
>>> timeit.Timer("do_work(identity(x))", setup).repeat()
[3.5951459407806396, 3.6067559719085693, 3.5801000595092773]

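The difference is roughly 0.45 seconds per million calls (one million being
timeit's default number of iterations). Is your "do_work" function really so
fast that one second per two million calls to identity() is a significant load?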
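
The same approach can be pointed at the two loop styles from the quoted
message, so you can see how large the gap is on your own machine. This is
only a sketch: the sequence length, the transform and the helper name
best_time are placeholders I made up, and the numbers will vary with your
hardware and Python version.

```python
import timeit

# Hypothetical stand-ins for the names in the quoted code.
setup = """
some_sequence = list(range(1000))
some_rare_condition = False
some_transform_function = lambda x: x + 1
"""

# Style 1: choose the function once, call it on every item.
loop_with_identity = """
if some_rare_condition:
    func = some_transform_function
else:
    func = lambda x: x
for item in some_sequence:
    item2 = func(item)
"""

# Style 2: test a flag on every item, no call in the common case.
loop_with_flag = """
do_transform = some_rare_condition
for item in some_sequence:
    if do_transform:
        item2 = some_transform_function(item)
    else:
        item2 = item
"""

def best_time(stmt, number=1000):
    # Minimum of three runs of `number` executions each, in seconds.
    return min(timeit.Timer(stmt, setup).repeat(repeat=3, number=number))

t_identity = best_time(loop_with_identity)
t_flag = best_time(loop_with_flag)
print("identity lambda: %.4f s" % t_identity)
print("flag test:       %.4f s" % t_flag)
```

Whatever gap this shows only matters if it is large compared with the real
work done in the "more stuff" part of the loop.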



If some_rare_condition really is rare, then a possible solution might
be:

if some_rare_condition:
    some_sequence = [transform(item) for item in some_sequence]
    # or modify in place if needed... see enumerate
for item in some_sequence:
    do_something_with(item)

This should run as fast as possible in the common case, and slow down only
in the rare condition.
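
If other parts of the program hold a reference to the same list, the
in-place variant mentioned in the comment above could look something like
this. It assumes some_sequence is a mutable list; the helper name
transform_in_place and the sample transform are mine, purely for
illustration.

```python
def transform_in_place(items, transform):
    """Replace each element of the list `items` with transform(element)."""
    for index, item in enumerate(items):
        items[index] = transform(item)

some_sequence = [1, 2, 3]
alias = some_sequence          # a second reference to the same list
transform_in_place(some_sequence, lambda x: x * 10)
print(alias)                   # the change is visible through every reference
```

For a list, the slice assignment
some_sequence[:] = [transform(item) for item in some_sequence]
has the same effect.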

Another solution would be:

if some_rare_condition:
    for item in some_sequence:
        item = transform(item)
        do_something_with(item)
else:
    for item in some_sequence:
        do_something_with(item)

This is probably going to be the fastest of all, but has the disadvantage
that you are writing the same code twice.
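
If the duplication bothers you, one way around it is to keep the loop body
in a single function and only vary what you feed into it. This is a sketch
rather than a recommendation: the names process and run are made up, and
the generator expression adds some overhead of its own in the rare case,
so time it before assuming it matches the hand-duplicated loops.

```python
def process(items, do_something_with):
    # The loop body lives in exactly one place.
    for item in items:
        do_something_with(item)

def run(some_sequence, some_rare_condition, transform, do_something_with):
    if some_rare_condition:
        # Generator expression: items are transformed lazily, one at a
        # time, so no transformed copy of the sequence is built.
        items = (transform(item) for item in some_sequence)
    else:
        items = some_sequence
    process(items, do_something_with)

results = []
run([1, 2, 3], False, lambda x: x * 10, results.append)
run([1, 2, 3], True, lambda x: x * 10, results.append)
print(results)
```

In the common case the loop iterates over the original sequence directly,
with no identity function and no per-item test.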




-- 
Steven.



