[BangPypers] Map vs List comprehensions

Noufal Ibrahim noufal at gmail.com
Mon Aug 1 20:36:17 CEST 2011


Anand Balachandran Pillai <abpillai at gmail.com> writes:

[...]

>  Maybe I should have rephrased it like this.
>
>  - If using anonymous functions, prefer list comps over map.
>  - If using built-in functions (C functions), prefer map over list comps.
> - If using pythonic user functions, there is nothing much to choose among
>    these two, since the difference is  not much.

This is useful rule of thumb. 

I still don't understand how in your original email, the map using a
user defined function (sqr) was faster than mapping using an inbuilt C
function (hex). My tests show the opposite. 

> The reason - List comprehensions perform best, if all the calculations
> are inline within the two square brackets. Hence it scores over map
> in the lambda use-case but doesn't differ much if the function is
> declared outside.
>
>>>> def f1(): [lambda x: x*x for x in range(100)]
> ...
>>>> def f2(): map(lambda x: x*x, range(100))
> ...
>>>> mytimeit.Timeit(f1)
> '24.80 usec/pass'
>>>> mytimeit.Timeit(f2)
> '37.44 usec/pass'

This is kind of weird. Your numbers are correct. I disassembled the 2
functions and see this

>>> dis.dis(f1)
  1           0 BUILD_LIST               0
              3 DUP_TOP             
              4 STORE_FAST               0 (_[1])
              7 LOAD_GLOBAL              0 (range)
             10 LOAD_CONST               1 (100)
             13 CALL_FUNCTION            1
             16 GET_ITER            
        >>   17 FOR_ITER                16 (to 36)
             20 STORE_FAST               1 (x)
             23 LOAD_FAST                0 (_[1])
             26 LOAD_CONST               2 (<code object <lambda> at 0x7fc29eea7378, file "<stdin>", line 1>)
             29 MAKE_FUNCTION            0
             32 LIST_APPEND         
             33 JUMP_ABSOLUTE           17
        >>   36 DELETE_FAST              0 (_[1])
             39 POP_TOP             
             40 LOAD_CONST               0 (None)
             43 RETURN_VALUE        

The loop as far as I understand is between 17 and 36. The function is
"made" in every iteration. I don't know if there's an optimiser that
pulls this out before it's run but in this way, it looks pretty
inefficient. Am I correct here? 

With f2.
>>> dis.dis(f2)
  1           0 LOAD_GLOBAL              0 (map)
              3 LOAD_CONST               1 (<code object <lambda> at 0x7fc29eebe378, file "<stdin>", line 1>)
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              1 (range)
             12 LOAD_CONST               2 (100)
             15 CALL_FUNCTION            1
             18 CALL_FUNCTION            2
             21 POP_TOP             
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE        

it looks much more simple, load the functions, create the lambda and
then just delegate to map and pop/return the result. This intuitively
seems like it should be faster. 

> The gain is significant.
>
> Whereas ,
>
>>>> def sqr(x): return x*x
> ...
>>>> def f1(): [sqr(x) for x in range(100)]
> ...
>>>> def f2(): map(sqr, range(100))
> ...
>>>> mytimeit.Timeit(f1)
> '37.23 usec/pass'
>>>> mytimeit.Timeit(f2)
> '36.72 usec/pass'
>
> The gain is minuscule, barely noticeable.

This is reasonable. There's a LOAD_GLOBAL that happens in each iteration
which looks up the function outside the scope of the comprehension. I
remember reading how slow this is and so it would kill the performance
of the list comprehension.

[...]


-- 
~noufal
http://nibrahim.net.in

The first condition of immortality is death. -Stanislaw Lec


More information about the BangPypers mailing list