Performance of list comprehensions vs. map

Chris Barker chrishbarker at home.net
Thu Sep 6 20:30:49 EDT 2001


Skip Montanaro wrote:

> I think you're still missing what I was saying.  map() *can't* possibly have
> the same semantics as for loops precisely because it doesn't return control
> to the interpreter where the name->object lookup takes place.

I get that now. 

> Could list
> comprehensions behave that way?  Yes,

OK, at least I have that right.

> but in my opinion (and apparently in
> the opinion of the various powers that be), that would be wrong.

I guess I am having a hard time seeing why that would be wrong. I just
took a look at the PEP, and having been implimented already, it is
pretty sparce, but I see:

"""
Rationale

    List comprehensions provide a more concise way to create lists in
    situations where map() and filter() and/or nested loops would
    currently be used.
"""

except map() and filter work differently than nested loops do, as we are
talking about here. Which functionality should be emulated?

and:

"""
    - The form [... for x... for y...] nests, with the last index
      varying fastest, just like nested for loops.
"""

This seems to be referring only to the order of the indexing.

So the PEP just doesn't answer the question. From the reference manual:

"""
When a list comprehension is supplied, it consists of a single
expression followed by at least one for clause and zero or more for or
if clauses. In this case, the elements of the new list are those that
would be produced by considering each of the for or if clauses a block,
nesting from left to right, and evaluating the expression to produce a
list element each time the innermost block is reached.   
"""

This does say that the expression is evaluated "each time", which does
mean the current bindings of the objects in the expression will be used.
Without the PEP details, it's hard for me to know if this was a concious
design choice, or an aftifact of easy implimentaion


>If you don't like all that dynamism, maybe Python's not the language for you.

I like most of that dynamicism, but maybe not all....I do like Python a
whole lot, however.

Your example did make it clear how that kind of re-binding can be used,
and I can imagine it would be a useful feature (though I sure can't
figure out a use now!). However, you could always use for loops, if you
want that, and it would be more clear. Having the bindings of the
expression in a list comprehension fixed when the expression is
evaluated the first time would provide a lot less potential for
confusion, and also provide optimization options. Having it work EXACTLY
like a nested for loop makes it only syntactic sugar, and therefore less
useful. I do like the sugar however, so I'm still glad it's there, and
it's probably too late to change it anyway.

To use your example, I would like to see the following list
comprehension:

[c(i) for i in range(n)]


behave like:
l = []
fun = lambda i,f = c : f(i)
for i in range(n):
    l.append(fun(i))

instead of:
l = []
for i in range(n):
    l.append(c(i))
print "forloop:", l

i.e. bind the generating expression once at the beginning.


> I like it and am interested in speeding up the language as it exists, not some
> other language with Python's syntax but less dynamic semantics.

Me too. honestly. I am hoping that homogenous sequences could help
achieve that.
 
> This is not an ideal world.  The fact that map() is a function (defined in C
> or Python makes no difference) determines that.

OK. I was confused a bit there. The issue is that map is a function, and
list comprehensions are not. They are, however, new, and could have
behaved either way.

>     >> >> But there is no guarantee that the list you are operating on isn't
>     >> >> modified by another thread during execution.
>     >>
>     Chris> but there should be !!
>     >>
>     >> Why here and not elsewhere in the language?
> 
>     Chris> I thought you said that map() doesn't allow it?
> 
> map() gets a snapshot at the time it's called.  The binding of the name
> "str" can change during map()'s execution.

What about the input list? If I call a:

l2 = map(function, l1)

and another thread is altering l1 at the same time, will that effect l2?
This is what is scary to me.

>I suggested threads as a
> way that object bindings could change during the execution of a chunk of
> code, but the example above demonstrates it just as well without using
> threads.

well, it demonstrates re-binding of the generating function, not the
source list being altered.. the later I find a lot more scary, and can't
imagine how it could happen in a single thread.


> I'm not familiar with PEP 209 (one can't read everything).  In scanning the
> abstract it looks like the authors propose a redesign of Numeric, not
> incorporation of Numeric into the language.

You're right, but I'm pretty sure making it a part of the standard
library is one of the goals.

> I was suggesting that perhaps
> the array object could be elevated to the same status as lists, tuples and
> dicts, with appropriate changes made to the virtual machine to take
> advantage of arrays' homegeneity.

Yes, that is not proposed. I've posted a note to the NumPy list to see
if anyone thinks this is worthwhile.

>  Moving array into the language would
> probably be a lot more modest bit of work than moving Numeric into the
> language.

Yes, but Numeric arrays are SO much more. The multidimensionality, the
ufuncs, the array-oriented math, ...  I don't think it would be worth it
without all that. In fact, a lot of performance is gained by using NumPy
arrays, so there would be less need for VM changes to get perfomance.

> If I wrote a module in Python then compiled it to C, I would sure want it to
> behave the same.  Otherwise, the compilation process would probably
> introduce subtle bugs into the code.  I'd hate to have to debug the
> generated C code.

Sure. that's why it would need to be part of Python, not just extra
directives to Py2C. (this is not just my twisted, make-python-static
idea:  Guido presented a proposal quite a while ago:
anyhttp://www.python.org/~guido/static-typing/. I don't know if he has
changes his mind, but it doesn't seem to be at the top of the list
either) However, even if it were extra directives, I'm not sure that
would be  worse than what you have to do when you right your own C
extension. (except de-bugging generated code rather than your own: good
point!)

Well, no one else seems to be paying any attention to our jabbering here
so I don't htink we need to continue.

Thanks for your thoughts, I have learned a lot.

-Chris





-- 
Christopher Barker,
Ph.D.                                                           
ChrisHBarker at home.net                 ---           ---           ---
http://members.home.net/barkerlohmann ---@@       -----@@       -----@@
                                   ------@@@     ------@@@     ------@@@
Oil Spill Modeling                ------   @    ------   @   ------   @
Water Resources Engineering       -------      ---------     --------    
Coastal and Fluvial Hydrodynamics --------------------------------------
------------------------------------------------------------------------



More information about the Python-list mailing list