[Tutor] list (in)comprehensions

Remco Gerlich scarblac@pino.selwerd.nl
Wed, 24 Jan 2001 15:11:55 +0100


On Wed, Jan 24, 2001 at 07:31:03PM +0900, kevin parks wrote:
> The other thing that i am failing to understand is Python's new list
> comprehensions. They are new to 2.0 so are not discussed in any of the books
> that i have and the www.python.org site has very little on them, except for
> a few examples of someone fooling around with the interpreter. Some
> explanation would be helpful. I have been hacking lots of list processing
> tools and i have a feeling that they could all be rewritten more easily with
> list comprehensions if only i could (i can't say it) C O M P R E H E N D
> them (ok i said it). What are list comprehensions good for, what do they
> make easier, how to use them, etc. As a guy who uses map and filter
> everyday, it seems like something i better know. I sorely wish there was a
> howto with code and someone talking to me too! (Python documentation
> shouldn't just be a screenshot of someone's interpreter, though including an
> interactive session is helpful).

You already know map and filter, and a fair bit of Python besides, so I'll try
to explain list comprehensions by comparing them to those.

Say you want a list of the squares of the numbers 0 to 9.

You could do that with a for loop:
squares = []
for x in range(10):
   squares.append(x**2)
   
Or with a map/lambda call:
squares = map(lambda x: x**2, range(10))

What both of these do, conceptually, is "take x's from the list range(10), and
for each one of them, add x**2 to the list".

With list comprehensions, you write that like this:
squares = [x**2 for x in range(10)]

For each x in the list 'range(10)', it computes the expression 'x**2', and
puts the results in a list.

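This is usually faster than the for loop and more readable than map/lambda.
It's also shorter than both alternatives. (One caveat: in Python 2.x the
variable 'x' is still left behind in your namespace afterwards, just as with
the for loop; only from Python 3 on does it exist solely inside the list
comprehension.)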
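
As a quick check that the three spellings really agree, here is a minimal
sketch. It assumes a Python 3 interpreter, where map returns an iterator and
has to be wrapped in list(); the names squares_loop, squares_map and
squares_comp are made up for this illustration.

```python
# Three ways to build the squares of 0..9.

# 1. Explicit for loop
squares_loop = []
for x in range(10):
    squares_loop.append(x**2)

# 2. map/lambda (list() is needed on Python 3, where map is lazy)
squares_map = list(map(lambda x: x**2, range(10)))

# 3. List comprehension
squares_comp = [x**2 for x in range(10)]

print(squares_loop == squares_map == squares_comp)
```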

Also, lambda has its own namespace, and in Python 2.0 a lambda inside a
function can't see that function's local variables, meaning you sometimes
need to do strange default parameter tricks to make it work. Say, if the
power isn't 2, but is stored in the variable 'power':

power = 4
powers = map(lambda x, power=power: x**power, range(10))

List comprehensions do not have this problem:
power = 4
powers = [x**power for x in range(10)]

works fine.
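
To see both forms side by side inside a function, which is where the scoping
question actually comes up, here is a small sketch. The function names are
hypothetical, and it assumes Python 3 (hence list() around map); on such an
interpreter the default argument is no longer required, but it is harmless.

```python
def powers_with_map(power, n=10):
    # The default argument copies 'power' into the lambda's own namespace.
    return list(map(lambda x, power=power: x**power, range(n)))

def powers_with_comprehension(power, n=10):
    # The comprehension simply uses 'power' from the enclosing function.
    return [x**power for x in range(n)]

print(powers_with_map(4) == powers_with_comprehension(4))
```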

It's also possible to have more than one 'for' clause. For instance, if you
want to have a list of all the tuples (x,y) for 0 <= x,y <= 5 (like (1,3),
(0,4), but not (2,6)), you could make this for loop:

tuples = []
for x in range(6):
   for y in range(6):
      tuples.append((x,y))

This is the same as this list comprehension:
tuples = [(x,y) for x in range(6) for y in range(6)]
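
The order of the 'for' clauses matters: the leftmost one is the outermost
loop, so the rightmost variable changes fastest. A small sketch, using
range(3) instead of range(6) to keep the output short (tuples_loop and
tuples_comp are names invented for this example):

```python
# Nested loops: x is the outer loop, y the inner one.
tuples_loop = []
for x in range(3):
    for y in range(3):
        tuples_loop.append((x, y))

# Same order of 'for' clauses, read left to right.
tuples_comp = [(x, y) for x in range(3) for y in range(3)]

print(tuples_comp[:4])   # [(0, 0), (0, 1), (0, 2), (1, 0)]
```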


I hope this is clear.

There is also the 'if' feature, which is similar to 'filter'. What if you
only want the *odd* squares of the numbers 0 to 9?

This is the for loop:
squares = []
for x in range(10):
   if x%2 == 1:
      squares.append(x**2)

Map/lambda/filter:
squares = map(lambda x: x**2, filter(lambda x: x%2 == 1, range(10)))

(note how complicated that gets)

And the list comprehension is:
squares = [x**2 for x in range(10) if x%2 == 1]

So you use the 'if' clause to select suitable values from the list, like
filter does, but you don't need a lambda. You can have multiple if clauses;
the value then has to satisfy all of them.
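
A minimal sketch of that last point: chaining two 'if' clauses selects the
same values as a single 'if' joined with 'and'. The names a and b and the
second condition are picked just for this example.

```python
# Two 'if' clauses in a row...
a = [x**2 for x in range(20) if x % 2 == 1 if x % 3 == 0]

# ...select the same values as one 'if' with 'and'.
b = [x**2 for x in range(20) if x % 2 == 1 and x % 3 == 0]

print(a)   # [9, 81, 225]
```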

That's about it. Since 2.0 I've noticed that list comprehensions can become
too complicated quite quickly as well, and that for loops are pretty nice
for readability. But I usually prefer list comprehensions over
map/lambda/filter combinations, especially when the lambda would need a
default argument.

From an mp3player script I wrote earlier today, a real-life example: I need a
list of the full pathnames of the .mp3 files in directory d (and I have
directories named *.mp3 as well, hence the isfile check):

import os

# d is the directory to scan
listdir = [os.path.abspath(os.path.join(d, x)) for x in os.listdir(d)]
mp3s = [x for x in listdir
          if x.endswith('.mp3')
          if os.path.isfile(x)]

(This could be one list comprehension except the other files in listdir were
used elsewhere)
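
For completeness, here is the same idea wrapped in a function, as a sketch
rather than the code from the actual script. The name list_mp3_files is
hypothetical, and os.path.join is used in place of gluing the path together
with '/'.

```python
import os

def list_mp3_files(d):
    """Return full paths of regular files in d whose names end in '.mp3'."""
    # Full path of every entry in the directory, files and subdirectories alike.
    paths = [os.path.abspath(os.path.join(d, name)) for name in os.listdir(d)]
    # Keep entries that end in '.mp3' and are real files, not directories.
    return [p for p in paths
            if p.endswith('.mp3')
            if os.path.isfile(p)]
```

Calling list_mp3_files('.') would then give the .mp3 files in the current
directory, skipping any directory that happens to be named something.mp3.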

(Note that I always used x for the variable name in these examples, but it
can actually be anything)

(heck, i thought this was a long post, but alex martelli has posts thrice
this long *on average*..., I think)

Well that should be enough. Back to work :)

-- 
Remco Gerlich