Generator expressions v/s list comprehensions

Mon Aug 30 03:05:13 EDT 2004

Mahesh Padmanabhan <mahesh at privacy.net> wrote:
   ...
> What is "Python's applicative order evaluation" and how do generator 
> expressions get around it?

I see you've received several answers but it doesn't seem that they
address this point, so let me try.

Python, like most programming languages, specifies that evaluation is in
"applicative order": each argument to a function (or operator, etc) is
entirely evaluated before the function executes (Python, like several
other languages, makes a specific ad hoc exception for 'short-circuiting
operators', namely 'and' & 'or', which may entirely omit evaluating one
argument in some cases).  This is also known as 'strict' or 'eager'
evaluation and by other names yet (some draw subtle distinctions between
the exact connotation of such names but for simplicity we'll treat them
as synonyms, and similarly for terms introduced in the next paragraph).

Haskell, where the concept of list comprehensions originated, is
different than most languages: Haskell specifies that evaluation is in
"normal order" -- each argument and PART of an argument is evaluated at
the time, if ever, at which that value is needed to compute.  This is
also known as 'non-strict' or 'lazy' evaluation.  The reason this
feature of Haskell is pretty unusual is that implementing it is somewhat
complicated and costly -- in a lazy-evaluation language's implementation
there are typically plenty of 'thunks' of code being stashed here and
yon for possible future execution.  I don't even know of any lazy
evaluation language that does not rely on the crucial simplification of
the computation environment given by 'functional programming' aka FP
(basically, the idea that all objects are immutable -- a program
computes new objects, it never mutates existing objects)... and Python
is not a FP language (it crucially uses mutable data, particularly
mutable lists), so it would be astonishing if Python had the kind of
generalized support for lazy evaluation that you see in Haskell.

So, Python copied Haskell's list comprehensions, but didn't (couldn't,
probably!) copy the lazy evaluation aspect.  So, whenever Python
evaluates a LC, it evaluates it fully -- it necessarily materializes the
whole list in memory, say N items' worth.  This is intrinsically O(N) in
space and therefore AT BEST can be O(N) in time (it will be worse than
O(N) unless each step is constant-time, i.e. O(1)...!).  This doesn't
apply to Haskell, thanks to lazy evaluation -- in Haskell, each item of
a list comprehension will be evaluated only if and when it's needed,
just like any other operand/arguments/etc... but Python can't do that.

Which is why Python 2.4 introduced generator comprehensions -- in a way
a closer match to Haskell's list comprehensions, than Python's own LCs!
That's because generators play the role of 'thunks' -- pieces of code
hanging out there in suspended animation, ready for a bit more execution
if and when needed -- for the specific but crucially important case of
sequential iteration over (some prefix of) the items of a (potentially
unbounded) sequence [[a case which you will also find in other
programming models under such names as 'streams' -- Unix pipes also
implement some part of that, with various processes in a pipeline
playing the role of the thunks in this case...]].  Generators (and more
generally iterators, and more specifically generator comprehensions)
give Python "just enough lazy evaluation to get by" -- without paying
the full conceptual and practical costs in switching to "full laziness"
(and totally immutable data), Python thereby nevertheless gets the
ability to express some (vast and important) subset of those algorithms
which non-strict evaluation makes so elegant to express.

Alex