Performance of list comprehensions vs. map
Skip Montanaro
skip at pobox.com
Wed Sep 5 15:21:27 EDT 2001
Chris> I just took another look at:
Chris> http://musi-cal.mojam.com/~skip/python/fastpython.html
Chris> specifically, the loops section:
fp.html> List comprehensions were added to Python in version 2.0 as
fp.html> well. They provide a syntactically more compact way of writing
fp.html> the above for loop:
fp.html> newlist = [s.upper() for s in list]
fp.html> It's not really any faster than the for loop version, however.
Chris> my question is: Why not?
map() is a single function that doesn't return control to the interpreter
until its work is done. List comprehensions are essentially just syntactic
sugar for the equivalent for loop. Both the for loop and list comprehension
are implemented as loops using multiple virtual machine instructions.
Perhaps the easiest way to see this is to write three equivalent functions
and disassemble them:
def map_str(n):
    return map(str, range(n))

def lc_str(n):
    return [str(i) for i in range(n)]

def for_str(n):
    l = []
    append = l.append
    for i in range(n):
        append(str(i))
    return l

assert map_str(5) == lc_str(5) == for_str(5)
import dis
print "disassembling map_str"
dis.dis(map_str)
print "disassembling lc_str"
dis.dis(lc_str)
print "disassembling for_str"
dis.dis(for_str)
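On a modern interpreter you can make the same point programmatically instead
of eyeballing the disassembly: dis.get_instructions (Python 3.4+) shows which
functions contain a bytecode-level loop. A sketch (has_loop is my own helper,
not a library function; list(map(...)) is used so it also runs where map is
lazy):

```python
import dis

def has_loop(code):
    """Return True if a code object (or any comprehension code object
    nested inside its constants) contains the FOR_ITER loop opcode."""
    if any(ins.opname == "FOR_ITER" for ins in dis.get_instructions(code)):
        return True
    return any(has_loop(c) for c in code.co_consts if hasattr(c, "co_code"))

def map_str(n):
    return list(map(str, range(n)))

def lc_str(n):
    return [str(i) for i in range(n)]

print(has_loop(map_str.__code__))  # False - the loop runs inside map, in C
print(has_loop(lc_str.__code__))   # True - the comprehension loops in bytecode
```

The recursion into co_consts matters because, before Python 3.12,
comprehensions compile to a nested code object rather than inline bytecode.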
Chris> I like list comprehensions a lot, and they would seem to offer a
Chris> way to get optimization over a for loop, much like map does, but
Chris> apparently not. If map can do the loop in C, why can't a list
Chris> comprehension?
In theory, an optimizer could identify and replace some list comprehensions
with calls to map, but they wouldn't be quite equivalent. In the examples
above, there is no guarantee that the definition of the str function doesn't
change during the execution of either the for loop or the list comprehension
(think multiple threads of execution ;-). The call to map, on the other
hand, binds str once before map is called.
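A small sketch of that difference (slow and fast are names I invented for
illustration): the comprehension looks the function name up on every
iteration, while map evaluates its function argument exactly once, before
the loop starts.

```python
def fast(x):
    return x * 100

def slow(x):
    global slow
    slow = fast          # rebind the global name on the first call
    return x * 2

# The comprehension looks up the name "slow" on each iteration,
# so the rebinding takes effect mid-loop:
lc = [slow(x) for x in range(3)]
print(lc)                # [0, 100, 200]

# Reset, then hand the function object to map.  map binds the object
# once, up front, so rebinding the name changes nothing:
def slow(x):
    global slow
    slow = fast
    return x * 2

mapped = list(map(slow, range(3)))
print(mapped)            # [0, 2, 4]
```

In the email's terms: the comprehension behaves like the for loop (a fresh
lookup of str on every pass), while map captured str before it began.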
Chris> I think a big stumbling block is that Python has no concept of
Chris> the "homogeneous" sequence. There are modules that provide such a
Chris> thing (Numeric, array), and indeed, the string type is a
Chris> homogeneous sequence, but Python itself does not understand this
Chris> concept, so that if a map or list comprehension is using a
Chris> homogeneous list, it still has to check the type of every element
Chris> as it goes through the list. For example:
Chris> [2*a for a in list]
Chris> If the interpreter could just check once what the type of all the
Chris> elements of list were, it would only have to figure out what 2*a
Chris> meant once, and it could whip through the list very
Chris> quickly. Indeed, if you do:
Chris> [2*a for a in a_string]
But there is no guarantee that the list you are operating on isn't modified
by another thread during execution.
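That hazard doesn't even require threads. A sketch (double_and_grow is an
invented name) in which the function being applied mutates the source list
while the comprehension is still iterating over it:

```python
data = [1, 2, 3]

def double_and_grow(a):
    if a == 1:
        data.append(99)   # mutate the list mid-iteration
    return 2 * a

result = [double_and_grow(a) for a in data]
print(result)             # [2, 4, 6, 198] - the appended element is visited
```

Because the length and contents of the list can change under the
interpreter's feet like this, it cannot safely hoist per-element type checks
out of the loop.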
I've been thinking of this a little in the context of Ken Simpson's recently
announced PyInline module. The biggest reason I can't use it right now
(ignoring its pre-alpha version number ;-) in my own code is precisely that
I operate on dicts, lists and tuples a lot. I'm sure that's true for other
Python programmers. The problem of easily writing C code that can operate
on lists (for example) boils down to deciding how you will map Python's
lists to something that is efficiently representable in C. Perl's inline
module does this with something called typemaps. (Look in something like
/usr/local/lib/perl5/5.6.1/ExtUtils for a file called "typemap".) I don't
really understand everything that's going on in that file, but it does
provide mappings. For instance, a "char **" argument followed by an
"unsigned int" arg might map to and from a list of strings and a length, an
"int *" followed by "unsigned int" might map to and from a list of ints,
etc. I'm not sure quite how you'd do dicts (or if you could in a reasonable
fashion without listifying them somehow).
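For a modern illustration of the kind of mapping a typemap encodes, ctypes
(not available in 2001, so treat this as an after-the-fact sketch) can build
exactly the (char **, unsigned int) pair described above from a Python list
of strings:

```python
import ctypes

# Map a Python list of strings to a C-side char ** plus a length:
strings = ["spam", "eggs"]
c_strings = (ctypes.c_char_p * len(strings))(*[s.encode() for s in strings])
c_length = ctypes.c_uint(len(strings))

print(c_strings[0], c_strings[1], c_length.value)
```

The conversion is explicit and per-call; a typemap mechanism would let an
inline-C tool generate this marshalling automatically.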
I suspect this is about as close as Python will ever come to understanding
homogeneous containers at the language level.
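For what it's worth, the array module Chris mentions is the standard
library's closest approximation today: it stores homogeneous numeric data in
a C array, though Python still boxes each element on access. A minimal
sketch:

```python
from array import array

ints = array("i", range(5))   # typecode "i": C signed int storage
ints.append(7)
print(ints.typecode, list(ints))

# The type constraint is enforced at insertion time:
try:
    ints.append("not an int")
except TypeError:
    print("rejected non-int")
```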
Chris> I've proposed similar ideas in the past, and not gotten much
Chris> response. I imagine my ideas are full of holes, but no one has
Chris> taken the time to point them out to me yet. I have come to the
Chris> conclusion that the really sharp minds on this list (and the
Chris> folks that might be qualified to actually implement such a thing)
Chris> are not really interested in something like this that is a
Chris> performance-only improvement. Am I right? Or is the idea just so
Chris> stupid that it's not worth commenting on?
It's not so much a lack of interest as a desire to preserve Python's
semantics, plus a simple lack of time. Here are some efforts I'm aware of.
Note that there are lots of ways performance can be addressed, not just by
constraining types.
  * Rattlesnake (my thing) - a new register-oriented virtual machine whose
    premise is that the current stack-oriented virtual machine does far
    too much data shuffling.

  * Psyco (Armin Rigo) - a run-time specializing virtual machine that sees
    what sorts of types are input to a function and compiles a type- or
    value-specific version of that function on-the-fly.  I believe Armin
    is looking at some JIT code generators in addition to or instead of
    another virtual machine.

  * Static typing SIG (http://www.python.org/sigs/types-sig/) - I'm not
    familiar with this SIG's work, so you'll have to browse the archives.

  * Python2C (Bill Tutt, Greg Stein) - a "module compiler" that generates
    a C version of a Python module.  http://www.mudlib.org/~rassilon/p2c/

  * PEP 266 - a "pie in the sky" PEP I wrote that proposes that global
    data could be accessed like local data if you reverse the roles of who
    keeps references up-to-date.

  * Various peephole optimizers - I wrote one a few years ago.  There's
    also Michael Hudson's bytecodehacks project (on Sourceforge).  The
    xfreeze bytecode freezer (part of the standard distribution?) also
    does some peepholing.
I'm sure there are several I'm missing.
--
Skip Montanaro (skip at pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/