An IMHO interesting optimization question...

Greg Chapman glchapman at earthlink.net
Thu Apr 25 09:57:25 EDT 2002


On Wed, 24 Apr 2002 19:08:53 -0500, Skip Montanaro <skip at pobox.com> wrote:

>The cost to run a loop is the startup/tear down cost plus the cost of
>running all the iterations.  I believe the startup/tear down cost of map()
>is fairly high, while its per-iteration cost is low (high and low being
>relative to the cost of the same work using a for loop).  You have to
>amortize that startup cost over a large number of iterations for it to pay
>off.  My guess is that your wordExp.findall() call only returns a list of at
>most a few elements.

My understanding is that the real problem with map is the overhead of calling a
Python function (i.e., a function implemented in Python).  map called with a
function implemented in C (e.g., repr) will generally be faster than the
equivalent Python for loop, while map with a Python function will generally be
slower.  For example, on my system both of the for-loop variants below are
faster than the map-with-Python-function version (even though the second
for-loop version does a global function lookup on every iteration).  The map
with a C function is the fastest:

import time

def pyrepr(obj):
    return `obj`

def map_cfunc(objs):
    return map(repr, objs)

def map_pyfunc(objs):
    return map(pyrepr, objs)

def for_operator(objs):
    res = [''] * len(objs)
    i = 0
    for obj in objs:
        res[i] = `obj`
        i += 1
    return res

def for_globalfunc(objs):
    res = [''] * len(objs)
    i = 0
    for obj in objs:
        res[i] = repr(obj)
        i += 1
    return res

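def test():
    tests = [map_cfunc, map_pyfunc, for_operator, for_globalfunc]
    items = range(10000)
    for func in tests:
        start = time.clock()
        # call each variant 10 times over the same 10000-item list
        for i in range(10):
            func(items)
        stop = time.clock()
        # total elapsed time for the 10 calls, in milliseconds
        print ('%14s:' % func.__name__), (stop-start)*1000

if __name__ == '__main__':
    test()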
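
For anyone rerunning this comparison on a current interpreter, here is a rough
Python 3 sketch of the same benchmark. It is an adaptation, not the code I
timed above, and it rests on a few assumptions: backticks no longer exist in
Python 3, so the for_operator variant is dropped; map() is lazy there, so it is
wrapped in list() to force the work; and timeit stands in for time.clock(),
which was removed in Python 3.8. The bench() helper and its size parameter are
names introduced for this sketch only. Running it with both a tiny list and a
large one is one way to look at the startup-cost question from the quoted text
alongside the per-call overhead.

```python
import timeit

def pyrepr(obj):
    # Python-level wrapper around repr(): every call pays the cost of
    # entering a function implemented in Python.
    return repr(obj)

def map_cfunc(objs):
    # map() is lazy in Python 3, so list() forces it to do the work.
    return list(map(repr, objs))

def map_pyfunc(objs):
    return list(map(pyrepr, objs))

def for_globalfunc(objs):
    # Preallocate the result and fill it in, looking up the global
    # name repr on every iteration.
    res = [''] * len(objs)
    for i, obj in enumerate(objs):
        res[i] = repr(obj)
    return res

VARIANTS = [map_cfunc, map_pyfunc, for_globalfunc]

def bench(size, repeat=5, number=10):
    """Return {variant name: best observed seconds per call} for `size` ints."""
    items = list(range(size))
    results = {}
    for func in VARIANTS:
        timer = timeit.Timer(lambda: func(items))
        # Take the minimum over several repeats to reduce timing noise.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[func.__name__] = best / number
    return results

if __name__ == '__main__':
    # A tiny list and a large one, to separate fixed cost from per-item cost.
    for size in (3, 10000):
        print('size = %d' % size)
        for name, secs in bench(size).items():
            print('%16s: %10.3f us per call' % (name, secs * 1e6))
```

The numbers will vary by machine and interpreter version, so treat any ranking
it prints as a measurement on your own system rather than a general result.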


---
Greg Chapman



