Pre-pep discussion material: in-place equivalents to map and filter

Chris Angelico rosuav at gmail.com
Thu Nov 3 17:34:34 EDT 2016


On Fri, Nov 4, 2016 at 4:00 AM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> On Fri, 4 Nov 2016 01:05 am, Chris Angelico wrote:
>
>> On Thu, Nov 3, 2016 at 7:29 PM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>>> lst = map (lambda x: x*5, lst)
>>>> lst = filter (lambda x: x%3 == 1, lst)
>>>> And perform especially bad in CPython compared to a comprehension.
>>>
>>> I doubt that.
>>>
>>
>> It's entirely possible. A list comp involves one function call (zero
>> in Py2), but map/lambda involves a function call per element in the
>> list. Function calls have overhead.
>
> I don't know why you think that list comps don't involve function calls.

List comps themselves involve one function call (zero in Py2). What
you do inside the expression is your business. Do you agree that list
comps don't have the overhead of opening and closing files?

files = "/etc/hostname", "/etc/resolv.conf", ".bashrc"
contents = [open(fn).read() for fn in files]

In a comparison between comprehensions and map, this cost is
irrelevant, unless your point is that "they're all pretty quick".

> Here's some timing results using 3.5 on my computer. For simplicity, so
> folks can replicate the test themselves, here's the timing code:
>
>
> from timeit import Timer
> setup = """data = list(range(10000))
> def func(x):  # simulate some calculation
>     return {x+1: x**2}
> """
> t1 = Timer('[func(a) for a in data]', setup=setup)
> t2 = Timer('list(map(func, data))', setup=setup)

This is very different from the original example, about which the OP
said that map performs badly, and you doubted it. In that example, the
list comp includes the expression directly, whereas map (by its
nature) must use a function. The "inline expression" of a
comprehension is more efficient than the "inline expression" of a
lambda function given to map.

> And here's the timing results on my machine:
>
> py> min(t1.repeat(number=1000, repeat=7))  # list comp
> 18.2571472954005
> py> min(t2.repeat(number=1000, repeat=7))  # map
> 18.157311914488673
>
> So there's hardly any difference, but map() is about 0.5% faster in this
> case.

Right. As has often been stated, map is perfectly efficient, *if* the
body of the comprehension would be simply a function call, nothing
more. You can map(str, stuff) to stringify a bunch of things. Nice.
But narrow, and not the code that was being discussed.

> Now, it is true that *if* you can write the list comprehension as a direct
> calculation in an expression instead of a function call:
>
>     [a+1 for a in data]
>
> *and* compare it to map using a function call:
>
>     map(lambda a: a+1, data)
>
>
> then the overhead of the function call in map() may be significant.

Thing is, this is extremely common. How often do you actually use a
comprehension with something that is absolutely exactly a function
call on the element in question?

> But the
> extra cost is effectively just a multiplier. It isn't going to change
> the "Big Oh" behaviour.

Sure it doesn't. In each case, the cost is linear. But the work is
linear, so I would expect the time to be linear.

> So map() here is less than a factor of two slower. I wouldn't call
> that "especially bad" -- often times, a factor of two is not important.
> What really hurts is O(N**2) performance, or worse.
>
> So at worst, map() is maybe half as fast as a list comprehension, and at
> best, its perhaps a smidgen faster. I would expect around the same
> performance, differing only by a small multiplicative factor: I wouldn't
> expect one to be thousands or even tens of times slower that the other.

But this conclusion I agree with. There is a performance difference,
but it is not overly significant. Compared to the *actual work*
involved in the task (going through one list and doing some operation
on each operation), the difference between map and a comprehension is
generally going to be negligible.

ChrisA



More information about the Python-list mailing list