Filter versus comprehension (was Re: something about split()???)

Terry Reedy tjreedy at udel.edu
Wed Aug 22 12:43:04 EDT 2012


On 8/22/2012 3:30 AM, Mark Lawrence wrote:
> On 22/08/2012 06:46, Terry Reedy wrote:
>> On 8/21/2012 11:43 PM, mingqiang hu wrote:
>>> why filter is bad when use lambda ?
>>
>> Inefficient, not 'bad'. Because the equivalent comprehension or
>> generator expression does not require a function call.

for each item in the iterable.

> A case of premature optimisation? :)

No, as regards my post. I simply made a factual statement without 
advocating a particular action.

filter(lambda x: <expr>, iterable)
(x for x in iterable if <expr>)

both create iterators that produce the items in iterable such that 
bool(<expr>) is true. The following, with output rounded, shows 
something of the effect of the extra function call.

 >>> import timeit
 >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
0.91
 >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
1.28
 >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(0)")
0.83
 >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(20)")
2.60
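
For anyone who wants to check the equivalence claimed above rather than 
the timing, here is a minimal sketch. The predicate (i % 3 == 0) and the 
sample range are illustrative choices, not part of the session above.

```python
# Sketch: both forms are lazy iterators that yield the same items.
data = range(20)  # illustrative data, not from the timings above

filtered = filter(lambda i: i % 3 == 0, data)  # one function call per item
genexp = (i for i in data if i % 3 == 0)       # test evaluated inline

# Nothing has been computed yet; both objects support next().
first_filtered = next(filtered)
first_genexp = next(genexp)

# Exhaust what is left of each iterator.
rest_filtered = list(filtered)
rest_genexp = list(genexp)
```

The only difference between the two is how the test is evaluated for 
each item, which is what the timings measure.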

Simply keeping true items, by passing None as the function, is faster 
with filter -- at least on my particular machine with 3.3.0b2.

 >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")
1.03
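
As a small illustration of what "keeping true items" means, the sketch 
below compares filter(None, ...) with the comprehension that tests each 
item directly. The sample list is made up for the example.

```python
# Sketch: filter(None, iterable) keeps the items whose truth value is true,
# the same items selected by "x for x in iterable if x".
items = [0, 1, "", "a", None, [], [0], 2.5]  # hypothetical sample data

kept_by_filter = list(filter(None, items))
kept_by_comprehension = [x for x in items if x]
```

Both lists hold 1, "a", [0] and 2.5; the falsy items 0, "", None and [] 
are dropped.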

Filter is also faster if the expression is a function call.

 >>> timeit.timeit("list(filter(f, ranger))", "ranger=range(20); f=lambda i: False")
2.5033614114454394
 >>> timeit.timeit("list(i for i in ranger if f(i))", "ranger=range(20); f=lambda i: False")
3.2394095327040304
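
To rerun these comparisons on another machine or Python version, the 
statements above can be collected into one script. This is only a 
sketch: the labels, the compare() helper, and the reduced default of 
100,000 loops (timeit's own default is 1,000,000) are my assumptions, 
and the numbers will differ from those shown here.

```python
import timeit

# Shared setup, taken from the interactive session above.
SETUP = "ranger = range(20); f = lambda i: False"

# The labels are hypothetical names for the statements timed above.
STATEMENTS = {
    "genexp, inline test": "list(i for i in ranger if False)",
    "filter + lambda": "list(filter(lambda i: False, ranger))",
    "filter(None)": "list(filter(None, ranger))",
    "filter + named f": "list(filter(f, ranger))",
    "genexp calling f": "list(i for i in ranger if f(i))",
}


def compare(number=100_000, repeat=3):
    """Return {label: best total time in seconds} for each statement."""
    results = {}
    for label, stmt in STATEMENTS.items():
        # timeit.repeat returns one total per repetition; keep the smallest,
        # which is the one least disturbed by other activity on the machine.
        times = timeit.repeat(stmt, SETUP, number=number, repeat=repeat)
        results[label] = min(times)
    return results
```

Calling compare() and printing each label with its time gives a small 
table to set beside the figures above. Only the relative order is 
meaningful; the absolute values depend on the machine and interpreter.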

---
Perhaps, or even yes, as regards the so-called rule 'always use 
comprehension'. The rule is not needed, or is even wrong, if one prefers 
filter as more readable, if one only wants to keep true items, if the 
expression is a function call, if evaluating the expression takes so 
much more time than the extra function call that the call does not 
matter, or if the items are few enough that the extra time does not 
matter.

So I think PyLint should be changed to stop its filter FUD.

-- 
Terry Jan Reedy



