Pre-pep discussion material: in-place equivalents to map and filter
Terry Reedy
tjreedy at udel.edu
Thu Nov 3 20:03:42 EDT 2016
On 11/3/2016 2:56 AM, arthurhavlicek at gmail.com wrote:
> lst = [ item for item in lst if predicate(item) ]
> lst = [ f(item) for item in lst ]
>
> Both these expressions feature redundancy: lst occurs twice and item at least twice. Additionally, readability is hurt, because one has to dive through the semantics of the comprehension to truly understand whether I am filtering the list or remapping its values.
...
> Language support for performing these operations in place could improve their efficiency through reduced use of memory.
We already have that: slice assignment with an iterator.
lst[:] = (item for item in lst if predicate(item))
lst[:] = map(f, lst) # iterator in 3.x.
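A small runnable illustration of the two statements above. The names predicate and f are placeholders I made up for the example (keep even numbers, multiply by ten); the point is only that slice assignment changes the existing list object, so every other reference to it sees the new contents.

```python
def predicate(item):
    # Hypothetical predicate: keep even numbers.
    return item % 2 == 0

def f(item):
    # Hypothetical mapping function: multiply by ten.
    return item * 10

lst = list(range(10))
alias = lst                      # a second name for the same list object

# In-place filter: the list object is kept, its contents are replaced.
lst[:] = (item for item in lst if predicate(item))
filtered = list(lst)             # snapshot of the filtered contents

# In-place map: map() returns an iterator in 3.x.
lst[:] = map(f, lst)

print(filtered)
print(lst)
print(alias is lst)              # still the same object
```

By contrast, `lst = [...]` rebinds the name to a new list and leaves `alias` pointing at the old one.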
To save memory, stop using unneeded temporary lists and use iterators
instead. If slice assignment is done as I hope, it will optimize the remaining
memory operations. (But I have not read the code.) It should overwrite
existing slots until either a) the iterator is exhausted or b) existing
memory is used up. When lst is both source and destination, only case
a) can happen. When it does, the list can be finalized with its new
contents.
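The strategy described above can be sketched in pure Python. This is only an illustration of the idea; assign_in_place is a hypothetical helper name, and nothing here is taken from CPython's source.

```python
def assign_in_place(lst, iterable):
    """Hypothetical helper: refill lst from iterable, reusing its slots."""
    n = len(lst)                 # number of existing slots at the start
    i = 0                        # next slot to write
    for value in iterable:
        if i < n:
            lst[i] = value       # overwrite an existing slot
        else:
            lst.append(value)    # case b): existing slots are used up
        i += 1
    # Case a): the iterator ran out first, so drop the leftover slots.
    # (A no-op when case b) happened.)
    del lst[i:]

# lst as both source and destination: writes never get ahead of reads,
# because a filter or map yields at most one item per item consumed.
lst = [1, 2, 3, 4, 5, 6]
assign_in_place(lst, (x for x in lst if x % 2 == 0))
print(lst)

# A different, longer source exercises case b).
other = [1, 2]
assign_in_place(other, range(5))
print(other)
```

Whether the real slice assignment behaves this way is exactly the part I have not checked.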
As for timings:
from timeit import Timer
setup = """data = list(range(10000))
def func(x):
    return x
"""
t1a = Timer('data[:] = [func(a) for a in data]', setup=setup)
t1b = Timer('data[:] = (func(a) for a in data)', setup=setup)
t2a = Timer('data[:] = list(map(func, data))', setup=setup)
t2b = Timer('data[:] = map(func, data)', setup=setup)
print('t1a', min(t1a.repeat(number=500, repeat=7)))
print('t1b', min(t1b.repeat(number=500, repeat=7)))
print('t2a', min(t2a.repeat(number=500, repeat=7)))
print('t2b', min(t2b.repeat(number=500, repeat=7)))
# Output (total seconds for 500 runs, best of 7):
t1a 0.5675313005414555
t1b 0.7034254675598604
t2a 0.5181285985208888
t2b 0.5196112759726024
If f does more work, the % difference among these will decrease.
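One way to check that claim on your own machine is to rerun the list-versus-generator comparison with a cheap and a more expensive func. The heavier function body below is arbitrary busywork I made up, and the run counts are reduced to keep it quick; actual numbers will vary by machine and Python version, so none are shown.

```python
from timeit import Timer

# Cheap function, as in the timings above.
CHEAP_SETUP = """data = list(range(10000))
def func(x):
    return x
"""

# Hypothetical heavier function; the body is arbitrary busywork.
HEAVY_SETUP = """data = list(range(10000))
def func(x):
    return sum(divmod(x, 7)) + len(str(x))
"""

def best(stmt, setup, number=20, repeat=3):
    """Return the best total time in seconds for `number` runs of stmt."""
    return min(Timer(stmt, setup=setup).repeat(number=number, repeat=repeat))

def percent_gap(setup):
    """Percent by which the generator version differs from the list version."""
    t_list = best('data[:] = [func(a) for a in data]', setup)
    t_gen = best('data[:] = (func(a) for a in data)', setup)
    return 100.0 * (t_gen - t_list) / t_list

gaps = {'cheap': percent_gap(CHEAP_SETUP), 'heavy': percent_gap(HEAVY_SETUP)}
for label, gap in gaps.items():
    print(label, round(gap, 1))
```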
--
Terry Jan Reedy