Optimizing list processing
MRAB
python at mrabarnett.plus.com
Wed Dec 11 19:59:42 EST 2013
On 11/12/2013 23:54, Steven D'Aprano wrote:
> I have some code which produces a list from an iterable using at least
> one temporary list, using a Decorate-Sort-Undecorate idiom. The algorithm
> looks something like this (simplified):
>
> table = sorted([(x, i) for i,x in enumerate(iterable)])
> table = [i for x,i in table]
>
> The problem here is that for large iterables, say 10 million items or so,
> this is *painfully* slow, as my system has to page memory like mad to fit
> two large lists into memory at once. So I came up with an in-place
> version that saves (approximately) two-thirds of the memory needed.
>
> table = [(x, i) for i,x in enumerate(iterable)]
> table.sort()
This looks wrong to me:
> for x, i in table:
>     table[i] = x
Couldn't it replace an item it'll need later on?
Let me see if I can find an example where it would fail.
Start with:
>>> table = [('b', 0), ('a', 1)]
Sort it and you get:
>>> table.sort()
>>> table
[('a', 1), ('b', 0)]
Run that code:
>>> for x, i in table:
        table[i] = x

Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    for x, i in table:
ValueError: need more than 1 value to unpack
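Why did it fail?
>>> table
[('a', 1), 'a']
The first pass stored the bare string 'a' in table[1], replacing the
pair ('b', 0) before the loop had reached it, so the second pass tried
to unpack a one-character string into x and i.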
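Here is the same failure as a standalone script, in case anyone wants to
try it outside the shell. This is only a sketch: run_quoted_loop is a
name I made up for illustration, and it catches the ValueError instead
of letting it propagate. It should behave the same way on Python 3,
though the wording of the error message differs there.

```python
def run_quoted_loop(table):
    """Run the quoted in-place loop; return the unpacking error, if any."""
    try:
        for x, i in table:
            table[i] = x  # may clobber a pair the loop has not reached yet
    except ValueError as exc:
        return exc
    return None


table = [('b', 0), ('a', 1)]
table.sort()                   # [('a', 1), ('b', 0)]
error = run_quoted_loop(table)

print(table)                   # table[1] is now a bare string, not a pair
print(type(error).__name__)
```

Note that this only catches ValueError. With other kinds of items the
clobbered slot could fail differently, or not fail at all.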
The two methods give different results anyway: the first returns a list
of indexes, and the second returns a list of items from the iterable.
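To make that difference concrete, here is a small sketch that wraps
each version in a function. The names indexes_via_dsu and
items_via_in_place are mine, not from the original post. The input is
deliberately already sorted, because that is a case where the in-place
loop happens to run to completion and the two outputs can be compared.

```python
def indexes_via_dsu(iterable):
    """First version: decorate, sort, then keep the original indexes."""
    table = sorted([(x, i) for i, x in enumerate(iterable)])
    return [i for x, i in table]


def items_via_in_place(iterable):
    """Second version as quoted: overwrite table[i] with the item x."""
    table = [(x, i) for i, x in enumerate(iterable)]
    table.sort()
    for x, i in table:
        table[i] = x
    return table


data = ['a', 'b', 'c']           # already sorted, so the loop survives
first = indexes_via_dsu(data)
second = items_via_in_place(data)

print(first)    # a list of indexes
print(second)   # a list of items
```

So even when the second version does not blow up, it is not a drop-in
replacement for the first.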
[snip]