Slicing versus loops, was Re: for i in range() anti-pattern

Peter Otten __peter__ at web.de
Fri Dec 1 05:01:22 EST 2006


Steven D'Aprano wrote:

> On Thu, 30 Nov 2006 11:17:17 +0100, Peter Otten wrote:
> 
>> Steven D'Aprano wrote:
>> 
>>> And remember that if alist is truly huge, you may take a performance hit
>>> due to duplicating all those megabytes of data when you slice it.
>> 
>> Having the same object in two lists simultaneously does not double the
>> total amount of memory; you just need space for an extra pointer.
> 
> Sure. But if you have millions of items in a list, the pointers themselves
> take millions of bytes -- otherwise known as megabytes.

I don't know the exact layout of an int, but let's assume 4 bytes for the
class, the value, the refcount and the initial list entry -- which gives
you 16 bytes per entry for what is probably the class with the smallest
footprint in the python universe. For the range(N) example the slicing
approach then needs an extra 4 bytes or 25 percent. On the other hand, if
you are not dealing with unique objects (say range(100) * (N//100)) the
amount of memory for the actual objects is negligable and consequently the
total amount doubles.
You should at least take that difference into account when you choose the
swapping algorithm.

>> That example was chosen to prove your point.
 
> Well, I thought about choosing an example that disproved my point, but I
> couldn't think of one :-)

Lack of fantasy :-)

>> The real contender for the "swap items" problem are slices.
>> 
>> def swap_slice(items):
>>     left = items[::2]
>>     items[::2] = items[1::2]
>>     items[1::2] = left
>>     return items
> 
> I always forget that extended slices can be assigned to as well as
> assigned from! Nice piece of code... if only it worked.
> 
>>>> def swap_slice(items):
> ...     left = items[::2]
> ...     items[::2] = items[1::2]
> ...     items[1::2] = left
> ...     return items
> ...
>>>> alist
> [0, 1, 2, 3, 4]
>>>> swap_slice(alist)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 3, in swap_slice
> ValueError: attempt to assign sequence of size 2 to extended slice of size
> 3
> 
> Everybody always forgets corner cases... like lists with odd number of
> items... *wink*

True in general, but on that one I'm with Duncan.

Here is another implementation that cuts maximum memory down from 100 to
50%.

from itertools import islice
def swap(items):
    items[::2], items[1::2] = islice(items, 1, None, 2), items[::2]
    return items
 
Peter



More information about the Python-list mailing list