Just for Fun: Python Permutation Jumbles

Alex Martelli aleax at aleax.it
Thu Oct 31 15:04:39 EST 2002


Jane Austine wrote:
   ...
>> def myperm(l):
   ...
>>                 yield l
> Why this difference?
> 
>>>> for e in myperm(range(3)):
> print e,
> 
> [2, 1, 0] [1, 2, 0] [1, 0, 2] [2, 0, 1] [0, 2, 1] [0, 1, 2]
>>>> print list(myperm(range(3)))
> [[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2]]
> 
> What's the semantics of "list"?

myperm is yielding the SAME object each and every time, and
mutating it between one yield and the next one.  The semantics
of result=list(myperm(X)) are the same as:
    result = []
    for x in myperm(X): result.append(x)
so we're appending the SAME list object 6 times (in your example
where X has length 3) and therefore ending up with a list
with six identical items.

Basically, this not-uncommon buglet is tied to the deep notions of 
"object identity" and Python's reference semantics.  Each time
you refer to an object, in Python, you have just that, a _reference_
to the object, not an implicit copy.  The object you refer to may
mutate, but you're still holding a reference to the object.

Imagine that I'm modeling T-shirts at a geeks-haute-couture catwalk.
First I come out with a green T-shirt, you like it, and say "I'll
buy the T-shirt Alex is wearing".  Then I come out again with a
blue T-shirt, you like it too, and again you say "I'll buy the T-shirt
Alex is wearing".  I come out a third time, with a black T-shirt,
and you repeat the same sentence a third time.  If each time just a
"reference" was recorded, the black T-shirt is all you'll get, because
THAT is "the T-shirt Alex is wearing" at the end of the show when your
order is fulfilled.  If you wanted a green T-shirt, AND a blue one,
AND a black one too, you should each time have said "I'll buy A COPY OF
the T-shirt Alex is wearing RIGHT NOW" -- explicitly taking "a snapshot
of the situation" before the referent of the reference "T-shirt Alex
is wearing" mutates.

Similarly:

    import copy
    result = []
    for x in myperm(X): result.append(copy.copy(x))

The call to function copy of module copy is the most explicit and
clearest way to ask for A COPY OF whatever object x references right
now, rather than a reference to the object itself.  There are other
ways, marginally faster, more cryptic, and depending on specific
knowledge about the type of the object x is referring to, e.g.:

    result = []
    for x in myperm(X): result.append(x[:])

(the "all-list slice" idiom), or:

    result = []
    for x in myperm(X): result.append(list(x))

(a bit more readable, using the 'list' constructor).

More concise ways to express any of these are given by list-comprehension
syntax, but the semantics is the same as for the loops, i.e.:

    [ x for x in myperm(X) ]

will leave you with a list of 6 identical items, exactly as list(myperm(X))
did, because it takes no "snapshot", asks for no COPY; while any of:

    [ copy.copy(x) for x in myperm(X) ]
    [ x[:] for x in myperm(X) ]
    [ list(x) for x in myperm(X) ]

will take 6 snapshots, one per item, and therefore give you the result
you expect.

Of course, SOME (but NOT all!) of the performance advantage obtained
by permuting in-place rather than building new lists each time goes
away when you need to take snapshots of each permutation.  NOT all,
because you're only going to copy (snapshot) the permutation of the
overall list: the permutations recursively used within the list to
compute the overall permutation are still going to benefit.


Alex





More information about the Python-list mailing list