Generating a random subsequence

Alex Martelli aleax at aleax.it
Wed Jan 9 05:43:32 EST 2002


"Tom Bryan" <tbryan at python.net> wrote in message
news:SHP_7.73257$mp3.29790061 at typhoon.southeast.rr.com...
> Alex Martelli wrote:
>
> > can become:
> >           templist = list(seq)
> >           random.shuffle(templist)
> >           for item in templist[:length]:
> >               returnSeq += item

Sorry, a silly error here on my part: sequence addition needs both
operands to be of the same type, of course.  So, you need to 'cast'
the right-hand side to returnSeq's type, as in the other solution.

Note that you had the same bug in these lines of your original post:

    for idx in range(length):
        returnSeq = returnSeq + seq[indexes[idx]]

Here, too, you need to 'cast' the right-hand operand of operator +,
although in this case there's an obvious way to do it -- use slicing
rather than indexing:

    for idx in range(length):
        returnSeq = returnSeq + seq[indexes[idx]:indexes[idx]+1]

Shuffling indices does have this advantage -- no 'casting' issues,
since a slice from a sequence 'naturally' has the same type as
the sequence it's taken from.
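
A minimal sketch of that point, assuming a present-day Python; the helper
names one_item_slice and grow_by_index are made up for illustration and
are not part of the original recipe:

```python
def one_item_slice(seq, i):
    """Return the item at position i as a length-1 sequence of seq's own type."""
    return seq[i:i + 1]


def grow_by_index(seq, i):
    """Try to grow an empty slice of seq using plain indexing (may fail)."""
    return seq[0:0] + seq[i]


for sample in ([10, 20, 30], (10, 20, 30), "abc"):
    piece = one_item_slice(sample, 1)
    # The slice always has the container's type, so + and += accept it.
    assert type(piece) is type(sample)
    assert sample[0:0] + piece == piece

# Indexing a list or tuple of ints yields an int, which cannot be added
# to the container; strings only "work" because indexing a str gives a str.
for sample in ([10, 20, 30], (10, 20, 30)):
    try:
        grow_by_index(sample, 1)
    except TypeError:
        pass
    else:
        raise AssertionError("expected TypeError")
```

Strings hide the bug because a one-character string is still a string,
which is probably why it is easy to miss when testing with strings only.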

This does not invalidate the suggestion to use += -- it still
dispatches on mutable vs immutable LHS appropriately and thus
avoids the need for doing it yourself.  Following these ideas,
the recipe would be:

    returnSeq = seq[0:0]
    indices = list(range(len(seq)))
    random.shuffle(indices)
    for anindex in indices[:length]:
        returnSeq += seq[anindex:anindex+1]
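
Wrapped up as a function, the recipe might look like the sketch below.
The name random_subsequence and its signature are assumptions for
illustration, and list(range(...)) is used so that random.shuffle has a
real list to work on in current Python versions:

```python
import random


def random_subsequence(seq, length):
    """Return up to `length` randomly chosen items of seq, as seq's own type."""
    result = seq[0:0]                  # empty sequence of the same type as seq
    indices = list(range(len(seq)))    # shuffle needs a mutable list
    random.shuffle(indices)
    for i in indices[:length]:
        result += seq[i:i + 1]         # the slice keeps seq's type
    return result


# Works unchanged for lists, tuples and strings.
for sample in ([1, 2, 3, 4, 5], (1, 2, 3, 4, 5), "abcde"):
    picked = random_subsequence(sample, 3)
    assert type(picked) is type(sample)
    assert len(picked) == 3
```

Since seq[0:0] is a fresh object, the in-place += on a list never touches
the caller's original sequence.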


> > The += augmented operator dispatches appropriately to in-place or
> > not-in-place mutation depending on LHO's mutable/immutable nature.
>
> Is that just true of +=?

No, all augmented assignment operators behave similarly.  They're NOT
some silly syntax sugar for "x = x OP y": they're a precious and useful
solution for the frequent and otherwise-difficult issue of:

    if is_mutable(x):
        x.modify_in_place_via_OP(y)
    else:
        x = x OP y

where finding out is_mutable and how to modify in-place would be
extremely hard to perform yourself in the general case.
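
For the + case specifically, the dispatch can be approximated by hand as
in the sketch below. The function name augmented_add is hypothetical, and
this is only a rough model of what the interpreter does for x += y (it
ignores reflected operations such as __radd__), not its actual
implementation:

```python
def augmented_add(x, y):
    """Rough model of the value that `x += y` rebinds x to."""
    iadd = getattr(type(x), "__iadd__", None)
    if iadd is not None:
        result = iadd(x, y)            # in-place hook, if the type has one
        if result is not NotImplemented:
            return result
    return x + y                       # otherwise fall back to plain +


lst = [1, 2]
same = augmented_add(lst, [3])
assert same is lst and lst == [1, 2, 3]      # list: modified in place

tup = (1, 2)
new = augmented_add(tup, (3,))
assert new == (1, 2, 3) and new is not tup   # tuple: a new object is built
```

Lists define __iadd__ and tuples do not, which is how the same statement
ends up mutating one and rebuilding the other.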


> That is, if returnSeq is a list then, I can say
> returnSeq = returnSeq + item

No, returnSeq = returnSeq + [item], as above.

> OR
> returnSeq.append( item )

Yes, you can say this.

> OR
> returnSeq += item

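Yes, you can say this too, as long as item is itself a sequence (on a
list, += extends the list with the right-hand side's items); for a
single non-sequence item, write returnSeq += [item].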


> You're saying that the last one is roughly equivalent to the second one
> and not the first?  Or are they all three equivalent?

They're not equivalent: when returnSeq is a list, thus mutable, += mutates
it (so the id(returnSeq) doesn't change), as would returnSeq.append, while
the first idea generates a new object and rebinds name returnSeq to the
new object (so the id(returnSeq) does change).

Performance apart, this only makes a difference when you have some other
references to the list object to which name returnSeq is bound, of course.
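
A small illustration of when the difference becomes visible; the variable
names are made up for the example:

```python
# In-place: the other reference sees the change.
a = [1, 2]
alias_a = a
a += [3]
assert a is alias_a
assert alias_a == [1, 2, 3]

# Rebinding: the other reference still points at the old, unchanged list.
b = [1, 2]
alias_b = b
b = b + [3]
assert b is not alias_b
assert alias_b == [1, 2]
assert b == [1, 2, 3]
```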


> the-last-time-I-used-python-there-wasn't-a-+=-ly,

You're in for a treat -- all the changes since 1.5.2 (the last version
to lack +=, I believe) are very useful and practical, as well as neat.
(Well, excepting print>>blah,"we didn't need THAT one" of course:-).


Alex





