implement random selection in Python

Boris Borcic bborcic at gmail.com
Fri Nov 16 08:34:31 EST 2007


Boris Borcic wrote:
> Bruza wrote:
>> No. That does not solve the problem. What I want is a function
>>
>>   def randomPick(n, the_items):
>>
>> which will return n DISTINCT items from "the_items" such that
>> the n items returned are according to their probabilities specified
>> in the (item, pro) elements inside "the_items".
> 
> and in the initial post you wrote :
> 
>  > But how about the general case, for N > 1 and N < len(items)?
> 
> The problem is you need to make "the n items returned are according
> to their probabilities" more precise. "According to their probabilities" implies 
> n INDEPENDENT picks, but this is contradictory with the n picks having to 
> provide DISTINCT results (what clearly constrains picks relative to each other).
> 
> Of course there are obvious ways to combine the results of random choices of 
> single items to obtain a set like you want, but it is not obvious that they are 
> equivalent.
> 
> This is the most simple-minded :
> 
> def randomPick(n, the_items) :
>      assert n<len(the_items)
>      result = set()
>      while len(result)<n :
>            result.add(singlePick(the_items))
>      return sorted(result)
> 
> This is another (but it won't work with your version of singlePick as it is,
> although it would with that provided by the other posters) :
> 
> def randomPick(n, the_items) :
>      result = []
>      items = dict(the_items)
>      for k in range(n) :
>          pick = singlePick(items.items())
>          result.append(pick)
>          del items[pick]
>      return result
> 
> I would be surprised if they had exactly the same statistical properties, IOW, 
> if they did fit the same exact interpretation of "according to their probabilities".
> 
> 

yet another one, constructing a list of n-sets first, and then picking one;
like the other solutions, it doesn't optimize for repeated use.

def pickn(items,n) :
     "yields all n-sublists of (destructed) items"
     if n<=len(items) :
         if n :
             item = items.pop()
             for res in pickn(items[:],n) :
                 yield res
             for res in pickn(items,n-1) :
                 res.append(item)
                 yield res
         else :
             yield []


def randomPick(n,the_items) :
     "randomly pick n distinct members of the_items"
     the_sets = list(pickn(the_items[:],n))
     divisor = len(the_sets)*float(n)/len(the_items)
     for k,s in enumerate(the_sets) :
         the_sets[k] = (sorted(who for who,_ in s),
                    int(1+sum(p for _,p in s)/divisor)) # mhh...
     return singlePick(the_sets)




More information about the Python-list mailing list