Pulling all n-sized combinations from a list

mensanator at aol.com mensanator at aol.com
Thu Feb 9 20:36:06 EST 2006


Jack Diederich wrote:
> liberally snipped out parts
> On Thu, Feb 09, 2006 at 03:25:18PM -0800, mensanator at aol.com wrote:
> > Jack Diederich wrote:
> > > > > On Wed, Feb 08, 2006 at 12:50:19PM -0800, Swroteb wrote:
> > > > > >
> > > > > > I've got a reasonably sized list of objects that I'd like to pull out
> > > > > > all combinations of five elements from.  Right now I have a way to do
> > > > > > this that's quite slow, but manageable.  I know there must be a better
> > > > > > way to do this, but I'm not sure what it is.  Here's what I've got so
> > > > > > far:
> > > > >
> > > > > import probstat # http://probstat.sourceforge.net
> > > > > for (a, b, c) in probstat.Combination(range(9), 3): # 0..9, pick 3
> > > > >   print a, b, c
> > > > >
> >
> > so I put together a little test program to see how probstat
> > compares. Correspondence to my emulations is (as far as I can tell)
> >
> >
> > def p_perm(a,n):
> >         A = list(a)
> >         t0 = time.time()
> >         if len(A)<n:
> >             c = probstat.Permutation(A,n)
> >         else:
> >             c = probstat.Permutation(A)
> >         print time.time()-t0
> >         p = []
> >         for i in c:
> >             p.append(''.join(i))
> >         return p
>
> This never calls the A choose n branch because len(A) is always
> bigger than n.

Duh <slaps forehead>. It was supposed to be

if n<len(A):

That explains why it worked from the Idle prompt. I thought the
functions was executing c = probstat.Permutation(A,n), so that's
what I typed at the prompt.

>
> > print 'permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4'
> > t0 = time.time()
> > p = p_perm("abcdefghijklmnopqrstuvwxyz", 4)
> > t1 = time.time()
> > print len(p),'4-letter words',t1-t0,'seconds'
> >
> > Unfortunately, the long test died (out of virtual memory) executing
> > the probstat.Permution test.
> >
>
> >>> import probstat
> >>> p = probstat.Permutation(range(25))
> >>> len(p)
> 2076180480
> >>> p = probstat.Permutation(range(26))
> >>> len(p)
> -1853882368
> >>>
>
> Overflow error.  I'll have to add a test that raises a ValueError
> for lists that are too big.
>
> The following simple loop takes three minutes to run on my laptop
> import probstat
> count_to = len(probstat(range(12))) # just computes the size
> cnt = 0
> while cnt < count_to:
>   cnt += 1
>
> So a permutation of all arrangements of the alphabet would take
> rougly forever.

Right, I never intended to do that, was trying to make 4-letter words,
not 26 letter permutations.

Anyway, now that _my_ bug is fixed, it works properly:

permutation w/  replacemment "abcdefghijklmnopqrstuvwxyz":4
456976 4-letter words 1.26600003242 seconds
combination w/  replacemment "abcdefghijklmnopqrstuvwxyz":4
23751 4-letter words 0.671999931335 seconds
permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4
358800 4-letter words 1.6400001049 seconds
combination w/o replacemment "abcdefghijklmnopqrstuvwxyz":4
14950 4-letter words 0.56200003624 seconds

probstat.Cartesian "abcdefghijklmnopqrstuvwxyz":4
456976 4-letter words 1.42199993134 seconds
probstat.Permutation "abcdefghijklmnopqrstuvwxyz":4
358800 4-letter words 1.06200003624 seconds
probstat.Combination "abcdefghijklmnopqrstuvwxyz":4
14950 4-letter words 0.077999830246 seconds

Thanks again for supplying that pyd.

> 
> -Jack




More information about the Python-list mailing list