Break up list into groups
James Stroud
jstroud at mbi.ucla.edu
Mon Jul 16 19:05:30 EDT 2007
Paul Rubin wrote:
> See:
>
> http://groups.google.com/group/comp.lang.python/msg/2410c95c7f3b3654
groupby is damn slow as far as I can tell (the Bates numbering in the
above link assumes more than the OP intended, I think). It looks like
the author's original algorithm is the fastest pure-Python way, as it
bypasses a lot of name lookups and function-call overhead.
Here's the output from the script below (doit2 => groupby way):
doit
11.96 usec/pass
doit2
87.14 usec/pass
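On these numbers the groupby version comes out roughly 7.3 times slower (87.14 / 11.96). The usec/pass figure is just the total time reported by Timer.timeit, divided by the number of passes and scaled to microseconds. A minimal Python 3 sketch of that arithmetic follows; the helper name usec_per_pass and the timed statement are placeholders assumed here, not part of the script, and the absolute numbers will differ from machine to machine.

```python
from timeit import Timer

def usec_per_pass(stmt, setup="pass", number=100000):
    """Average cost of one execution of stmt, in microseconds.

    usec_per_pass is a hypothetical helper name used for illustration.
    """
    # timeit() returns the total seconds taken by `number` executions.
    total_seconds = Timer(stmt, setup).timeit(number=number)
    return 1000000 * total_seconds / number

# Stand-in statement; substitute the call you actually want to measure.
print("%.2f usec/pass" % usec_per_pass("sorted(range(10))", number=10000))
```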
James
# timer script (Python 2 syntax: print statement, generator .next())
from itertools import groupby
from timeit import Timer

alist = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6,
         0xF1, 7, 8, 0xF2, 9, 10, 11, 12, 13,
         0xF0, 14, 0xF1, 15]

def doit(alist):
    # Plain loop: a marker (any of the high four bits set) starts a new group.
    # Assumes the list begins with a marker; otherwise ary[-1] raises IndexError.
    ary = []
    for i in alist:
        if 0xf0 & i:
            ary.append([i])
        else:
            ary[-1].append(i)
    return [x for x in ary if x]

def c(x):
    # groupby key: nonzero for markers, 0 for data values
    return 0xf0 & x

def doit2(alist):
    # groupby way: alternate runs of markers and data, glued together pairwise
    i = (list(g) for k, g in groupby(alist, c))
    return [k for k in [j + i.next() for j in i] if len(k) > 1]

print doit(alist)

print 'doit'
t = Timer('doit(alist)',
          'from __main__ import groupby, doit, alist, c')
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)

print 'doit2'
t = Timer('doit2(alist)',
          'from __main__ import groupby, doit2, alist, c')
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
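
For anyone trying this on Python 3, here is a minimal sketch of the same two approaches. The names is_marker, group_loop and group_groupby are assumptions made for this sketch, not names from the script above. The groupby variant also uses next(runs, []) so that a trailing marker with no data after it does not raise StopIteration, which is a small departure from the original. Like the original, both functions assume the list starts with a marker and that markers are not adjacent.

```python
from itertools import groupby

alist = [0xF0, 1, 2, 3, 0xF0, 4, 5, 6,
         0xF1, 7, 8, 0xF2, 9, 10, 11, 12, 13,
         0xF0, 14, 0xF1, 15]

def is_marker(x):
    # A value with any of the high four bits set starts a new group.
    return bool(0xF0 & x)

def group_loop(items):
    # Plain loop: open a new group at each marker, append data to the last one.
    groups = []
    for item in items:
        if is_marker(item):
            groups.append([item])
        else:
            groups[-1].append(item)  # IndexError if items starts with data
    return groups

def group_groupby(items):
    # groupby yields alternating runs of markers and data values.
    runs = (list(g) for _, g in groupby(items, is_marker))
    # Glue each marker run to the data run that follows it, if any.
    paired = (markers + next(runs, []) for markers in runs)
    # Drop groups that hold a marker and nothing else, as doit2 does.
    return [group for group in paired if len(group) > 1]

print(group_loop(alist))
print(group_loop(alist) == group_groupby(alist))
```

Both functions produce the same six groups for this input, so a timing comparison between them measures the same work.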
--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095
http://www.jamesstroud.com/