groupby() seems slow
Raymond Hettinger
python at rcn.com
Tue Oct 16 16:04:41 EDT 2007
On Oct 15, 8:02 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
> print t.timeit()
> t = timeit.Timer("test1()", "from __main__ import test1, data")
> print t.timeit()
>
> --output:---
> 42.791079998
> 19.0128788948
>
> I thought groupby() would be faster. Am I doing something wrong?
The groupby() function is not where you are losing speed. In test1,
you've inlined the code for computing the key. In test3, groupby()
makes expensive, repeated calls to a pure-Python key function. For
an apples-to-apples comparison, try something like this:
    def test4():
        master_list = []
        row = []
        for elem in data:
            if key(elem) == 'a':
                row.append(elem)
            elif row:
                master_list.append(' '.join(row))
                del row[:]
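
To see the effect in isolation, here is a rough, self-contained sketch that times a groupby() version against a hand-written loop when both call the same key function once per element. The data list and key() below are made-up stand-ins (the originals from the thread are not shown in this message), the function names with_groupby and with_loop are only illustrative, and it is written for Python 3 rather than the Python 2 used in the quoted code. The loop also flushes a trailing run so that both versions return the same list.

```python
import timeit
from itertools import groupby

# Hypothetical stand-ins for the thread's `data` and `key`.
data = ['a1', 'a2', 'b1', 'a3', 'a4', 'a5', 'b2', 'b3', 'a6'] * 100


def key(elem):
    # Pure-Python key function; both versions call it once per element.
    return elem[0]


def with_groupby():
    # Join each run of consecutive elements whose key is 'a'.
    return [' '.join(group) for k, group in groupby(data, key) if k == 'a']


def with_loop():
    # The same grouping by hand, still paying for key() on every element.
    master_list = []
    row = []
    for elem in data:
        if key(elem) == 'a':
            row.append(elem)
        elif row:
            master_list.append(' '.join(row))
            row = []
    if row:
        # Flush a trailing run so the result matches with_groupby().
        master_list.append(' '.join(row))
    return master_list


if __name__ == '__main__':
    assert with_groupby() == with_loop()
    for fn in (with_groupby, with_loop):
        print(fn.__name__, timeit.timeit(fn, number=1000))
```

With the key-function cost present on both sides, any remaining gap between the two timings is what can fairly be attributed to groupby() itself. Actual numbers will depend on your data and interpreter.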
Raymond