"groupby" is brilliant!

John Machin sjmachin at lexicon.net
Tue Jun 13 18:29:55 EDT 2006


On 14/06/2006 8:06 AM, Gary Herron wrote:
> John Machin wrote:
>> On 13/06/2006 6:28 PM, Paul McGuire wrote:
>>
>>   
>>> (Oh, and I like groupby too!  Combine it with sort to quickly create
>>> histograms.)
>>>
>>> # tally a histogram of a list of values from 1-10
>>> dataValueRange = range(1,11)
>>> data = [random.choice(dataValueRange) for i in xrange(10000)]
>>>
>>> hist = [ (k,len(list(g))) for k,g in itertools.groupby(sorted(data)) ]
>>>     
>> That len(list(g)) looks like it uses O(N) memory just to find out what N 
>> is :-(
>>   
> Not at all! A python list *knows* its length at all times. len() is a
> constant time lookup of an internal attribute.

Did you see any reference to time in what I wrote? Did you notice the 
word "memory" at all?

My point is that "g" is an iterator, and list(g) actually builds a list 
of size N, merely in order to use len(that_list) to count the number of 
items that g will produce.
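
For illustration, here is a minimal sketch of counting each group without building the intermediate list. The helper names count_items and histogram are made up for this example and are not from the thread. It is written with range rather than xrange so it also runs on newer Pythons. Note that sorted(data) still makes a full copy of the data, so the saving is only the per-group list.

```python
import itertools
import random


def count_items(iterable):
    """Count the items an iterator yields without storing them."""
    # The generator expression hands sum() one item at a time,
    # so no list of size N is ever built.
    return sum(1 for _ in iterable)


def histogram(values):
    """Return (value, count) pairs in ascending order of value."""
    # groupby only groups adjacent equal items, hence the sort.
    return [(k, count_items(g)) for k, g in itertools.groupby(sorted(values))]


# Same kind of data as in the quoted example: 10000 values from 1-10.
data = [random.choice(range(1, 11)) for _ in range(10000)]
hist = histogram(data)
```

The counts in hist should add up to len(data), which makes a quick sanity check.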


