count

Tim Chase python.list at tim.thechases.com
Wed Jul 8 05:53:02 EDT 2009


> I wanted to sort column 2 in ascending order, so I read the whole file into
> an array "data" and did the following:
> 
> data.sort(key = lambda fields:(fields[2]))
> 
> I have sorted column 2, but I also want to count the numbers in column 2,
> i.e. I want to know, for example, how many repeats of, say, '3' (first row,
> 2nd column in the above data) there are in column 2.

I think you're trying to count in the wrong step -- I'd do the 
counting while you read the file in, not while you're trying to 
sort.  Maybe something like this untested file-wrapper:

   import csv
   from collections import defaultdict

   class Histo:
     """Wrap an iterable of rows, tallying one column as rows pass through."""
     def __init__(self, data, column):
       self.data = data
       self.column = column
       self.stats = defaultdict(int)
       self.exhausted = False
     def __iter__(self):
       if self.exhausted:
         return  # input already consumed; yield nothing
       for line in self.data:
         self.stats[line[self.column]] += 1
         yield line
       self.exhausted = True
     def stats_so_far(self):
       return self.stats

   with open('in.txt') as f:
     r = csv.reader(f, delimiter='\t')
     h = Histo(r, 2)
     data = list(h) # slurp up the elements
   print(repr(h.stats_so_far()))
   data.sort(key = lambda fields: (fields[2]))
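
If you only need the counts and don't mind a second pass over the
list, a shorter sketch follows.  It assumes Python 3, where
collections.Counter is available; the count_column helper and the
in-memory sample rows are made up for illustration and stand in for
your in.txt:

```python
import csv
import io
from collections import Counter

def count_column(rows, column):
    """Return a Counter mapping each value seen in `column` to its frequency."""
    return Counter(row[column] for row in rows)

# Hypothetical tab-delimited sample standing in for in.txt.
sample = io.StringIO("a\tx\t3\nb\ty\t5\nc\tz\t3\n")
data = list(csv.reader(sample, delimiter='\t'))

counts = count_column(data, 2)            # tally the same column you sort on
data.sort(key=lambda fields: fields[2])   # sort afterwards, as before
print(counts['3'])
```

Note that csv.reader hands back every field as a string, so the keys
are '3' rather than 3 and the sort is lexicographic; use
key=lambda fields: int(fields[2]) if you want numeric order.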

-tkc






