Data aggregation

jay graves jaywgraves at gmail.com
Thu Mar 6 12:38:05 EST 2008


On Mar 6, 10:28 am, vedranp <vedran.preg... at gmail.com> wrote:
> So, group by DATE, COUNTRY, ZIP and CITY and sum (or do some

You are soooo close.  Look up itertools.groupby.
Don't forget to sort your data first: groupby only groups consecutive
items that share a key.

http://aspn.activestate.com/ASPN/search?query=groupby&x=0&y=0&section=PYTHONCKBK&type=Subsection
http://mail.python.org/pipermail/python-list/2006-June/388004.html
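
A minimal sketch of the sort-then-groupby idea, assuming the data is
already loaded as a list of tuples.  The sample rows, the column order
(DATE, COUNTRY, ZIP, CITY, AMOUNT) and the AMOUNT column itself are made
up for illustration; they are not from the original question.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (DATE, COUNTRY, ZIP, CITY, AMOUNT)
rows = [
    ("2008-03-01", "HR", "10000", "Zagreb", 5),
    ("2008-03-02", "HR", "21000", "Split", 7),
    ("2008-03-01", "HR", "10000", "Zagreb", 3),
    ("2008-03-01", "HR", "21000", "Split", 2),
]

# Group on the first four columns: DATE, COUNTRY, ZIP, CITY
key = itemgetter(0, 1, 2, 3)

# Sort with the same key used for grouping, so equal keys end up adjacent
totals = [
    (group_key, sum(row[4] for row in group))
    for group_key, group in groupby(sorted(rows, key=key), key=key)
]

for group_key, amount in totals:
    print(group_key, amount)
```

Using the same key function for sorted() and groupby() is what keeps
the groups from being split up.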


> From some little experience with Perl, I think this is manageable with
> double hash tables (1: basic hash with key/value = CITY/pointer-to-
> other-hash, 2: hash table with values for CITY1), so I assume that
> there would be also a way in Python, maybe with dictionaries? Any
> ideas?

Sometimes it makes sense to do this with dictionaries, for example
if you need to do counts on various combinations of columns:

count of unique values in column 'A'
count of unique values in column 'C'
count of unique combinations of columns 'A' and 'B'
count of unique combinations of columns 'A' and 'C'
count of unique combinations of columns 'B' and 'C'
in all cases, sum(D) and avg(E)

Since I need 'C' by itself, and 'A' and 'C' together, I can't just
sort and break on 'A','B','C'.
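
Here is a rough sketch of the dictionary approach for that case: one
pass over the data, with one dictionary per column combination.  The
sample rows and the names (groupings, stats, summarize) are invented
for illustration, and I am assuming "count" means the number of rows
seen for each distinct key.

```python
from collections import defaultdict

# Hypothetical rows with columns A, B, C, D, E
rows = [
    {"A": "x", "B": 1, "C": "p", "D": 10, "E": 2.0},
    {"A": "x", "B": 2, "C": "q", "D": 20, "E": 4.0},
    {"A": "y", "B": 1, "C": "p", "D": 30, "E": 6.0},
    {"A": "x", "B": 1, "C": "p", "D": 40, "E": 8.0},
]

# The column combinations listed above
groupings = [("A",), ("C",), ("A", "B"), ("A", "C"), ("B", "C")]

# stats[columns][key] -> [row count, sum of D, sum of E]
stats = {cols: defaultdict(lambda: [0, 0, 0.0]) for cols in groupings}

# Single pass; no sorting needed because each key has its own accumulator
for row in rows:
    for cols in groupings:
        acc = stats[cols][tuple(row[c] for c in cols)]
        acc[0] += 1
        acc[1] += row["D"]
        acc[2] += row["E"]

def summarize(cols):
    """Return {key: (count, sum(D), avg(E))} for one column combination."""
    return {
        key: (n, sum_d, sum_e / n)
        for key, (n, sum_d, sum_e) in stats[cols].items()
    }

print(summarize(("A", "C")))
```

The trade-off is memory: every distinct key in every combination gets
an entry, whereas the sort-and-break approach only holds one group at a
time.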

HTH
...
jay graves


