Pivot Table/Groupby/Sum question
patrick.waldo at gmail.com
patrick.waldo at gmail.com
Fri Jan 4 10:55:36 EST 2008
Petr thanks so much for your input. I'll try to learn SQL, especially
if I'll do a lot of database work.
I tried to do it John's way as en exercise and I'm happy to say I
understand a lot more. Basically I didn't realize I could nest
dictionaries like db = {country:{genre:{sub_genre:3}}} and call them
like db[country][genre][sub_genre]. The Python Cookbook was quite
helpful to figure out why items needed to be added the way they did.
Also using the structure of the dictionary was a conceptually easier
solution than what I found on http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/334695.
So, now I need to work on writing it to Excel. I'll update with the
final code.
Thanks again.
#Movie Store Example
class PivotData:
def __init__(self):
self.total_mov = 0
self.total_cou = {}
self.total_gen = {}
self.total_sub = {}
self.total_cou_gen ={}
self.db = {}
def add_data(self,country,genre,sub_genre,value):
self.total_mov += value
try:
self.total_cou[country] += value
except KeyError:
self.total_cou[country] = value
try:
self.total_gen[genre] += value
except:
self.total_gen[genre] = value
try:
self.total_sub[sub_genre] += value
except:
self.total_sub[sub_genre] = value
try:
self.total_cou_gen[country][genre] += value
except KeyError:
try:
self.total_cou_gen[country][genre] = value
except KeyError:
self.total_cou_gen[country] = {genre:value}
try:
self.db[country][genre][sub_genre] += value
except KeyError:
try:
self.db[country][genre][sub_genre] = value
except KeyError:
try:
self.db[country][genre] = {sub_genre:value}
except:
self.db[country] = {genre:{sub_genre:value}}
data = [['argentina','Horror', 'Slasher',4],
['argentina','Horror', 'Halloween',6],
['argentina','Drama','Romance',5],
['argentina','Drama','Romance',1],
['argentina','Drama','True Life',1],
['japan','Classics','WWII',1],
['japan','Cartoons','Anime',1],
['america','Comedy','Stand-Up',1],
['america','Cartoons','WB',10],
['america','Cartoons','WB',3]]
COUNTRY, GENRE, SUB_GENRE, VALUE =range(4)
x=PivotData()
for s in data:
x.add_data(s[COUNTRY],s[GENRE],s[SUB_GENRE],s[VALUE])
print
print 'Total Movies:\n', x.total_mov
print 'Total for each country\n', x.total_cou
print 'Total Genres\n', x.total_gen
print 'Total Sub Genres\n', x.total_sub
print 'Total Genres for each Country\n', x.total_cou_gen
print
print x.db
More information about the Python-list
mailing list