Pivot Table/Groupby/Sum question

patrick.waldo at gmail.com patrick.waldo at gmail.com
Fri Jan 4 10:55:36 EST 2008


Petr thanks so much for your input.  I'll try to learn SQL, especially
if I'll do a lot of database work.

I tried to do it John's way as en exercise and I'm happy to say I
understand a lot more.  Basically I didn't realize I could nest
dictionaries like db = {country:{genre:{sub_genre:3}}} and call them
like db[country][genre][sub_genre].  The Python Cookbook was quite
helpful to figure out why items needed to be added the way they did.
Also using the structure of the dictionary was a conceptually easier
solution than what I found on http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/334695.

So, now I need to work on writing it to Excel.  I'll update with the
final code.

Thanks again.

#Movie Store Example
class PivotData:
    def __init__(self):
        self.total_mov = 0
        self.total_cou = {}
        self.total_gen = {}
        self.total_sub = {}
        self.total_cou_gen ={}
        self.db = {}
    def add_data(self,country,genre,sub_genre,value):
        self.total_mov += value
        try:
            self.total_cou[country] += value
        except KeyError:
            self.total_cou[country] = value
        try:
            self.total_gen[genre] += value
        except:
            self.total_gen[genre] = value
        try:
            self.total_sub[sub_genre] += value
        except:
            self.total_sub[sub_genre] = value
        try:
            self.total_cou_gen[country][genre] += value
        except KeyError:
            try:
                self.total_cou_gen[country][genre] = value
            except KeyError:
                self.total_cou_gen[country] = {genre:value}
        try:
            self.db[country][genre][sub_genre] += value
        except KeyError:
            try:
                self.db[country][genre][sub_genre] = value
            except KeyError:
                try:
                    self.db[country][genre] = {sub_genre:value}
                except:
                    self.db[country] = {genre:{sub_genre:value}}

data =  [['argentina','Horror', 'Slasher',4],
         ['argentina','Horror', 'Halloween',6],
         ['argentina','Drama','Romance',5],
         ['argentina','Drama','Romance',1],
         ['argentina','Drama','True Life',1],
         ['japan','Classics','WWII',1],
         ['japan','Cartoons','Anime',1],
         ['america','Comedy','Stand-Up',1],
         ['america','Cartoons','WB',10],
         ['america','Cartoons','WB',3]]

COUNTRY, GENRE, SUB_GENRE, VALUE =range(4)
x=PivotData()
for s in data:
    x.add_data(s[COUNTRY],s[GENRE],s[SUB_GENRE],s[VALUE])
print
print 'Total Movies:\n', x.total_mov
print 'Total for each country\n', x.total_cou
print 'Total Genres\n', x.total_gen
print 'Total Sub Genres\n', x.total_sub
print 'Total Genres for each Country\n', x.total_cou_gen
print
print x.db



More information about the Python-list mailing list