Yet Another Tabular Data Question

carllemp at gmail.com carllemp at gmail.com
Thu Nov 29 22:22:35 EST 2007


On Nov 29, 5:46 pm, patrick.wa... at gmail.com wrote:
> Hi all,
>
> Fairly new Python guy here.  I am having a lot of trouble trying to
> figure this out.  I have some data on some regulations in Excel and I
> need to basically add up the total regulations for each country--a
> statistical analysis thing that I'll copy to another Excel file.
> Writing with pyExcelerator has been easier than reading with xlrd for
> me...So that's what I did first, but now I'd like to learn how to
> crunch some data.
>
> The input looks like this:
>
> Country         Module
> Topic                      # of Docs
> Argentina       Food and Consumer Products      Cosmetics                1
> Argentina       Food and Consumer Products      Cosmetics                8
> Argentina       Food and Consumer Products      Food Additives          1
> Argentina       Food and Consumer Products      Food Additives          1
> Australia       Food and Consumer Products      Drinking Water           7
> Australia       Food and Consumer Products      Food Additives           3
> Australia       Food and Consumer Products      Food Additives           1
> etc...
>
> So I need to add up all the docs for Argentina, Australia, etc...and
> add up the total amount for each Topic for each country so, Argentina
> has 9 Cosmetics laws and 2 Food Additives Laws, etc...
>
> So, here is the reduced code that can't add anything...Any thoughts
> would be really helpful.
>
> import xlrd
> import pyExcelerator
> from pyExcelerator import *
>
> #Open Excel files for reading and writing
> path_file = "c:\\1\\data.xls"
> book = xlrd.open_workbook(path_file)
> Counts = book.sheet_by_index(1)
> wb=pyExcelerator.Workbook()
> matrix = wb.add_sheet("matrix")
>
> #Get all Excel data
> n=1
> data = []
> while n<Counts.nrows:
>     data.append(Counts.row_values(n, start_colx=0, end_colx=None))
>     n=n+1
>
> COUNTRY, MODULE, TOPIC,DOCS = range(4)
> COUNTRY_TOT = []
> n=0
> while n<len(data):
>     x=n
>     while data[n][COUNTRY]==data[n+1][COUNTRY]:
>         n=n+1
>     print sum(data[x:n][FT_DOCS])
>
> wb.save('c:\\1\\matrix.xls')

Considering the topic of the usenet group, I know this is heresy but
I'd suggest using the Pivot Table feature in Excel.  The whole thing
will be done if 5 clicks and no code.  Simple is better than complex.



More information about the Python-list mailing list