handling tabular data in python--newbie question

Steve Holden steve at holdenweb.com
Wed Aug 29 09:22:16 EDT 2007


hyena wrote:
> Hi,
>   I started using Python just a few days ago. I am wondering how to store
> and index a table in Python effectively and easily. I think the basic data
> types are not very straightforward for handling a table (e.g., from a CSV
> file or a database).
> 
>   I have a CSV file; its first row holds the column names and the remaining
> rows are data. There are some tens of columns and hundreds of rows in the
> file. I am planning to use the column names as variables to access the data.
> Currently I am thinking of using a dictionary to store this file, but I have
> not figured out an elegant way to start.
> 
>   Any comments and suggestions are welcome. Please forgive me if this
> question is too naive, and yes, I did search Google for a while but did not
> find what I wanted.
> 
> Thanks 
> 
> 
One way would be to store each row as a dictionary. Suppose your data 
file is called "myfile.txt" and, for simplicity, that columns are 
separated by whitespace. Please note the following code is untested.

f = open("myfile.txt", "r")
names = next(f).split()   # first line holds the column names

So now names contains a list of the field names you want to use. Let's 
store the rows in a dictionary of dictionaries, using the first column 
to index each row.

rows = {}
for line in f:
    cols = line.split()
    rdict = dict(zip(names, cols))   # pair each column name with its value
    rows[cols[0]] = rdict            # index the row by its first column
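
To make the resulting structure concrete, here is a small sketch of how rows could be queried once it is built. The sample lines and the column names (id, name, qty) are invented for illustration, and a list of strings stands in for the open file.

```python
# Made-up sample data; a list of lines stands in for the open file.
lines = ["id name qty", "a1 bolt 10", "a2 nut 25"]

names = lines[0].split()
rows = {}
for line in lines[1:]:
    cols = line.split()
    rows[cols[0]] = dict(zip(names, cols))

# Look up one field of one row: first-column value, then column name.
print(rows["a2"]["name"])        # prints nut

# Collect a whole column; values are still strings, so convert as needed.
quantities = [int(r["qty"]) for r in rows.values()]
print(sum(quantities))           # prints 35
```

Note that split() leaves every field as a string, so numeric columns need an explicit int() or float() before arithmetic.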

dict(zip(names, cols)) should create a dictionary where each field is 
stored against its column name. I assume that cols[0] is unique for each 
row, otherwise you will suffer data loss unless you check for that 
circumstance. You can check this kind of thing in the interactive 
interpreter:

 >>> names = ["first", "second", "third"]
 >>> dict(zip(names, [1, 2, 3]))
{'first': 1, 'second': 2, 'third': 3}
 >>>
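
Since the original question concerns a real CSV file rather than whitespace-separated text, the standard library's csv.DictReader may be a better fit: it reads the header row and yields one dictionary per data row. The sketch below also guards against the duplicate-key data loss mentioned above. The function name load_table, the key_column parameter, and the sample data are assumptions made for illustration only.

```python
import csv
import io

def load_table(fileobj, key_column):
    """Read CSV rows into a dict keyed by the value in key_column."""
    table = {}
    for record in csv.DictReader(fileobj):   # first row supplies the field names
        key = record[key_column]
        if key in table:
            # Refuse to silently overwrite an earlier row with the same key.
            raise ValueError("duplicate key %r in column %r" % (key, key_column))
        table[key] = record
    return table

# Invented sample; for a real file, pass an open file object instead,
# e.g. open("myfile.csv", newline="") where the file name is hypothetical.
sample = io.StringIO("id,name,qty\na1,bolt,10\na2,nut,25\n")
table = load_table(sample, "id")
print(table["a1"]["qty"])   # prints 10
```

Raising an error is only one choice; collecting the rows for each key in a list would keep the duplicates instead.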

Another alternative, however, would be to create an object for each row 
where the columns are stored as attributes. This approach would be 
useful if the column names are predictable, but rather less so if each 
of your data files were to use different column names. Let us know and 
if appropriate someone can point you at the "bunch" class.
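
For reference, one common form of the "bunch" idea is sketched below. This is an assumption about which variant is meant; the class simply stores keyword arguments as attributes, and the sample column names and values are invented.

```python
class Bunch(object):
    """Hold arbitrary keyword arguments as attributes."""

    def __init__(self, **kwds):
        self.__dict__.update(kwds)

    def __repr__(self):
        items = ", ".join("%s=%r" % kv for kv in sorted(self.__dict__.items()))
        return "Bunch(%s)" % items

# Invented sample row; column names must be valid Python identifiers
# for attribute access to work.
names = ["id", "name", "qty"]
cols = ["a1", "bolt", "10"]
row = Bunch(**dict(zip(names, cols)))
print(row.name)   # prints bolt
print(row)        # prints Bunch(id='a1', name='bolt', qty='10')
```

This reads nicely as row.name, but it breaks down when a header contains spaces or punctuation, which is one reason the dictionary approach is safer for files with unpredictable column names.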

Welcome to Python!

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden



