How to store "3D" data? (data structure question)

Graham Fawcett graham.fawcett at gmail.com
Wed Jul 20 13:47:50 EDT 2005


Sebastian Bassi wrote:
> Hello,
>
> I have to parse a text file (was excel, but I translated to CSV) like
> the one below, and I am not sure how to store it (to manipulate it
> later).
>
> Here is an extract of the data:
>
[snip]

This looks a lot like 2D data (row/column), not 3D. What's the third
axis? It also looks like you're less interested in storage than in
analysis...

Since your "line" columns all have names, why not use them as keys in a
dictionary? The associated values would be lists, in which you could
keep references to matching rows, or parts of those rows (e.g. name and
allele). Take the length of each list, and you have your "number of
matches" for that line.



import csv                              # let Python do the grunt work

f = open('name-of-file.csv')            # open() also works where file() is gone
reader = csv.reader(f)

headers = next(reader)                  # read the first row
line_names = headers[2:]

results = {}                            # set up the dict
for lname in line_names:                # each key is a line-name
    results[lname] = []

for row in reader:                      # iterate the data rows
    row_name, allele = row[:2]
    line_values = row[2:]               # get the line values.
    # zip is your friend here. It lets you iterate
    # across your line names and corresponding values
    # in parallel.
    for lname, value in zip(line_names, line_values):
        if value == '*':
            results[lname].append((row_name, allele))

# a quick look at the results.
for lname, matches in results.items():
    print('%s %d' % (lname, len(matches)))
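
If you want to try the idea without the real file, here is a minimal
self-contained sketch of the same dict-of-lists approach, written for
Python 3. The sample data is an assumption: the original extract was
snipped, so the column names (name, allele, lineA, lineB, lineC) and the
values are invented for illustration only. The helper name
collect_matches is hypothetical as well.

```python
import csv
import io

# Hypothetical sample data: the real layout was snipped from the original
# message, so these column names and values are invented for illustration.
SAMPLE_CSV = """name,allele,lineA,lineB,lineC
m1,a1,*,,*
m1,a2,,*,
m2,a1,*,*,
"""


def collect_matches(csv_file, marker='*'):
    """Map each line name to the (row_name, allele) pairs marked in its column."""
    reader = csv.reader(csv_file)
    headers = next(reader)              # first row holds the column names
    line_names = headers[2:]            # everything after name and allele
    results = {lname: [] for lname in line_names}
    for row in reader:
        row_name, allele = row[:2]
        # Pair each line name with the value in the matching column.
        for lname, value in zip(line_names, row[2:]):
            if value == marker:
                results[lname].append((row_name, allele))
    return results


results = collect_matches(io.StringIO(SAMPLE_CSV))
for lname, matches in results.items():
    print('%s %d' % (lname, len(matches)))
```

With a real file you would pass an open file object instead of the
io.StringIO wrapper, e.g. open('name-of-file.csv', newline=''). Keeping
the (row_name, allele) tuples rather than a bare count means you can
still ask which rows matched a given line later on.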


Graham