Numpy record array - field names for all dimensions
Robert Kern
robert.kern at gmail.com
Wed Dec 3 16:22:04 EST 2008
ShanMayne wrote:
> Greetings All
Greetings! If you have more numpy questions, you will find numpy-discussion to
be a better forum:
http://www.scipy.org/Mailing_Lists
> I am seeking to represent datasets where each data element is the
> calculated result from several (4 for now) other data types. A matrix-like
> structure (in the general mathematical sense) seems logical, where the
> intersection of each of the 4 values (from different data sets) holds
> the value derived from those 4 values here serving as indexes.
>
> So, each matrix/array element is associated with 4 fields.
> eg:
> matrix element/output value = 24.235 -->
> 'Formula' = 'C12H24O2N2'
> 'Solvent' = 'Acetonitrile'
> 'fragmentation_method' = 'CID'
> 'resolution' = 'unit'
>
> ideally I would like to call the output value by indexing the matrix
> with the input information. eg:
>
> matrix['C12H24O2N2']['Acetonitrile']['CID']['unit'] = 24.235
>
> Numpy's record arrays seemingly don't allow all dimensions to carry
> field names. ie. each column/row carrying a label. Instead fieldname
> usage appears to create a "new dimension" as denoted by square
> brackets.
Pretty much. You can make nested dtypes, but that's not really the data
structure that you want. You probably want a simple dictionary keyed by tuples
of the four values:
    d = {
        ('C12H24O2N2', 'Acetonitrile', 'CID', 'unit'): 24.235,
        # ...
    }
    assert d['C12H24O2N2', 'Acetonitrile', 'CID', 'unit'] == 24.235
If you want to make partial queries (e.g. Formula='C12H24O2N2' and
resolution='unit'), this becomes more like a typical relational database, but
you can probably get along with a few simple functions to loop over the
dictionary and pull out the relevant keys pretty quickly.
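A minimal sketch of such a helper, to make the partial-query idea concrete. The
names Key and select are hypothetical, as are the lowercase field names, and
the second and third entries are placeholder values for illustration only.
Using a namedtuple for the keys is one possible choice, not the only one; a
namedtuple is still a tuple, so plain tuple lookups keep working.

```python
from collections import namedtuple

# Hypothetical key type; the field names follow the original question.
Key = namedtuple("Key", ["formula", "solvent", "fragmentation_method", "resolution"])

d = {
    Key("C12H24O2N2", "Acetonitrile", "CID", "unit"): 24.235,
    # Placeholder entries, invented only to give the query something to filter.
    Key("C12H24O2N2", "Methanol", "CID", "unit"): 1.0,
    Key("C12H24O2N2", "Acetonitrile", "CID", "high"): 2.0,
}


def select(data, **criteria):
    """Return the entries whose key fields equal every given criterion."""
    return {
        key: value
        for key, value in data.items()
        if all(getattr(key, field) == wanted for field, wanted in criteria.items())
    }


# Partial query: fix two of the four fields, leave the other two free.
matches = select(d, formula="C12H24O2N2", resolution="unit")

# Exact lookups with a plain tuple still work.
exact = d["C12H24O2N2", "Acetonitrile", "CID", "unit"]
```

This scans the whole dictionary on every query, which is fine for small
datasets. If the data grows or the queries get more involved, that is probably
the point to move to a real relational database.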
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco