[Numpy-discussion] Proposed record array behavior: the rest of the story

Russell E Owen rowen at u.washington.edu
Tue Jul 20 10:15:05 EDT 2004


At 12:04 PM -0400 2004-07-20, Perry Greenfield wrote:
>...(a detailed summary of proposed changes to numarray record arrays)

+1 on all of it with one exception noted below. This sounds like a 
first-rate overhaul and is much appreciated.

Will it be possible, when creating a new record array, to specify the 
field types as a list of normal numarray types? Currently one has to 
specify the types as a "formats" string, which is nonstandard.

I'm unhappy about one proposal:
>...
>Record array behavior changes:
>...
>5) Field name indexing for record arrays. It will be possible to index
>record arrays with a field name, i.e., if the index is a string, then what
>will be returned is a numarray/chararray for that column. (Note that it
>won't be possible to index record arrays by field number for obvious
>reasons).
>
>I.e. Currently
>
>>>>  col = recArr.field('abc')
>
>Can also be
>
>>>>  col = recArr['abc']
>
>But the current
>
>>>>  col = recArr.field(1)
>
>Cannot become
>
>>>>  col = recArr[1]

I think recarray[field name] is too easily confused with 
recarray[index] and is unnecessary.

I suggest one of two solutions:
- Do nothing. Make users use field(field name or index)
or
- Allow access to the fields via an indexable entity. Simplest for 
the user would be to use "field" itself:
   recArr.field[1]
   recArr.field["abc"]
(i.e. field becomes an object that can be called or can be accessed 
via __getitem__)
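
As a rough sketch of the second option: plain Python lists stand in for 
numarray columns here, and FieldAccessor and RecArr are made-up names for 
illustration, not anything in numarray.

```python
class FieldAccessor:
    """Hypothetical 'field' object: callable and indexable."""

    def __init__(self, names, columns):
        self._names = list(names)      # field names, in column order
        self._columns = list(columns)  # one column per field

    def _lookup(self, key):
        # A string selects a column by field name, anything else by position.
        if isinstance(key, str):
            return self._columns[self._names.index(key)]
        return self._columns[key]

    def __call__(self, key):
        # Keeps the existing recArr.field(name_or_index) spelling working.
        return self._lookup(key)

    def __getitem__(self, key):
        # Adds the suggested recArr.field[name_or_index] spelling.
        return self._lookup(key)


class RecArr:
    """Toy record array that only exposes the 'field' accessor."""

    def __init__(self, names, columns):
        self.field = FieldAccessor(names, columns)


rec_arr = RecArr(["abc", "xyz"], [[1, 2, 3], [4.0, 5.0, 6.0]])
col_by_call = rec_arr.field("abc")    # current call style
col_by_name = rec_arr.field["abc"]    # proposed indexing by name
col_by_index = rec_arr.field[1]       # proposed indexing by field number
```

With this arrangement recArr[...] stays reserved for record (row) indexing, 
so there is nothing to confuse with field access.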

This could easily support index arrays (a topic you brought up and 
that sounds appealing to me):
   recArr.field[index array]
and it might even be practical to support:
   recArr.field[sequence of field indices and/or names]
e.g.
   recArr.field[(ind 1, field name 2, ind 3...)]
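
A sketch of how the mixed-sequence case might look, under the assumption 
that a tuple or list key means "one column per entry". MultiFieldAccessor 
is a hypothetical name and plain lists again stand in for numarray columns.

```python
class MultiFieldAccessor:
    """Hypothetical accessor taking one selector or a sequence of them."""

    def __init__(self, names, columns):
        self._names = list(names)
        self._columns = list(columns)

    def _one(self, key):
        # A string selects by field name, anything else by field number.
        if isinstance(key, str):
            return self._columns[self._names.index(key)]
        return self._columns[key]

    def __getitem__(self, key):
        # A tuple or list is treated as a sequence of field selectors
        # and yields the matching columns in the order requested.
        if isinstance(key, (tuple, list)):
            return [self._one(k) for k in key]
        return self._one(key)


fields = MultiFieldAccessor(
    ["abc", "xyz", "flag"],
    [[1, 2], [3.0, 4.0], [True, False]],
)
selected = fields[(2, "abc")]   # field number and field name mixed
single = fields["xyz"]          # a single selector still returns one column
```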

You asked about other issues. One that comes to mind is record arrays 
of record arrays. Should they be allowed? My gut reaction is yes if 
it's not too hard. Folks always seem to find a use for generality if 
it's offered. On the other hand, if it's hard, it's not worth the 
effort. If they are allowed, users are going to want some efficient 
way to get to a particular field (i.e. in one call even if the field 
is several recArrays deep). That could get messy.
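
One possible shape for such a one-call lookup, purely as an illustration: 
nested dicts stand in for record arrays of record arrays, and 
get_nested_field is a made-up helper, not a proposed numarray function.

```python
from functools import reduce


def get_nested_field(record, path):
    """Follow a sequence of field names down through nested records."""
    # Each step indexes the current container with the next field name.
    return reduce(lambda container, name: container[name], path, record)


nested = {
    "pos": {"x": [1.0, 2.0], "y": [3.0, 4.0]},  # a record array inside a field
    "id": [10, 11],
}
x_col = get_nested_field(nested, ("pos", "x"))   # two levels deep, one call
id_col = get_nested_field(nested, ("id",))       # top-level field
```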

Thanks for a great posting. The improvements to record arrays sound first-rate.

-- Russell



