[Numpy-discussion] Proposed record array behavior: the rest of the story

Colin J. Williams cjw at sympatico.ca
Thu Jul 22 05:22:01 EDT 2004


Francesc Alted wrote:

>Hi,
>
>I agree that the numarray team's overhaul of RecArray access modes is very
>good, and I agree with most of it.
>
>On Tuesday 20 July 2004 19:14, Russell E Owen wrote:
>  
>
>>I think recarray[field name] is too easily confused with 
>>recarray[index] and is unnecessary.
>>    
>>
>
>Yeah, maybe you are right.
>
>  
>
>>I suggest one of two solutions:
>>- Do nothing. Make users use field(field name or index)
>>or
>>- Allow access to the fields via an indexable entity. Simplest for 
>>the user would be to use "field" itself:
>>   recArr.field[1]
>>   recArr.field["abc"]
>>(i.e. field becomes an object that can be called or can be accessed 
>>via __getitem__)
>>    
>>
>
>I prefer the second one. Although I know that you don't like the __getattr__
>method, the field object can be used to host one. The main advantage I see in
>having such a __getattr__ method is that I'm very used to pressing TAB twice in
>the Python console with its completion capabilities activated. It would be a
>very nice way of interactively discovering the fields of a RecArray object.
>I don't know whether this feature is used a lot or not out there, but for me
>it is just great.  I understand, however, that having to include a map to
>support non-valid Python names for field names can be quite inconvenient.
>
>Regards,
>  
>
Perry's issue 3.

Perhaps there is a need to separate the name or identifier of a column 
in a RecArray or a field in a Record from its label.  The labels, for 
display purposes, would default to the column names.  The column names 
would default, as at present, to the Cn form.
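
A minimal sketch of that separation, in plain Python rather than numarray. The 
helper names default_names and make_labels are hypothetical, and the lower-case 
"c1", "c2", ... spelling of the Cn form is an assumption:

```python
def default_names(n_columns):
    """Return default column names in a Cn-style form: c1, c2, ..."""
    return ["c%d" % (i + 1) for i in range(n_columns)]


def make_labels(names, labels=None):
    """Map each column name to a display label.

    Any column without an explicit label falls back to its own name.
    """
    labels = labels or {}
    return {name: labels.get(name, name) for name in names}


names = default_names(3)
# Only the second column gets a display label; the text is made up.
labels = make_labels(names, {"c2": "Âge (années)"})
print(names)
print(labels)
```

The name stays a plain identifier, while the label is free to carry accents, 
spaces or punctuation.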

I like the use of attributes for the column names; it avoids the problem 
Russell Owen mentioned above.
Suppose we have a simple RecArray with the fields "name" and "age". It is 
much simpler to write rec.name or rec.age than rec["name"] or rec["age"].
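
To make the comparison concrete, here is a toy sketch of both access styles. 
SimpleRecArray is a hypothetical stand-in written for this message, not the 
numarray class, and the sample values are made up:

```python
class SimpleRecArray:
    """Toy column store; NOT the numarray RecArray class."""

    def __init__(self, **columns):
        # Each keyword argument is one column, stored under its name.
        self._columns = dict(columns)

    def __getitem__(self, key):
        # A string key selects a column; any other key selects a row.
        if isinstance(key, str):
            return self._columns[key]
        return {name: col[key] for name, col in self._columns.items()}

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self.__dict__["_columns"][name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # List the column names so that TAB completion can find them.
        return list(super().__dir__()) + list(self._columns)


rec = SimpleRecArray(name=["Ann", "Bob"], age=[31, 45])
print(rec.age)      # attribute access to a column
print(rec["age"])   # item access to the same column
print(rec[1])       # item access to a row
```

Note that rec["age"] and rec[1] share one __getitem__, which is the overlap 
Russell objected to; the attribute form does not have it.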

The problems with the use of attributes, which must be Python names, are 
(1) they cannot have accented or special characters, e.g. é, ç, @, &, * 
etc., and (2) there is a danger of conflict with existing properties or 
attributes.  My guess is that the special characters would be required 
primarily for display purposes.  Thus, the label could meet that need.

The danger of conflict could be addressed by raising an exception.  
There remains a possible problem where identifiers are passed on from 
some other system, perhaps a database. 
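
A sketch of such a check follows. FieldNameError, check_field_names and the 
RESERVED set are hypothetical names chosen for illustration. One caveat: 
str.isidentifier follows Python 3 rules, under which accented letters such as 
é are legal in identifiers, while @, & and * are still rejected:

```python
import keyword


class FieldNameError(ValueError):
    """Raised when a field name cannot be used as an attribute."""


# Hypothetical attributes that the record class already defines.
RESERVED = {"field", "shape"}


def check_field_names(names, reserved=RESERVED):
    """Return the names as a list, or raise FieldNameError on the first bad one."""
    seen = set()
    for name in names:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise FieldNameError("%r is not a valid Python identifier" % name)
        if name in reserved:
            raise FieldNameError("%r conflicts with an existing attribute" % name)
        if name in seen:
            raise FieldNameError("%r is used more than once" % name)
        seen.add(name)
    return list(names)


print(check_field_names(["name", "age"]))
for bad in (["a&b"], ["shape"], ["class"]):
    try:
        check_field_names(bad)
    except FieldNameError as exc:
        print("rejected:", exc)
```

Names arriving from a database could be run through the same check before the 
RecArray is built, so that a conflict shows up early.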

Thus, the primary identifier of a row in a RecArray would be an integer 
index and that of a column or field would be a standard Python 
identifier.  At times, though, it would be useful to be able to index 
the individual fields (or columns) as part of the usual indexing 
scheme.  Thus rec[2, 3, 4] could identify a record and rec[2, 3, 4].age 
or rec[2, 3, 4, 5] could identify the sixth field in that record.
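
One way to read that indexing scheme is sketched below. RecordGrid is a 
hypothetical toy class, it assumes a three-dimensional array of records, and 
the stored values are made up:

```python
class RecordGrid:
    """Toy N-dimensional grid of records, keyed by index tuple."""

    def __init__(self, ndim, records):
        self.ndim = ndim
        # Mapping of index tuple -> tuple of field values.
        self._records = dict(records)

    def __getitem__(self, key):
        # Expects a tuple of indices, as produced by grid[i, j, k].
        if len(key) == self.ndim:
            # As many indices as dimensions: the whole record.
            return self._records[key]
        if len(key) == self.ndim + 1:
            # One extra index: a single field within that record.
            return self._records[key[:self.ndim]][key[self.ndim]]
        raise IndexError("expected %d or %d indices" % (self.ndim, self.ndim + 1))


grid = RecordGrid(3, {(2, 3, 4): ("a", "b", "c", "d", "e", "f")})
print(grid[2, 3, 4])      # the record
print(grid[2, 3, 4, 5])   # its sixth field
```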

The use of attributes raises the possibility that one could have nested 
records.  For example, suppose one has an address record:

addressRecord
   streetNumber
   streetName
   postalCode
   ...

There could then be a personal record:
personRecord
   ...
   officeAddress
   homeAddress
   ...

One could address a component as rec.homeAddress.postalCode.
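
The same addressing can be sketched today with the standard library's 
namedtuple. This only illustrates the attribute chain, not a RecArray 
implementation; the fields elided above are left out and the sample values 
are made up:

```python
from collections import namedtuple

# Only the fields named above; the elided ones are omitted.
AddressRecord = namedtuple(
    "AddressRecord", ["streetNumber", "streetName", "postalCode"]
)
PersonRecord = namedtuple("PersonRecord", ["officeAddress", "homeAddress"])

rec = PersonRecord(
    officeAddress=AddressRecord(1, "Main Street", "A1A 1A1"),
    homeAddress=AddressRecord(22, "Elm Avenue", "B2B 2B2"),
)

print(rec.homeAddress.postalCode)
```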

Finally, there was mention, earlier in the discussion, of facilitating 
the indexing of a RecArray.  I hope that some way will be found to do this.

Colin W.




