[Numpy-discussion] Proposed record array behavior: the rest of the story
Colin J. Williams
cjw at sympatico.ca
Thu Jul 22 05:22:01 EDT 2004
Francesc Alted wrote:
>Hi,
>
>I think the numarray team's overhaul of RecArray access modes is very good,
>and I agree with most of it.
>
>On Tuesday 20 July 2004 19:14, Russell E Owen wrote:
>
>
>>I think recarray[field name] is too easily confused with
>>recarray[index] and is unnecessary.
>>
>>
>
>Yeah, maybe you are right.
>
>
>
>>I suggest one of two solutions:
>>- Do nothing. Make users use field(field name or index)
>>or
>>- Allow access to the fields via an indexable entity. Simplest for
>>the user would be to use "field" itself:
>> recArr.field[1]
>> recArr.field["abc"]
>>(i.e. field becomes an object that can be called or can be accessed
>>via __getitem__)
>>
>>
>
>I prefer the second one. Although I know that you don't like the __getattr__
>method, the field object can be used to host one. The main advantage I see
>in having such a __getattr__ method is that I'm very used to pressing TAB
>twice in the Python console with its completion capabilities activated. It
>would be a very nice way of interactively discovering the fields of a
>RecArray object. I don't know whether this feature is used a lot out there,
>but for me it is just great. I understand, however, that having to include a
>map to support field names that are not valid Python names can be quite
>inconvenient.
>
>Regards,
>
>
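For reference, Russell's second option could look roughly like the following. This is only a sketch in plain Python: FieldAccessor and RecArraySketch are made-up names, the columns are ordinary lists, and nothing here is the numarray implementation.

```python
class FieldAccessor:
    """Hypothetical accessor: callable and indexable, by name or position."""

    def __init__(self, names, columns):
        self._names = list(names)
        self._columns = list(columns)

    def __getitem__(self, key):
        # A string is treated as a field name, anything else as a position.
        if isinstance(key, str):
            if key not in self._names:
                raise KeyError(key)
            key = self._names.index(key)
        return self._columns[key]

    def __call__(self, key):
        # field("abc") keeps working alongside field["abc"].
        return self[key]


class RecArraySketch:
    """Stand-in for a RecArray; columns are plain lists here."""

    def __init__(self, names, columns):
        self.field = FieldAccessor(names, columns)


recArr = RecArraySketch(["id", "abc"], [[1, 2, 3], [0.5, 1.5, 2.5]])
print(recArr.field[1])       # by position
print(recArr.field["abc"])   # by name, same column
print(recArr.field("abc"))   # call form
```

Since rows are still reached through recArr[index], field names and row indices never share the same brackets.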
Perry's issue 3.
Perhaps there is a need to separate the name or identifier of a column
in a RecArray or a field in a Record from its label. The labels, for
display purposes, would default to the column names. The column names
would default, as at present, to the Cn form.
I like the use of attributes for the column names; it avoids the problem
Russell Owen mentioned above.
Suppose we have a simple RecArray with the fields "name" and "age". It is
much simpler to write rec.name or rec.age than rec["name"] or rec["age"].
The problems with the use of attributes, which must be Python names, are
(1) they cannot have accented or special characters (e.g. é, ç, @, &, *)
and (2) there is a danger of conflict with existing properties or
attributes. My guess is that the special characters would be required
primarily for display purposes. Thus, the label could meet that need.
The danger of conflict could be addressed by raising an exception.
There remains a possible problem where identifiers are passed on from
some other system, perhaps a database.
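To make the name/label split concrete, here is a rough sketch. LabelledRecArray and its label() method are invented for illustration and are not part of numarray; columns are plain lists. Column names must be Python identifiers, a name that clashes with an existing attribute raises an exception, and the label defaults to the column name.

```python
import keyword


class LabelledRecArray:
    """Hypothetical record array: columns as attributes, labels for display."""

    def __init__(self, columns, labels=None):
        labels = labels or {}
        for name in columns:
            # Column names must be usable after a dot.
            if not name.isidentifier() or keyword.iskeyword(name):
                raise ValueError(f"{name!r} is not a valid Python identifier")
            # Refuse names that would shadow an existing attribute or method.
            if hasattr(type(self), name):
                raise ValueError(f"{name!r} conflicts with an existing attribute")
        self._columns = dict(columns)
        # A label defaults to the column name.
        self._labels = {name: labels.get(name, name) for name in columns}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)

    def __dir__(self):
        # Expose column names to dir(), and so to TAB completion.
        return sorted(set(super().__dir__()) | set(self._columns))

    def label(self, name):
        return self._labels[name]


rec = LabelledRecArray(
    {"name": ["Ann", "Bob"], "age": [31, 45]},
    labels={"age": "Âge (années)"},
)
print(rec.age)            # attribute access to a column
print(rec.label("age"))   # display label with accented characters
print(rec.label("name"))  # falls back to the column name
```

Identifiers arriving from another system would still need some mapping step before they reach such a constructor; the sketch does not attempt that.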
Thus, the primary identifier of a row in a RecArray would be an integer
index and that of a column or field would be a standard Python
identifier. At times, though, it would be useful to be able to index
the individual fields (or columns) as part of the usual indexing
scheme. Thus rec[2, 3, 4] could identify a record, and rec[2, 3, 4].age
or rec[2, 3, 4, 5] could identify the sixth field in that record.
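A small sketch of that indexing rule, assuming one extra trailing index selects a field by position. RecGridSketch and Person are hypothetical names, and the example uses a 1 x 2 array of two-field records rather than the larger case above, just to keep it short.

```python
from collections import namedtuple

# A record type with named fields; a namedtuple also allows positional access.
Person = namedtuple("Person", ["name", "age"])


class RecGridSketch:
    """Hypothetical N-dimensional array of records, stored in a dict."""

    def __init__(self, shape, records):
        self.shape = tuple(shape)
        self._records = dict(records)  # keyed by full index tuple

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        ndim = len(self.shape)
        if len(index) == ndim:
            # As many indices as dimensions: return the record itself.
            return self._records[index]
        if len(index) == ndim + 1:
            # One extra index: treat it as the field position in the record.
            record = self._records[index[:-1]]
            return record[index[-1]]
        raise IndexError(f"expected {ndim} or {ndim + 1} indices, got {len(index)}")


rec = RecGridSketch((1, 2), {(0, 0): Person("Ann", 31), (0, 1): Person("Bob", 45)})
print(rec[0, 1])       # the whole record
print(rec[0, 1].age)   # field by name
print(rec[0, 1, 1])    # same field by position
```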
The use of attributes raises the possibility that one could have nested
records. For example, suppose one has an address record:
    addressRecord
        streetNumber
        streetName
        postalCode
        ...
There could then be a personal record:
    personRecord
        ...
        officeAddress
        homeAddress
        ...
One could address a component as rec.homeAddress.postalCode.
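In plain Python the nesting might be sketched with dataclasses, as below. The field types and sample values are my assumptions, and the fields elided with "..." above are simply left out; this shows only the attribute chain, not how a RecArray would store such records.

```python
from dataclasses import dataclass


@dataclass
class AddressRecord:
    # Field types are assumed for the example.
    streetNumber: int
    streetName: str
    postalCode: str


@dataclass
class PersonRecord:
    # Only the two address fields named above are modelled here.
    officeAddress: AddressRecord
    homeAddress: AddressRecord


rec = PersonRecord(
    officeAddress=AddressRecord(100, "Main Street", "K1A 0B1"),
    homeAddress=AddressRecord(7, "Elm Avenue", "M5V 2T6"),
)
print(rec.homeAddress.postalCode)  # component reached through two attributes
```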
Finally, there was mention, earlier in the discussion, of facilitating
the indexing of a RecArray. I hope that some way will be found to do this.
Colin W.