[Numpy-discussion] BOF notes: Fernando's proposal: NumPy ndarray with named axes

Rob Speer rspeer at MIT.EDU
Mon Jul 12 13:52:28 EDT 2010


It's not just about the rows: a 2-D datarray can also index by
columns, an operation that has no equivalent in a 1-D array of records
like your example.

In the movie example, arr.col_named(305) (or, in datarray syntax,
arr.named[:,305], or arr.user.named[305]) contains the movie ratings
for the user with ID 305, still indexed by movie titles. You can't do
that at all with a record array of the form you described, except by
using a list comprehension over the whole array that turns it into
something else.
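
To make the distinction concrete, here is a minimal sketch using plain NumPy and two label dictionaries. It is not the datarray API; the names row_of and col_of are hypothetical, and every title and rating other than the "Wrong Trousers" title is made up for illustration. It assumes rows are movies and columns are user IDs.

```python
import numpy as np

# Hypothetical data: rows are movies, columns are user IDs.
movies = ['Wrong Trousers, The (1993)', 'Toy Story (1995)', 'Alien (1979)']
users = [305, 6, 234]
ratings = np.array([[5.0, 4.0, 3.0],
                    [4.0, 2.0, 5.0],
                    [1.0, 5.0, 4.0]])

# Label -> position maps for each axis (illustrative helpers, not datarray).
row_of = {title: i for i, title in enumerate(movies)}
col_of = {user: j for j, user in enumerate(users)}

# Column lookup: user 305's ratings, still keyed by movie title.
user_305 = dict(zip(movies, ratings[:, col_of[305]]))

# Row lookup: one movie's ratings, keyed by user ID.
trousers = dict(zip(users, ratings[row_of['Wrong Trousers, The (1993)'], :]))
```

Both axes carry labels here, so selecting along either one keeps the labels of the other. That symmetry is the point being argued for, whatever the eventual datarray syntax looks like.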

2-D datarrays and 1-D record arrays may look similar, but they are
very different data structures. In fact, they're probably orthogonal
to each other -- I see no reason one couldn't make a datarray of
records, except for the fact that I wouldn't want to write the __str__
for such a beast.

(Speaking of which, I'm working on a 2-D datarray __str__ based on the
Divisi one. I have to make it support datatypes besides floats,
however.)
-- Rob

On Sun, Jul 11, 2010 at 2:09 PM, Neil Crighton <neilcrighton at gmail.com> wrote:
> Robert Kern <robert.kern <at> gmail.com> writes:
>
>>
>> On Sun, Jul 11, 2010 at 11:36, Rob Speer <rspeer <at> mit.edu> wrote:
>> >> But the utility of named indices is not so clear
>> >> to me. As I understand it, these new arrays will still only be
>> >> able to have a single type of data (one of float, str, int and so
>> >> on). This seems to be pretty limiting.
>>
>> Having ticks on *every* axis is the primary feature there.
>>
>
> I see, thanks.
>
> So for Rob's example slide you could use a record array:
>
> rec = np.rec.fromrecords(data, names='name,305,6,234')
>
> (Here data is a list of tuples, each tuple giving the movie name + its data.)
>
> In this case it's easy to index by field name (rec['305']), but trickier to
> choose the row using the movie name:
>
> ind = dict((n,i) for i,n in enumerate(rec.name))
>
> rec[ind['Wrong Trousers, The (1993)']]
>
> So datarrays would make this easier.
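
For reference, a runnable version of the record-array approach quoted above might look like the following. This is only a sketch: the second movie title and all rating values are invented, since the original slide data is not reproduced in this thread.

```python
import numpy as np

# Hypothetical data: each tuple is (movie name, rating by user 305, 6, 234).
data = [('Wrong Trousers, The (1993)', 5.0, 4.0, 3.0),
        ('Toy Story (1995)', 4.0, 2.0, 5.0)]

# Field names are strings, so the user IDs become '305', '6' and '234'.
rec = np.rec.fromrecords(data, names='name,305,6,234')

# Indexing by field name is direct, but the result is a bare array:
# the movie titles are no longer attached to the values.
col = rec['305']

# Indexing by movie name needs a hand-built name -> row lookup table.
ind = {name: i for i, name in enumerate(rec.name)}
row = rec[ind['Wrong Trousers, The (1993)']]
```

Note the asymmetry: fields are addressable by name out of the box, while rows need the extra dictionary, and that dictionary has to be rebuilt whenever the array is sliced or reordered.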
>


