[Numpy-discussion] Selection of only a certain number of fields

Mon Feb 9 03:34:00 EST 2009

On Sunday 08 February 2009, Neil wrote:
> > The first one (and most important IMO) is that newarr continues to
> > be a structured array (BTW, when was this name changed from the
> > original record array?), and you can use all the features of these
> > beasts with it.  Another reason (albeit a bit secondary) is that its
> > data buffer can be shared through the array interface with other
> > applications, or plain C code, in a relatively straightforward way.
> >  However, if newarr becomes a list (or dictionary), this is simply
> > not possible.
> >
> > Cheers,
>
> That's not a sample use case ;)
>
> One of the things I love about Python is that it has a small core set
> of features and tries to avoid having many ways to do the same thing.
>  This makes it extremely easy to learn.  With every new feature,
> numpy gets a little bit harder to learn, there's more to document and
> the code base gets larger and so harder to maintain.  In those
> senses, whenever you add a new function/feature to numpy, it gets a
> little bit worse.

Mmm, you have made another good point.  Actually, it is not very clear
to me that adding more functionality to NumPy is going to be a good
idea in every case.  For example, lately I was thinking that it
would be a good idea to support column-wise structured arrays (the
current ones are row-wise), but given that they can be trivially
reproduced with a combination of dictionaries and plain arrays, I now
think that implementing that in NumPy does not make much sense.
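
To make the row-wise versus column-wise distinction concrete, here is a minimal sketch. The field names and sample values are made up for illustration, and the dictionary of plain arrays is just one possible way to emulate a column-wise layout:

```python
import numpy as np

# Column-wise layout: one plain, contiguous array per field
# (hypothetical sample data).
columns = {
    "name": np.array(["Ann", "Bob", "Cy"]),
    "age": np.array([31, 45, 27]),
    "weight": np.array([61.5, 82.0, 70.2]),
}

# Row-wise layout: a single structured (record) array whose records
# interleave all the fields in memory.
rows = np.rec.fromarrays(list(columns.values()), names=list(columns.keys()))

# Selecting a column is a dictionary lookup in one case and a field
# access in the other; the values are the same.
assert (columns["age"] == rows["age"]).all()

# In the row-wise array, consecutive 'age' values are one full record
# apart, while in the column-wise layout they are adjacent.
print(rows["age"].strides, columns["age"].strides)
```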

Similarly, and as you said, having:

l = [rec[n] for n in ['name', 'age']]

or, if a dictionary is wanted instead:

d = dict((n,rec[n]) for n in ['name', 'age'])

would admittedly cover many of the needs of users.  In addition, one can 
get a record array easily from the above dictionary:

newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))
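
Put together, the idioms above might look like the following sketch. The array `rec`, its dtype and its values are assumptions made up for the example. The explicit `list(...)` calls are there because, on Python 3, `d.keys()` is a view rather than a list:

```python
import numpy as np

# Hypothetical structured array with three fields.
rec = np.array(
    [("Ann", 31, 61.5), ("Bob", 45, 82.0)],
    dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")],
)

wanted = ["name", "age"]

# A list of per-field arrays; each one is a view into rec's buffer.
l = [rec[n] for n in wanted]

# The same selection as a dictionary keyed by field name.
d = dict((n, rec[n]) for n in wanted)

# Back to a record array holding only the wanted fields.
newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))

print(newrec.dtype.names)
print(newrec.age)
```

Note that, as far as I can tell, `newrec` owns a fresh copy of the data, whereas the entries of `l` and `d` still share memory with `rec`.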

Having said that, I still see some value in implementing 
arr[['name', 'age']], but frankly, I'm not so sure now whether this 
idiom would be much better than:

d = dict((n,rec[n]) for n in ['name', 'age'])
newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))

or than the already implemented drop_fields() function in 
np.lib.recfunctions.
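
For comparison, a sketch of the drop_fields() route: select fields by dropping the complement. The array `arr` and the `wanted` list are assumptions for illustration; `usemask=False` is passed so that a plain ndarray comes back instead of a masked array:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical structured array with three fields.
arr = np.array(
    [("Ann", 31, 61.5), ("Bob", 45, 82.0)],
    dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")],
)

# Keep 'name' and 'age' by dropping every other field.
wanted = ["name", "age"]
unwanted = [n for n in arr.dtype.names if n not in wanted]
kept = rfn.drop_fields(arr, unwanted, usemask=False)

print(kept.dtype.names)
print(kept["age"])
```

The result is still a structured array, so field access such as `kept["age"]` keeps working, and the original `arr` is left untouched.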

So, I'm +0 on the proposal now.

> So I think it would be nice to have some concrete examples of what
> the new feature will be useful for, just to show how it outweighs
> those negatives.  As a bonus, they'd provide nice examples to put in
> the documentation :).

Yeah, I completely agree that this would be a nice exercise to do: for
every newly requested feature, first check whether it can be done easily
with a combination of the current Python and NumPy weaponry.
That would lead to a simple and powerful NumPy.

> PS.  Thanks for your work on pytables!  I've used it quite a bit,
> mostly for reading hdf5 files.

My pleasure.

-- 
Francesc Alted