[Numpy-discussion] Selection of only a certain number of fields
Francesc Alted
faltet at pytables.org
Mon Feb 9 03:34:00 EST 2009
On Sunday 08 February 2009, Neil wrote:
> > The first one (and most important IMO), is that newarr continues to
> > be a structured array (BTW, when was this name changed from the
> > original record array?), and you can use all the features of these
> > beasts with it. Another reason (albeit a bit secondary) is that its
> > data buffer can be shared through the array interface with other
> > applications, or plain C code, in a relatively straightforward way.
> > However, if newarr becomes a list (or dictionary), this is simply
> > not possible.
> >
> > Cheers,
>
> That's not a sample use case ;)
>
> One of the things I love about Python is that it has a small core set
> of features and tries to avoid having many ways to do the same thing.
> This makes it extremely easy to learn. With every new feature,
> numpy gets a little bit harder to learn, there's more to document and
> the code base gets larger and so harder to maintain. In those
> senses, whenever you add a new function/feature to numpy, it gets a
> little bit worse.
Mmm, you have made another good point. Actually, it is not clear to me
that adding more and more functionality to NumPy is a good idea in
every case. For example, lately I was thinking that it would be a good
idea to support column-wise structured arrays (the current ones are
row-wise), but given that they can be trivially reproduced with a
combination of dictionaries and plain arrays, I now think that
implementing them in NumPy does not make much sense.
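To make the column-wise idea concrete, here is a minimal sketch of what I mean by "dictionaries and plain arrays". The `rows` array and its field names and values are made up for illustration only:

```python
import numpy as np

# Row-wise layout: one structured array, the fields of each record are
# stored next to each other in memory (hypothetical sample data).
rows = np.array([("Ana", 31), ("Bob", 42)],
                dtype=[("name", "U10"), ("age", "i4")])

# Column-wise layout: one plain, contiguous array per field, kept
# together in an ordinary dictionary keyed by field name.
cols = {n: np.ascontiguousarray(rows[n]) for n in rows.dtype.names}

mean_age = cols["age"].mean()
```

Each entry of `cols` is an independent copy of one column, so changes to it are not reflected in `rows`.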
Similarly, and as you said, having:
l = [rec[n] for n in ['name', 'age']]
or, if a dictionary is wanted instead:
d = dict((n,rec[n]) for n in ['name', 'age'])
would admittedly cover many of the needs of users. In addition, one can
get a record array easily from the above dictionary:
newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))
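Put together, the idioms above could look roughly like this. The contents of `rec` are invented for the example, and the `list()` calls are there because, as far as I know, Python 3 returns views from `keys()` and `values()` while `names` is expected to be a list, tuple or string:

```python
import numpy as np

# Hypothetical structured array with three fields.
rec = np.array([("Ana", 31, 55.0), ("Bob", 42, 80.5)],
               dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")])

wanted = ["name", "age"]

# Selected fields as a list of plain arrays ...
l = [rec[n] for n in wanted]

# ... or as a dictionary keyed by field name.
d = dict((n, rec[n]) for n in wanted)

# Back to a record array that only carries the selected fields.
newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))
```

Note that this relies on the dictionary keeping insertion order, so the field order of `newrec` follows `wanted` only on Python versions where that is guaranteed.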
Having said that, I still see some value in implementing
arr[['name', 'age']], but frankly, I'm not so sure now whether this
idiom would be much better than:
d = dict((n,rec[n]) for n in ['name', 'age'])
newrec = np.rec.fromarrays(list(d.values()), names=list(d.keys()))
or than the already implemented drop_fields() function in
np.lib.recfunctions.
So, I'm +0 on the proposal now.
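For comparison, the drop_fields() route mentioned above could be sketched as follows. Again, `rec` and its contents are made up, and `usemask=False` is my choice here to get a plain (non-masked) array back:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical structured array with three fields.
rec = np.array([("Ana", 31, 55.0), ("Bob", 42, 80.5)],
               dtype=[("name", "U10"), ("age", "i4"), ("weight", "f8")])

# Instead of naming the fields to keep, name the ones to drop.
newrec = rfn.drop_fields(rec, ["weight"], usemask=False)
```

The difference from the dictionary idiom is mostly one of direction: here you list what you do not want, which is handy when the array has many fields and only a few need to go.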
> So I think it would be nice to have some concrete examples of what
> the new feature will be useful for, just to show how it outweighs
> those negatives. As a bonus, they'd provide nice examples to put in
> the documentation :).
Yeah, I completely agree that this would be a nice exercise to do: for
every newly requested feature, first check whether it can be done easily
with a combination of the existing tools of Python and NumPy.
That would lead to a simple and powerful NumPy.
> PS. Thanks for your work on pytables! I've used it quite a bit,
> mostly for reading hdf5 files.
My pleasure.
--
Francesc Alted