[Numpy-discussion] Adding the ability to "clone" a few fields from a data-type

Robert Kern robert.kern at gmail.com
Wed Oct 29 20:26:36 EDT 2008


On Wed, Oct 29, 2008 at 19:05, Travis E. Oliphant
<oliphant at enthought.com> wrote:
>
> Hi all,
>
> I'd like to add to NumPy the ability to clone a data-type object so that
> only a few fields are copied over but it retains the same total size.
>
> This would allow, for example, the ability to "select out a few records"
> from a structured array using
>
> subarr = arr.view(cloned_dtype)
>
> Right now, it is hard to do this because you have to at least add a
> "dummy" field at the end.  A simple method on the dtype class
> (fromfields or something) would be easy to add.

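For reference, the dummy-field workaround mentioned in the quoted text can be sketched roughly as follows. The array, its field names, and its types are made up for illustration; the only point is that the padding must be spelled out by hand as a void field so that the total item size still matches.

```python
import numpy as np

# Hypothetical structured array with three fields (packed, 4 + 8 + 2 = 14 bytes).
arr = np.zeros(3, dtype=[('field1', 'i4'), ('field2', 'f8'), ('field3', 'i2')])

# Keep field1 and field2; cover field3's two bytes with an explicit dummy
# void field so the item size of the new dtype matches the original.
cloned_dtype = np.dtype([('field1', 'i4'), ('field2', 'f8'), ('dummy', 'V2')])

subarr = arr.view(cloned_dtype)

# The dummy field is still visible: it shows up as a numpy.void scalar
# when a record is converted to a tuple.
record = tuple(subarr[0])
```

Here `record` has three entries, the last of which is the void padding, which is the visibility question raised in the reply below.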
I'm not sure what this accomplishes. Would the dummy fields that fill
in the space be inaccessible? E.g. tuple(subarr[i,j,k]) gives a tuple
with no numpy.void scalars? That would be a novel feature, but I'm not
sure it fits the problem. On the contrary:

> It was thought in the past to do this with indexing
>
> arr['field1', 'field2']
>
> And that would still be possible (and mostly implemented) if this
> feature is added.

This appears more like the interface that people want. Except that I
think people were thinking that it would follow fancy indexing syntax:

  arr[['field1', 'field2']]

I guess there are two ways to implement this. One is to make a new
array that just contains the desired fields. Another is to make a view
that just points to the desired fields in the original array provided
that we have a new feature for inaccessible dummy fields. One point
for the former approach is that it is closer to fancy indexing which
must always make a copy. The latter approach breaks that connection.
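
The two approaches can be contrasted with a small sketch. It assumes a NumPy version whose dict-style dtype specification accepts 'offsets' and 'itemsize' keys, which stands in here for the inaccessible-padding feature; copy_fields and view_fields are hypothetical helper names, not existing NumPy functions.

```python
import numpy as np

def copy_fields(arr, names):
    """Approach 1: build a new, packed array holding only the named fields."""
    packed = np.dtype([(n, arr.dtype.fields[n][0]) for n in names])
    out = np.empty(arr.shape, dtype=packed)
    for n in names:
        out[n] = arr[n]  # data is copied field by field
    return out

def view_fields(arr, names):
    """Approach 2: view the same memory through a dtype that keeps the
    original offsets and item size but names only the selected fields."""
    padded = np.dtype({
        'names': list(names),
        'formats': [arr.dtype.fields[n][0] for n in names],
        'offsets': [arr.dtype.fields[n][1] for n in names],
        'itemsize': arr.dtype.itemsize,
    })
    return arr.view(padded)

# Hypothetical example data.
arr = np.zeros(4, dtype=[('field1', 'i4'), ('field2', 'f8'), ('field3', 'i2')])
arr['field1'] = np.arange(1, 5)

names = ['field1', 'field3']
copied = copy_fields(arr, names)
viewed = view_fields(arr, names)

# Writing to the original is visible through the view but not the copy.
arr['field1'][0] = 99
```

With this layout the bytes of field2 are simply not named in the view's dtype, so tuple(viewed[0]) has only two entries and no void scalar.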

OTOH, now that I think about it, I don't think there is really any
coherent way to mix field selection with any other indexing
operations. At least, not within the same brackets. Hmm. So maybe the
link to fancy indexing can be ignored as, ahem, fanciful.

Overall, I guess, I would present the feature slightly differently.
Provide a kind of inaccessible and invisible dtype for implementing
dummy fields. This is useful in other places like file parsing. At the
same time, implement a function that uses this capability to make
views with a subset of the fields of a structured array. I'm not sure
that people need an API for replacing the fields of a dtype like this.
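
As an illustration of the file-parsing use, here is a rough sketch that reads fixed-size binary records while skipping reserved bytes. The record layout (a 4-byte id, 6 reserved bytes, an 8-byte float) is invented for the example, and the skipping is expressed with explicit offsets and an item size in the dtype, assuming a NumPy version that accepts those keys.

```python
import struct
import numpy as np

# Hypothetical on-disk layout, little-endian, 18 bytes per record:
#   int32 id | 6 reserved bytes | float64 value
raw = b''.join(struct.pack('<i6xd', i, i * 0.5) for i in range(3))

# Name only the interesting fields; the reserved bytes are covered by the
# gap between the offsets rather than by a visible dummy field.
record_dtype = np.dtype({
    'names': ['id', 'value'],
    'formats': ['<i4', '<f8'],
    'offsets': [0, 10],
    'itemsize': 18,
})

data = np.frombuffer(raw, dtype=record_dtype)
ids = data['id'].tolist()
values = data['value'].tolist()
```

The reserved bytes never appear as a field of data, so code that iterates over data.dtype.names sees only id and value.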

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


