[Numpy-discussion] Setting custom dtypes and 1.14

Chris Barker chris.barker at noaa.gov
Mon Jan 29 12:38:02 EST 2018


On Sat, Jan 27, 2018 at 8:50 PM, Allan Haldane <allanhaldane at gmail.com>
wrote:

> On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote:
>
>>     I thought recarrays were pretty cool back in the day, but pandas is
>>     a much better option.
>>
>>     So I pretty much only use structured arrays for data exchange with C
>>     code....
>>
>> My impression is that this turns into a deprecate recarrays and
>> supporting recfunction issue.
>>
>>

> *should* we have any dataframe-like functionality in numpy?
>
> We get requests every once in a while about how to sort rows, or about
> adding a "groupby" function. I myself have used recarrays in a
> dataframe-like way, when I wanted a quick multiple-array object that
> supported numpy indexing. So there is some demand to have minimal
> "dataframe-like" behavior in numpy itself.
>
> recarrays play part of this role currently, though imperfectly due to
> padding and cache issues. I think I'm comfortable with supporting some
> minor use of structured/recarrays as dataframe-like, with a warning in docs
> that the user should really look at pandas/xarray, and that structured
> arrays are primarily for data exchange.
>
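
For concreteness, here is roughly what that "minimal dataframe-like" use
looks like today with a plain structured array. This is only a sketch: the
field names and values are made up for illustration, and the "groupby" is
just np.unique plus np.bincount, not any dedicated numpy function.

```python
import numpy as np

# Hypothetical sample data; field names and values are invented.
data = np.array(
    [("b", 2, 1.5), ("a", 1, 2.5), ("b", 1, 3.0), ("a", 2, 0.5)],
    dtype=[("key", "U1"), ("n", "i4"), ("x", "f8")],
)

# Sort rows by one or more fields.
sorted_rows = np.sort(data, order=["key", "n"])

# A poor man's "groupby": sum the x field for each distinct key.
keys, inverse = np.unique(data["key"], return_inverse=True)
sums = np.bincount(inverse, weights=data["x"], minlength=len(keys))

for k, s in zip(keys, sums):
    print(k, s)
```

That works for quick jobs, but anything beyond this is where pandas/xarray
earn their keep.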

Well, I think we should either:

Deprecate recarrays -- i.e. explicitly not support DataFrame-like
functionality in numpy, and maintain only the data-exchange functionality.

or

Properly support it -- which doesn't mean re-implementing Pandas or xarray,
but it would mean addressing any bug-like issues like not dealing properly
with padding.
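
To make "padding" concrete, here is a small sketch of how the same two
fields can have different memory layouts. The field names are hypothetical,
and the aligned sizes depend on the platform's C alignment rules, so I'm
only printing them rather than promising specific numbers.

```python
import numpy as np

# Same fields, two layouts: packed (the default) and C-struct aligned.
packed = np.dtype([("flag", "u1"), ("value", "f8")])
aligned = np.dtype([("flag", "u1"), ("value", "f8")], align=True)

# Packed layout: no gaps, so 1 + 8 = 9 bytes per record.
print(packed.itemsize, packed.fields["value"][1])

# Aligned layout: padding bytes are inserted after "flag" so that "value"
# starts on a boundary suitable for a double (platform dependent).
print(aligned.itemsize, aligned.fields["value"][1])
```

Any code that walks over the fields has to cope with those gap bytes, which
is the kind of thing "properly supporting it" would have to get right.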

Personally, I don't need/want it enough to contribute, but if someone does,
great.

This reminds me a bit of the old numpy.matrix issue -- it was ALMOST there,
but not quite, with issues, and there was essentially no overlap between
the people that wanted it and the people that had the time and skills to
really make it work.

> (If we want to dream, maybe one day we should make a minimal multiple-array
> container class. I imagine it would look pretty similar to recarray, but
> stored as a set of arrays instead of a structured array. But maybe
> recarrays are good enough, and let's not reimplement pandas either.)
>

Exactly -- we really don't need to re-implement Pandas....

(except its CSV reading capability :-) )
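
Back to the "dream" container for a second: a set-of-arrays version could
be sketched in a few lines. To be clear, ArrayBundle is a name I made up,
nothing like it exists in numpy, and this is just one possible shape for
the idea (equal-length 1-D columns, numpy indexing applied to all of them).

```python
import numpy as np


class ArrayBundle:
    """Hypothetical container: equal-length 1-D arrays indexed together."""

    def __init__(self, **columns):
        self._cols = {name: np.asarray(col) for name, col in columns.items()}
        if any(col.ndim != 1 for col in self._cols.values()):
            raise ValueError("all columns must be 1-D")
        if len({len(col) for col in self._cols.values()}) > 1:
            raise ValueError("all columns must have the same length")

    def __len__(self):
        return len(next(iter(self._cols.values()), ()))

    def __getitem__(self, key):
        # A string selects one column, like a field of a structured array.
        if isinstance(key, str):
            return self._cols[key]
        # An integer selects one "row", returned as a plain dict of scalars.
        if isinstance(key, (int, np.integer)):
            return {name: col[key] for name, col in self._cols.items()}
        # Slices, boolean masks and index arrays apply to every column.
        return ArrayBundle(**{n: c[key] for n, c in self._cols.items()})


# Usage with made-up data:
bundle = ArrayBundle(t=[0.0, 1.0, 2.0], depth=[5, 10, 15])
deep = bundle[bundle["depth"] > 7]
print(len(deep), deep["t"])
```

Each column stays its own contiguous array, so there is no padding to worry
about, but that is about as far as I'd want numpy itself to go.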

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov