[Numpy-discussion] deprecate fromstring() for text reading?

Chris Barker - NOAA Federal chris.barker at noaa.gov
Fri Oct 23 20:22:55 EDT 2015


Grabbing the pandas csv reader would be great, and I hope it happens sooner
than later, though alas, I haven't the spare cycles for it either.

In the meantime though, can we add a DeprecationWarning when
fromstring() is used on text files? It's really pretty broken.
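
For concreteness, here is a rough sketch of what such a warning could look
like. It is only an illustration, not a proposed patch: the wrapper name
fromstring_with_warning and the warning text are made up, and it assumes
that "text mode" simply means a non-empty sep. Binary input is routed
through np.frombuffer, which reads raw bytes without any text parsing.

```python
import warnings

import numpy as np


def fromstring_with_warning(string, dtype=float, count=-1, sep=""):
    """Hypothetical wrapper: warn on text parsing, pass binary data through."""
    if sep != "":
        # Text mode: a non-empty separator means the input is parsed as text.
        warnings.warn(
            "fromstring() text parsing is deprecated; "
            "use a dedicated text reader instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        return np.fromstring(string, dtype=dtype, count=count, sep=sep)
    # Binary mode: interpret the raw bytes directly. frombuffer returns a
    # read-only view of the input, so copy to get a writable array.
    return np.frombuffer(string, dtype=dtype, count=count).copy()
```

With something like this, reading binary data keeps working silently, and
only calls that pass a separator would trigger the warning.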

-Chris

On Oct 23, 2015, at 4:02 PM, Jeff Reback <jeffreback at gmail.com> wrote:



On Oct 23, 2015, at 6:49 PM, Nathaniel Smith <njs at pobox.com> wrote:

On Oct 23, 2015 3:30 PM, "Jeff Reback" <jeffreback at gmail.com> wrote:
>
> On Oct 23, 2015, at 6:13 PM, Charles R Harris <charlesr.harris at gmail.com>
wrote:
>
>>
>>
>> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal <
chris.barker at noaa.gov> wrote:
>>>
>>>
>>>> I think it would be good to keep the usage to read binary data at
least.
>>>
>>>
>>> Agreed -- it's only the text file reading I'm proposing to deprecate.
It was kind of weird to cram it in there in the first place.
>>>
>>> Oh, fromfile() has the same issues.
>>>
>>> Chris
>>>
>>>
>>>> Or is there a good alternative to `np.fromstring(<bytes>,
dtype=...)`?  -- Marten
>>>>
>>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker <chris.barker at noaa.gov>
wrote:
>>>>>
>>>>> There was just a question about a bug/issue with scipy.fromstring
(which is numpy.fromstring) when used to read integers from a text file.
>>>>>
>>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html
>>>>>
>>>>> fromstring() is buggy and inflexible for reading text files -- and
it is a very, very ugly mess of code. I dug into it a while back, and gave
up -- just too much of a mess!
>>>>>
>>>>> So we really should completely re-implement it, or deprecate it. I
doubt anyone is going to do a big refactor, so that means deprecating it.
>>>>>
>>>>> Also -- if we do want a fast function for reading numbers from text
files (which would be nice, actually), it really should get a new name anyway.
>>>>>
>>>>> (and the hopefully coming new dtype system would make it easier to
write cleanly)
>>>>>
>>>>> I'm not sure what deprecating something means, though -- have it
raise a deprecation warning in the next version?
>>>>>
>>
>> There was discussion at SciPy 2015 of separating out the text reading
abilities of Pandas so that numpy could include it. We should contact Jeff
Reback and see about moving that forward.
>
>
> IIRC Thomas Caswell was interested in doing this :)

When he was in Berkeley a few weeks ago he assured me that every night
since SciPy he has dutifully been feeling guilty about not having done it
yet. I think this week his paltry excuse is that he's "on his honeymoon" or
something.

...which is to say that if someone has some spare cycles to take this over
then I think that might be a nice wedding present for him :-).

(The basic idea is to take the text reading backend behind pandas.read_csv
and extract it into a standalone package that pandas could depend on, and
that could also be used by other packages like numpy (among others -- I
think Dato's SFrame package has a fork of this code as well?))

-n

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


I can certainly provide guidance on how/what to extract but don't have
spare cycles myself for this :(
