[SciPy-dev] [SciPy-user] efficiently importing ascii data
Darren Dale
dd55 at cornell.edu
Mon Nov 14 06:36:31 EST 2005
On Sunday 13 November 2005 11:58 pm, you wrote:
> Darren Dale wrote:
> >On Sunday 13 November 2005 10:46 pm, you wrote:
> >>Darren Dale wrote:
> >>>I am considering using scipy's fromfile function, which gives a big
> >>> speed boost over io.read_array, but I don't understand what this
> >>> docstring is trying to tell me:
> >>>
> >>> WARNING: This function should be used sparingly, as it is not
> >>> a robust method of persistence. But it can be useful to
> >>> read in simply-formatted or binary data quickly.
> >>
> >>It's simply trying to advertise that fromfile and tofile are very raw
> >>functions. They should work fine as far as they go, but there may be
> >>easier solutions. I don't expect the capability of these to increase.
> >>But, for example, something like a TableIO could take advantage of them.
> >
> >I was wondering if the fromstring function could be expanded to include
> > ascii strings.
>
> Hmm. Interesting concept. Do you mean in the same way that fromfile
> works, a single type and a simple separator? That would be consistent
> wouldn't it...
Yes, a single type and a single separator is what I had in mind.
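
For illustration only, here is a minimal pure-Python sketch of the "single type, single separator" behaviour being discussed. It is not scipy's fromstring or fromfile; the name parse_ascii and its parameters are hypothetical, and the stdlib array module stands in for a scipy array.

```python
from array import array


def parse_ascii(text, typecode="d", sep=None):
    """Parse `text` into a flat array holding a single numeric type.

    `typecode` follows the stdlib `array` module ('d' = double,
    'l' = signed long). `sep` is the one separator to split on;
    None means any run of whitespace, including newlines.
    """
    # Assumption: only float and integer typecodes are handled here.
    convert = float if typecode in ("f", "d") else int
    fields = text.split(sep)
    # Skip empty fields, e.g. the one left behind by a trailing separator.
    return array(typecode, (convert(f) for f in fields if f.strip()))


values = parse_ascii("1.5 2.5\n3.5 4.5")
ints = parse_ascii("1,2,3,", typecode="l", sep=",")
```

Because every field goes through the same converter, the result is always a flat, homogeneous array; any row and column structure in the file would have to be restored afterwards by reshaping.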
> The other thing that needs to be done is that string (and unicode)
> arrays need to convert to the numeric types, easily. This would let you
> read in a string array and convert to numeric types quite cleanly.
For the longest time, I thought this ability must exist, and that I was
just overlooking it somehow.
> Right now, this doesn't work simply because the wrong generic functions
> are getting called in the conversion routines. This can and should be
> changed, however. The required code to change is in arraytypes.inc.src.
>
> I could see using PyInt_FromString, PyLong_FromString,
> PyFloat_FromString, and calling the complex python function for
> constructing complex numbers from a string. Code would need to be
> written to fully support Long double and complex long double conversion,
> but initially they could just punt and use the double conversions.
>
> Alternatively, sscanf could be called when available for the type, and
> the other approaches used when it isn't.
>
> Anybody want a nice simple project to get themselves up to speed with
> the new code base :-)
I'll look into it (after hours, just started a new job). I haven't worked much
with C or wrapping C, and I need to learn sometime.
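
As a rough picture of the conversion scheme proposed in the quoted text, the sketch below does the per-type dispatch in Python rather than C. The type names, the CONVERTERS and FALLBACKS tables, and convert_strings are all hypothetical and are not taken from arraytypes.inc.src; int, float, and complex are used as approximate Python-level counterparts of the C calls mentioned above.

```python
# Hypothetical type names mapped to the Python-level string converters.
CONVERTERS = {
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "complex": complex,
}

# Types without a dedicated converter "punt" to a lower-precision one,
# as suggested for long double and complex long double.
FALLBACKS = {
    "longdouble": "double",
    "clongdouble": "complex",
}


def convert_strings(strings, type_name):
    """Convert a sequence of strings to a list of one numeric type."""
    name = FALLBACKS.get(type_name, type_name)
    try:
        convert = CONVERTERS[name]
    except KeyError:
        raise TypeError("no string conversion for type %r" % type_name)
    # Strip surrounding whitespace so padded fields parse cleanly.
    return [convert(s.strip()) for s in strings]


ints = convert_strings(["1", " 2 "], "int")
doubles = convert_strings(["1.5", "2e3"], "longdouble")
```

The fallback table keeps the "punt to double" decision in one place, so a full-precision converter could later replace an entry without touching the callers.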
Darren