[SciPy-dev] [SciPy-user] efficiently importing ascii data

Darren Dale dd55 at cornell.edu
Mon Nov 14 06:36:31 EST 2005


On Sunday 13 November 2005 11:58 pm, you wrote:
> Darren Dale wrote:
> >On Sunday 13 November 2005 10:46 pm, you wrote:
> >>Darren Dale wrote:
> >>>I am considering using scipy's fromfile function, which gives a big
> >>> speed boost over io.read_array, but I don't understand what this
> >>> docstring is trying to tell me:
> >>>
> >>>   WARNING: This function should be used sparingly, as it is not
> >>>   a robust method of persistence.  But it can be useful to
> >>>   read in simply-formatted or binary data quickly.
> >>
> >>It's simply trying to advertise that fromfile and tofile are very raw
> >>functions.   They should work fine as far as they go, but there may be
> >>easier solutions.  I don't expect the capability of these to increase.
> >>But, for example, something like a TableIO could take advantage of them.
> >
> >I was wondering if the fromstring function could be expanded to include
> > ascii strings.
>
> Hmm.  Interesting concept.  Do you mean in the same way that fromfile
> works, a single type and a simple separator?   That would be consistent
> wouldn't it...

Yes, a single type and a single separator is what I had in mind.
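
A minimal sketch of that behavior, in plain Python with only the standard
library: one element type, one separator. The name from_ascii and its
parameters are hypothetical, chosen for illustration; this is not scipy's
fromstring or fromfile, just the semantics being proposed.

```python
from array import array


def from_ascii(text, typecode="d", sep=" "):
    """Hypothetical helper: parse text as a single numeric type,
    with fields split on a single separator."""
    # Integer typecodes of the array module use int(); everything else float().
    convert = int if typecode in "bBhHiIlLqQ" else float
    # A whitespace separator splits on any run of whitespace, newlines included.
    fields = text.split() if sep.isspace() else text.split(sep)
    # Skip empty fields, e.g. the one left behind by a trailing separator.
    return array(typecode, (convert(f) for f in fields if f.strip()))


values = from_ascii("1.5, 2.5, 3.5", typecode="d", sep=",")
print(values.tolist())
```

Returning a typed array rather than a list keeps the "single type" constraint
explicit: every field has to convert to the same typecode or the call fails.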

> The other thing that needs to be done is that string (and unicode)
> arrays need to convert to the numeric types, easily.  This would let you
> read in a string array and convert to numeric types quite cleanly.

For the longest time, I thought this ability must exist and that I was
just overlooking it somehow.
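
As a rough illustration of the string-to-numeric conversion being discussed,
here is a pure-Python sketch that converts a sequence of strings element by
element. The helper name convert_strings is hypothetical, and a plain list
stands in for a string array; it only shows the intended result, not how the
array type would implement it.

```python
def convert_strings(strings, kind=float):
    """Hypothetical helper: convert each string to the numeric type kind."""
    # Strip surrounding whitespace so padded fixed-width fields also convert.
    return [kind(s.strip()) for s in strings]


ints = convert_strings(["1", " 2", "3 "], int)
floats = convert_strings(["1.5", "2e3"], float)
cplx = convert_strings(["1+2j", "3j"], complex)
print(ints, floats, cplx)
```

A string that does not parse as the requested type raises ValueError, which
is probably the behavior one would want from an array conversion as well.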

> Right now, this doesn't work simply because the wrong generic functions
> are getting called in the conversion routines.  This can and should be
> changed, however.  The required code to change is in arraytypes.inc.src.
>
> I could see using PyInt_FromString, PyLong_FromString,
> PyFloat_FromString, and calling the complex python function for
> constructing complex numbers from a string.   Code would need to be
> written to fully support Long double and complex long double conversion,
> but initially they could just punt and use the double conversions.
>
> Alternatively, sscanf could be called when available for the type, and
> the other approaches used when it isn't.
>
> Anybody want a nice simple project to get themselves up to speed with
> the new code base :-)

I'll look into it (after hours, just started a new job). I haven't worked much 
with C or wrapping C, and I need to learn sometime.

Darren



