[SciPy-user] Re: io.read_array slow

Thu Sep 29 13:46:17 EDT 2005

Arnd Baecker wrote:
> On Wed, 28 Sep 2005, Christian Kristukat wrote:
> 
> 
>>Jordan H. Mantha wrote:
>>
>>>Christian Kristukat wrote:
>>>
>>>
>>>>Hi,
>>>>I noticed that io.read_array is really slow compared to a Python
>>>>while/readline/split loop (about 5 times slower). I seem to remember
>>>>that at some point it was written in C, but when I looked at the source of
>>>>scipy 0.3.2 it seems to be pure Python. Is there an evident reason why it
>>>>must be that slow, or are there maybe plans to rewrite it in C?
>>>>Regards,
>>>>Christian
>>>
>>>I have noticed this too. I wrote a fitting script that reads in a
>>>3-column data file that has 502,000 rows. I first started with io.read_array
>>>and it took > 10 min. to get the array in. I was reading "Python Scripting
>>>for Computational Science" by Hans Petter Langtangen and used the
>>>following:
>>>
>>>        data = array([float(x) for x in infile.read().split()], Float)
>>>        data.shape = (len(data)//3, 3)
>>>
>>
>>Nice short solution indeed. I think read_array could be cut down to something
>>less versatile that still handles different separators but works at C speed.
> 
> 
> You could also have a look at TableIO,
>    http://php.iupui.edu/~mmiller3/python/

That's fast! So we really should have a C-coded IO module. TableIO is GPLed,
so it can't be included directly, according to what I've learned here
over the last few weeks. Maybe I'll write a replacement some day.
Regards, Christian
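
For reference, here is a minimal sketch of the read/split approach quoted above, written in present-day Python with only the standard library. The names read_table, ncols and sep are hypothetical and are not part of SciPy or TableIO. It assumes every field is a plain number and that each row has the same number of columns.

```python
import io


def read_table(infile, ncols, sep=None):
    """Read a delimited numeric table into a list of rows of floats.

    Hypothetical helper for illustration only. It reads the whole file at
    once, so it trades memory for speed, like the one-liner in the thread.
    """
    text = infile.read()
    if sep is not None:
        # Turn the custom separator into whitespace so that one split()
        # call handles both field and line boundaries. Empty fields are
        # therefore dropped rather than reported.
        text = text.replace(sep, " ")
    values = [float(token) for token in text.split()]
    if len(values) % ncols != 0:
        raise ValueError("number of values is not a multiple of ncols")
    # Slice the flat list into rows of ncols values each.
    return [values[i:i + ncols] for i in range(0, len(values), ncols)]


# Usage with in-memory files standing in for real data files.
rows = read_table(io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n"), 3)
csv_rows = read_table(io.StringIO("1,2,3\n4,5,6\n"), 3, sep=",")
```

Doing the tokenizing with a single split() call keeps the per-line Python overhead low, which is probably where most of the difference to a line-by-line parser comes from.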