[SciPy-dev] reading ascii files into arrays

eric jones ejones17 at austin.rr.com
Thu Oct 25 09:06:22 EDT 2001


> Hi Eric !
>
> I have decided to use python & scipy for data evaluation/presentation for
> the future.

Excellent.  Glad to hear it.

> One thing that I did not find in scipy (maybe I did not look
> carefully enough) are commands for simple reading-in of ascii data from
> data files with simple contents like:
>
> 1 2
> 2 4
> 3 9
> 4 7
> 5 9
> 6 1
>
> Nothing more than 2 simple blank-separated vectors in two columns. I have
> tried it in pure python with the readline command and splitting the lines
> into the blank-separated elements, but I think it is not very elegant since
> it takes too many commands to do so.
> Reading in of data from an external file is such a very fundamental thing
> in science that there should be a simple solution (imagine handling big
> data sets).
>
> Do you see a possibility to make it easier ?

I agree this is important.  Travis Oliphant's scipy.io package has a module
called array_import that I think will do what you want.  It was adapted from
some of Konrad Hinsen's work.

file: test.dat
1 2
2 4

>>> from scipy import *
>>> import scipy.io.array_import
>>> help(scipy.io.array_import.read_array)

        read_array(fileobject, separator=None, columns=None, comment='#',
                   lines=None, atype='d', linesep='\n', rowsize=10000,
                   missing=0)

        Return an array containing the data from file |fileobject|.

>>> q = scipy.io.array_import.read_array('c:\\test.dat')
>>> q
array([[ 1.,  2.],
       [ 2.,  4.]])

While this is available, I think it is buried way too deep in scipy.  We need
to pull the functionality up a level or two.  As you say, this is a very
common need and it should be made easily accessible.  I also think it should
be augmented to read from a string instead of just from a file -- although I
guess this would need another name.
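
As a rough sketch of the read-from-a-string idea, something like the
following would cover the simple case.  This is plain Python and not part of
scipy: the name parse_array_text is made up for illustration, it returns a
list of lists of floats rather than an array, and it only mimics the
separator and comment arguments of read_array.

```python
def parse_array_text(text, separator=None, comment="#"):
    """Parse rows of numbers held in a string into a list of lists of floats.

    Hypothetical helper for illustration only; it is not the scipy API.
    separator=None splits on any run of whitespace, as str.split does.
    """
    rows = []
    for line in text.splitlines():
        # Drop anything after the comment marker, then surrounding whitespace.
        line = line.split(comment, 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        rows.append([float(field) for field in line.split(separator)])
    return rows


data = parse_array_text("1 2\n2 4\n")
print(data)  # [[1.0, 2.0], [2.0, 4.0]]
```

The same contents as test.dat come back as two rows of two floats each.
Passing separator="," would handle comma-delimited text in the same way.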

The code for read_array is at the bottom of this file:

http://scipy.net/cgi-bin/viewcvs.cgi/scipy/io/array_import.py?rev=1.5&content-type=text/vnd.viewcvs-markup

Also:
There has been a short discussion about the merits of MS Excel (which counts
me as a user) on the scipy-dev list, and this is one of the things it does
extremely well.  It has an import capability that lets you split data in a
fixed format or character delimited way.  Whenever I need a quick and dirty
plot of data in a file, I always reach for Excel to do this.  It'd be nice
to have something similar to Excel's import wizard available to those who
like that sort of thing.  For now, read_array will work quite nicely.
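
To make the two import styles concrete (character delimited versus fixed
format), here is a small plain-Python sketch of how a single line could be
split either way.  The function names and the column widths are assumptions
chosen for the example; this does not describe how Excel or read_array do it
internally.

```python
def split_delimited(line, delimiter=","):
    """Split a line on a delimiter character and trim each field."""
    return [field.strip() for field in line.split(delimiter)]


def split_fixed(line, widths):
    """Split a line into consecutive fields of the given character widths."""
    fields = []
    start = 0
    for width in widths:
        # Slice out the next column and trim the padding around it.
        fields.append(line[start:start + width].strip())
        start += width
    return fields


print(split_delimited("3, 9, ok"))            # ['3', '9', 'ok']
print(split_fixed("  3   9ok  ", [3, 4, 4]))  # ['3', '9', 'ok']
```

Both calls return the fields as strings; converting them with float() or
int() would be a separate step once the column types are known.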

see ya,
eric






