[SciPy-dev] reading ascii files into arrays
eric jones
ejones17 at austin.rr.com
Thu Oct 25 09:06:22 EDT 2001
> Hi Eric !
>
> I have decided to use python & scipy for data evaluation/presentation for
> the future.
Excellent. Glad to hear it.
> One thing that I did not find in scipy (maybe I did not look
> carefully enough) are commands for simple reading-in of ascii data from
> data files with simple contents like:
>
> 1 2
> 2 4
> 3 9
> 4 7
> 5 9
> 6 1
>
> Nothing more than 2 simple blank-separated vectors in two columns. I have
> tried it in pure python with the readline command and splitting the lines
> into the blank-separated elements, but I think it is not very elegant since
> it takes too many commands to do so.
> Reading in of data from an external file is such a very fundamental thing
> in science that there should be a simple solution (imagine handling big
> data sets).
>
> Do you see a possibility to make it easier ?
I agree this is important. Travis Oliphant's scipy.io package has a module
called array_import that I think will do what you want. It was adapted from
some of Konrad Hinsen's work.
file: test.dat
1 2
2 4
>>> from scipy import *
>>> import scipy.io.array_import
>>> help(scipy.io.array_import.read_array)
read_array(fileobject, separator=None, columns=None, comment='#',
           lines=None, atype='d', linesep='\n', rowsize=10000, missing=0)
Return an array containing the data from file |fileobject|.
>>> q = scipy.io.array_import.read_array('c:\\test.dat')
>>> q
array([[ 1., 2.],
       [ 2., 4.]])
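For comparison, the pure-Python approach described in the question can be wrapped in one small function. This is only a minimal sketch and not what read_array does internally: the name read_columns and its comment parameter are made up for illustration, and it returns plain nested lists rather than an array.

```python
def read_columns(path, comment="#"):
    """Read whitespace-separated numbers from a text file into a list of rows.

    Hypothetical helper for illustration; not part of scipy.
    """
    rows = []
    with open(path) as handle:
        for line in handle:
            # Drop anything after the comment character, then surrounding blanks.
            line = line.split(comment, 1)[0].strip()
            if not line:
                continue  # skip blank and comment-only lines
            # split() with no argument splits on any run of whitespace.
            rows.append([float(field) for field in line.split()])
    return rows


def as_vectors(rows):
    """Turn a list of rows into one list per column."""
    return [list(column) for column in zip(*rows)]
```

With the two-line test.dat above, read_columns('test.dat') would give [[1.0, 2.0], [2.0, 4.0]], and as_vectors on that result gives the two column vectors [1.0, 2.0] and [2.0, 4.0].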
While this is available, I think it is buried way too deep in scipy. We need
to pull the functionality up a level or two. As you say, this is a very
common need and it should be made easily accessible. I also think it should
be augmented to read from a string instead of just from a file -- although I
guess this would need another name.
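A rough sketch of what the string-reading variant might look like, in plain Python. The name read_array_from_string is hypothetical (nothing by that name exists in scipy as far as this message goes), the separator and comment arguments only loosely mirror the read_array signature shown above, and the result is a nested list rather than an array.

```python
def read_array_from_string(text, separator=None, comment="#"):
    """Parse numeric rows from a string instead of a file.

    Hypothetical helper for illustration; not an existing scipy function.
    separator=None splits on any run of whitespace, as str.split does.
    """
    rows = []
    for line in text.splitlines():
        # Remove trailing comments and surrounding whitespace.
        line = line.split(comment, 1)[0].strip()
        if not line:
            continue  # ignore blank and comment-only lines
        # float() tolerates blanks around each field, e.g. " 2".
        rows.append([float(field) for field in line.split(separator)])
    return rows
```

Called as read_array_from_string("1 2\n2 4\n") it would return [[1.0, 2.0], [2.0, 4.0]]; passing separator="," handles character-delimited text such as "1, 2\n2, 4".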
The code for read_array is at the bottom of this file:
http://scipy.net/cgi-bin/viewcvs.cgi/scipy/io/array_import.py?rev=1.5&content-type=text/vnd.viewcvs-markup
Also:
There has been a short discussion about the merits of MS Excel (which counts
me as a user) on the scipy-dev list, and this is one of the things it does
extremely well. It has an import capability that lets you split data in a
fixed format or character delimited way. Whenever I need a quick and dirty
plot of data in a file, I always reach for Excel to do this. It'd be nice
to have something similar to Excel's import wizard available to those who
like that sort of thing. For now, read_array will work quite nicely.
see ya,
eric