scanf in python...?

chris_barker at my-deja.com
Thu Oct 19 13:51:16 EDT 2000


In article <mailman.971935007.2926.python-list at python.org>,
> Soapbox mode: FWIW, I think there is a significant disparity between
> Python's data input and output facilities. It has good output formatting,
> but nothing comparably concise for input. I often have to write one-off
> programs to process large amounts of tabular output from other programs,
> usually a mixture of floats and ints, with the occasional string for good
> measure. Having to write a pile of tedious code to parse this sort of guff
> is irritating, considering the relative ease with which you can accomplish
> other things in Python. Still, few people seem to complain about this, so I
> guess it's not a common issue.

Well, it's a common problem for me! (and I'm sure a lot of other
people). Common enough that I'm very close to writing a C extension to
deal with the problem.

There have been discussions about this in the past, and I think there
may be a FAQ on it. I believe Guido's response was that the string and
re modules are much more flexible than *scanf, so you should just use
those. Frankly, I think that's not a very "Pythonesque" approach. In
Python, there should be an easy and obvious way to do things that are
very common (fast would be nice as well).
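
For reference, the split-and-convert style that the string-module answer
implies looks roughly like the sketch below. parse_row and its converters
argument are names made up for this illustration, not anything in the
standard library, and the sketch assumes whitespace-delimited fields.

```python
def parse_row(line, converters):
    """Split one whitespace-delimited line and convert each field.

    Hypothetical helper: `converters` holds one callable per expected
    field, e.g. [int, float, str].
    """
    fields = line.split()
    if len(fields) != len(converters):
        raise ValueError(
            "expected %d fields, got %d" % (len(converters), len(fields))
        )
    # Pair each converter with its field and apply it.
    return [convert(field) for convert, field in zip(converters, fields)]


row = parse_row("42 3.14 widget", [int, float, str])
print(row)  # [42, 3.14, 'widget']
```

It works, but you end up rewriting something like it for every one-off
script, which is the complaint.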

If I wrote something now, I would probably do something simple, modeled
after MATLAB's "fscanf".

MATLAB's fscanf (it has an sscanf as well, BTW) is really just a simple
extension of C's fscanf. The extension is that it is "vectorised". What
it does is keep repeating the format specifier until either: 1) the end
of file is reached, or 2) you have filled a matrix of the size that you
specified. Some examples:

V = fscanf(file,'%g')
# to create a vector with all the numbers in file (whitespace delimited).
M = fscanf(file,'%g',[m,n])
# to create an m by n matrix with the next m*n numbers in file.
M = fscanf(file,'%g',[m,inf])
# to create an m by ? matrix with all the numbers in the file.

Note that it starts at the current position in the file, and leaves the
position in place when it is done, so different calls can be combined
easily to read a large variety of text file formats. MATLAB only stores
matrices of C doubles, so this is a little easier there than it would be
in Python, where the values may be ints, floats, or strings.
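
To make the idea concrete, here is a minimal pure-Python sketch of the
"keep going until EOF or until you have enough numbers" behaviour. The
name scan_numbers and its count argument are made up for this
illustration. It handles only whitespace-delimited floats (roughly a
repeated '%g'), reads one character at a time so it stops right after the
last number it consumed, and returns a flat list rather than a matrix.

```python
import io


def scan_numbers(stream, count=None):
    """Read up to `count` whitespace-delimited floats from `stream`.

    Hypothetical helper: reads from the current position and stops at
    end of file or once `count` numbers have been read (count=None means
    read to end of file). The stream is left just past the delimiter
    that ended the last number read.
    """
    values = []
    token = []
    while count is None or len(values) < count:
        ch = stream.read(1)
        if ch and not ch.isspace():
            token.append(ch)  # still inside a number
            continue
        if token:
            # Whitespace or EOF ends the current token.
            values.append(float("".join(token)))
            token = []
        if not ch:  # empty string means end of file
            break
    return values


# Two calls on the same stream: the second picks up where the first stopped.
f = io.StringIO("1 2 3\n4 5 6\n7")
header = scan_numbers(f, 2)
rest = scan_numbers(f)
print(header)  # [1.0, 2.0]
print(rest)    # [3.0, 4.0, 5.0, 6.0, 7.0]
```

Reading a character at a time is slow, which is part of why a C extension
is appealing, but it shows how separate calls can be chained on one file.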

For my purposes, I would probably just have it create Numpy arrays,
since that's usually what I need if I am reading a lot of numbers from a
file.

If anyone is interested in helping me with this project, I would love to
get input, and hopefully help with coding, testing and debugging. Send me
a note if you are interested.

-Chris

cbarker at jps.net




