text file to array and back

Kevin Russell prairiesasquatch at crosswinds.net
Sun Mar 18 23:31:06 EST 2001


Andreas Kremer wrote:

> Hi,
>
> i am trying to write a python class to make access to a collection of
> scientific analysis programs easier for me. In this case I need to read a
> file which consists of a time series, collect it into an array (numpy
> preferred) and after some operations write it back to disk as a text file.
> My solutions are attached. These are my questions:
> 1. as I am new to python i suspect my joining and splitting not to be very
> efficient, especially the joining.

The reason you're getting those square brackets and commas is that
you're converting the entire list for a line to a string *first* and
only then trying to join it. By then it's too late: string.join() is
handed a single string and just walks over its characters, brackets
and commas included.

You could use:
    dataline = string.join(map(str, line))

or:
    for line in array:
        for element in line:
            file.write("%s " % element)
        file.write("\n")

(though that's probably less efficient I/O-wise.)
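
For anyone reading this with a current Python 3, where the string
module functions have become str methods, here is a minimal sketch of
the same point. The sample row is made up for illustration; it is not
taken from Andreas's data.

```python
# Hypothetical sample data standing in for one row of the array.
line = [1.0, 2.5, 3.0]

# Joining *after* str() just walks over the characters of one string,
# so the brackets and commas are still there.
too_late = "".join(str(line))

# Converting each element first and then joining gives clean output.
dataline = " ".join(map(str, line))

print(too_late)
print(dataline)
```

The second form needs no replace() calls afterwards, because the list
punctuation never gets into the string in the first place.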

or even, if you're willing to mess around with the definition of
sys.stdout:

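    import sys

    realstdout = sys.stdout
    sys.stdout = open(...)
    try:
        for line in array:
            for element in line:
                print element,
            print
    finally:
        # close the output file and restore stdout even after an error
        sys.stdout.close()
        sys.stdout = realstdout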
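
On Python 3 the same trick needs less manual bookkeeping, since
contextlib.redirect_stdout puts sys.stdout back for you. A sketch only:
the io.StringIO buffer is a stand-in for the real output file (the
open() arguments are not given above), and the rows are made up.

```python
import contextlib
import io

rows = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical data

buffer = io.StringIO()            # stand-in for an opened output file
with contextlib.redirect_stdout(buffer):
    for row in rows:
        for element in row:
            # end=" " plays the role of the trailing comma in Python 2,
            # except that it also leaves a space at the end of each line
            print(element, end=" ")
        print()

text = buffer.getvalue()
```

If the trailing space matters, joining the elements before printing
avoids it.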

> 2. Is there a way while reading the file
> to store the data immediately into the array?

If you're asking "Is there a magical procedure that will already do
exactly what I want?", there probably is, but I don't know it.

If all you're asking is "Can I just read one line at a time instead
of reading the whole file into memory first?", then definitely.
In Python 2.1, just use xreadlines() instead of readlines().
In older versions of Python, you can get the same effect using
the fileinput module.
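
In a current Python 3 the file object itself hands out one line at a
time, so no special method is needed. A minimal sketch, assuming
whitespace-separated numbers; read_rows and the temporary file are
hypothetical names used only for this illustration. (Present-day NumPy
also ships numpy.loadtxt, which may well be the ready-made procedure
asked about, but the sketch sticks to the standard library.)

```python
import os
import tempfile


def read_rows(path):
    """Read whitespace-separated numbers one line at a time."""
    rows = []
    with open(path) as f:
        for line in f:                 # one line per iteration
            if line.strip():           # skip blank lines
                rows.append([float(x) for x in line.split()])
    return rows


# Small demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "series.txt")
    with open(path, "w") as f:
        f.write("0.0 1.5\n1.0 2.5\n")
    rows = read_rows(path)
```

The resulting list of lists can then be handed to the array
constructor in one go, as in the original __createarray.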


> 3. I am concerned about a memory leak. what happens to self.array if I read
> the file again? It obviously doesnt append the data to the old, but are the
> old data deleted from the memory?

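Yes.  As soon as nobody is referring to the old object any more,
Python frees it.  Rebinding self.array to the new array drops that
reference, so unless you've stored the old array somewhere else, its
memory is reclaimed.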
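
If you want to convince yourself, a weak reference lets you watch the
old object disappear. This is a sketch for a current CPython; Series
and Reader are hypothetical stand-ins, not Andreas's class, and the
gc.collect() call is only there so the check doesn't rely on reference
counting alone.

```python
import gc
import weakref


class Series(list):
    """Hypothetical stand-in for the data held in self.array."""


class Reader:
    def __init__(self):
        self.array = Series()

    def read(self, values):
        # Rebinding the attribute drops the reference to the old data.
        self.array = Series(values)


reader = Reader()
reader.read([1.0, 2.0])
old = weakref.ref(reader.array)   # does not keep the object alive

reader.read([3.0, 4.0])           # "read the file again"
gc.collect()

old_is_gone = old() is None
```

Here old_is_gone comes out true because nothing else held on to the
first Series; keep another reference around and the old data stays
alive.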


>
> Thanks in advance. Andreas Kremer.
>
>     def __createarray(self):
>         "creates an array of the outputfile"
>         # open file, transfer content into list self.array
>         file = open(self.outputfile, 'r')
>         for line in file.readlines():
>             dataline = map(float, string.split(line))
>             self.array.append(dataline)
>         file.close()
>         # convert list self.array into an array (Numpy)
>         self.array = array(self.array, Float64)
>
>     def __writearray(self):
>         "writes array to outputfile"
>         # open file, format array and write it to disk
>         file = open(self.outputfile, 'w')
>         for line in self.array:
>             dataline = string.join(str(line),'')
>             dataline = dataline+'\n'
>             dataline = string.replace(dataline,'[','')
>             dataline = string.replace(dataline,']','')
>             dataline = string.replace(dataline,',','')
>             file.write(dataline)
>         file.close()



