How can I Read/Write multiple sequential Binary/Text data files

John Machin sjmachin at lexicon.net
Thu Mar 10 15:48:31 EST 2005


On Thu, 10 Mar 2005 20:06:29 +0200, Christos "TZOTZIOY" Georgiou
<tzot at sil-tec.gr> wrote:

>On 10 Mar 2005 09:41:05 -0800, rumours say that "Albert Tu"
><sjtu.0992 at gmail.com> might have written:
>
>>Dear there,
>>
>>We have an x-ray CT system. The acquisition computer acquires x-ray
>>projections and outputs multiple data files in binary format (2-byte
>>unsigned integer) such as projection0.raw, projection1.raw,
>>projection2.raw ... up to projection500.raw. Each file is
>>2*1024*768-byte big.
>>
>>I would like to read those files and convert them to ASCII files in %5.0f\n
>>format as projection0.data ... projection500.data so that our
>>visualization software can understand the projection images. I was
>>trying to do this conversion using Python. However, I had trouble
>>building the file names from the do-loop index. Does anyone have previous
>>experience with this?
>
>Regular expressions could help, but if you *know* that these are the filenames,
>you can (untested code):
>
>PREFIX= "projection"
>SUFFIX_I= ".raw"
>SUFFIX_O= ".data"
>
>import glob, struct

import array

# Set to True if the files' byte order differs from this machine's.
DIFFERENT_ENDIAN = False

>
>for filename in glob.glob("%s*%s" % (PREFIX, SUFFIX_I)):
>    number= filename[len(PREFIX):-len(SUFFIX_I)]
>    fpi= open(filename, "rb")
>    fpo= open("%s%s%s" % (PREFIX, number, SUFFIX_O), "w")
>    while 1:
>        datum= fpi.read(2)
>        if not datum: break
>        fpo.write("%5d\n" % struct.unpack("H", datum)) # check endianness!!!

If the OP knows that each input file is small enough (1.5 MB each, as
stated), then the agony of file.read(2) can be avoided by reading the
whole file in one go. The agony of calling struct.unpack() on each datum
can be avoided by using the array module. E.g. replace the whole 'while'
loop with this:

     ary = array.array('H', fpi.read())
     if DIFFERENT_ENDIAN:
         ary.byteswap()
     for datum in ary:
         fpo.write("%5d\n" % datum)
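
For anyone who wants to try the array approach in isolation, here is a
minimal, self-contained sketch. The function name raw_to_text and the
variable needs_swap are made up for illustration, and the sketch assumes
that array type code 'H' is 2 bytes wide (true on common platforms) and
that the files were written little-endian, which the OP would need to
confirm.

```python
import array
import sys


def raw_to_text(raw_bytes, byteswap=False):
    """Turn packed 2-byte unsigned integers into '%5d\\n' lines (hypothetical helper)."""
    ary = array.array("H")    # 'H' = unsigned short; assumed to be 2 bytes here
    ary.frombytes(raw_bytes)  # raises ValueError if the length is not a multiple of 2
    if byteswap:
        ary.byteswap()        # swap in place when file and machine byte order differ
    return "".join("%5d\n" % datum for datum in ary)


# Assumption: the acquisition computer wrote little-endian data.
needs_swap = sys.byteorder != "little"
text = raw_to_text(b"\x01\x00\x02\x01", byteswap=needs_swap)
```

Building one string and writing it once is a design choice that suits
1.5 MB inputs; it is not required for correctness.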

Even if the input files were too large to fit in memory, they could
still be processed fast enough by reading a big chunk at a time.
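
A rough sketch of that chunked variant follows. The function name
convert_in_chunks and the CHUNK_ITEMS value are assumptions chosen for
illustration, not tuned figures, and 'H' is again assumed to be 2 bytes
wide.

```python
import array

CHUNK_ITEMS = 64 * 1024  # hypothetical chunk size, counted in 2-byte values


def convert_in_chunks(src_path, dst_path, byteswap=False, chunk_items=CHUNK_ITEMS):
    """Convert a raw file of 2-byte unsigned integers to text, one chunk at a time."""
    with open(src_path, "rb") as fpi, open(dst_path, "w") as fpo:
        while True:
            chunk = fpi.read(2 * chunk_items)  # an even byte count keeps values whole
            if not chunk:
                break                          # end of file
            ary = array.array("H")
            ary.frombytes(chunk)               # ValueError on a trailing odd byte
            if byteswap:
                ary.byteswap()
            fpo.writelines("%5d\n" % datum for datum in ary)
```

Memory use is bounded by the chunk size rather than the file size, and
the with statement closes both files even if a conversion fails.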


>    fpi.close()
>    fpo.close()
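
On the OP's original question of building the file names from a loop
index, rather than globbing for them: a minimal sketch, assuming the
names really are projection0.raw through projection500.raw with no zero
padding. The generator name projection_names is made up for illustration.

```python
def projection_names(count, prefix="projection", suffix_in=".raw", suffix_out=".data"):
    """Yield (input_name, output_name) pairs for indices 0 .. count-1."""
    for index in range(count):
        yield ("%s%d%s" % (prefix, index, suffix_in),
               "%s%d%s" % (prefix, index, suffix_out))


# projection0 ... projection500 inclusive is 501 files.
pairs = list(projection_names(501))
```

Unlike glob, this visits the files in numeric order and fails loudly on
open() if one of them is missing.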



