Question about reading a big binary file and write it into several text (ascii) files
Bengt Richter
bokr at oz.net
Mon Jan 24 22:56:38 EST 2005
On 24 Jan 2005 12:44:32 -0800, "Albert Tu" <sjtu.0992 at gmail.com> wrote:
>Hi,
>
>I am learning and pretty new to Python and I hope you guys can give me
>a quick start.
>
>I have a binary file of about 1 GB from a flat panel x-ray detector; I
>know at the beginning there is a 128-byte header and the rest of the
>file is integers in 2-byte format.
It looks like 16-bit pixels in the 1024*768 images, I assume
>
>What I want to do is to save the binary data into several smaller files
>in integer format and each smaller file has the size of 2*1024*768
>bytes.
You could do that, but why duplicate so much data that you may never look at?
E.g., why not a class that provides a view of your big file in terms of an image index
and returns an efficient in-memory array? E.g. (untested):
from array import array
def getimage(n, f, offset=128):
    # skip the header and the n images before the one we want
    f.seek(offset + n*2*1024*768)
    # 'H' is for unsigned 2-byte integers (check endianness for swap need!)
    return array('H', f.read(2*1024*768))
Then usage would be
imfile = open('big_file.bin', 'rb')
imarray = getimage(23, imfile)
And you could get the pixel at x,y by
pixel = imarray[x + y*1024] # or maybe x*768+y etc., depending on the storage order
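On the endianness caveat: a minimal sketch of how the swap could be handled, assuming the detector wrote little-endian data (that is only an assumption; check the detector documentation). The names load_pixels and file_byteorder are made up for illustration.

```python
import sys
from array import array

def load_pixels(raw, file_byteorder='little'):
    """Convert raw bytes to an array of unsigned 16-bit pixels.

    file_byteorder is an assumption about how the detector wrote the data.
    """
    pixels = array('H')
    pixels.frombytes(raw)            # interpreted in the machine's native byte order
    if file_byteorder != sys.byteorder:
        pixels.byteswap()            # swap the two bytes of every pixel in place
    return pixels

# Two pixels with values 1 and 258 (0x0102), stored little-endian
raw = b'\x01\x00\x02\x01'
print(list(load_pixels(raw, 'little')))
```

The swap only happens when the file's byte order differs from the machine's, so the same call should work unchanged on either kind of host.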
Or you could make getimage a method of a class that you initialize with
the file and which could maintain an LRU cache of images
with a particular disk directory as backup, etc., and would provide
images wrapped with nice methods to support whatever you are doing with the images.
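A rough sketch of what such a class might look like, assuming 1024*768 images of 2-byte pixels in native byte order and row-major storage. The class name ImageFile, its parameters, and the pixel method are hypothetical, and the disk-directory backup is left out.

```python
from array import array
from collections import OrderedDict

class ImageFile:
    """View of a big detector file as a sequence of images (hypothetical class)."""

    def __init__(self, path, width=1024, height=768, offset=128, cache_size=8):
        self.f = open(path, 'rb')
        self.width = width
        self.height = height
        self.offset = offset                  # size of the file header in bytes
        self.nbytes = 2 * width * height      # 2 bytes per pixel
        self.cache_size = cache_size
        self.cache = OrderedDict()            # image index -> array, oldest first

    def getimage(self, n):
        if n in self.cache:
            self.cache.move_to_end(n)         # mark as most recently used
            return self.cache[n]
        self.f.seek(self.offset + n * self.nbytes)
        data = self.f.read(self.nbytes)
        if len(data) != self.nbytes:
            raise IndexError('no complete image %d in file' % n)
        im = array('H', data)                 # native byte order assumed
        self.cache[n] = im
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)    # evict the least recently used image
        return im

    def pixel(self, n, x, y):
        # assumes row-major storage, i.e. rows of `width` pixels
        return self.getimage(n)[x + y * self.width]

    def close(self):
        self.f.close()
```

Usage would then be something like imf = ImageFile('big_file.bin') followed by imf.pixel(23, x, y), with repeated access to the same few images served from memory instead of disk.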
>
>I know I can do something like
>>>>f=open("xray.seq", 'rb')
>>>>header=f.read(128)
>>>>file1=f.read(2*1024*768)
>>>>file2=f.read(2*1024*768)
>>>>......
>>>>f.close()
>
>But I don't know how to save the files in integer format (converting from
>binary to ascii files) and how to do this in an elegant and snappy way.
Best is probably to leave the original format alone. E.g. (untested and needs try/except),
this should split the big file into individual image files named file0.ximg .. fileN.ximg:
f = open('xray.seq', 'rb')
header = f.read(128)
nfile = 0
while True:
    im = f.read(2*1024*768)
    if not im:
        break  # clean end of file
    if len(im) != 2*1024*768:
        print('broken tail of %s bytes' % len(im))
        break
    fw = open('file%s.ximg' % nfile, 'wb')
    fw.write(im)
    fw.close()
    nfile += 1
f.close()
then you could use getimage above with offset passed as 0 and image number 0, e.g.,
im23 = getimage(0, open('file23.ximg','rb'), 0) # img 0, offset 0
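As for the missing error handling, one possible shape for it is sketched below. The function name split_images and its parameters are hypothetical, and raising on a truncated last image (rather than just reporting it) is a design choice of this sketch, not something the format requires.

```python
def split_images(path, header_size=128, image_bytes=2 * 1024 * 768,
                 name_pattern='file%s.ximg'):
    """Split path into one file per image; return the number of images written."""
    nfile = 0
    with open(path, 'rb') as f:
        f.seek(header_size)                    # skip the header
        while True:
            im = f.read(image_bytes)
            if not im:
                break                          # clean end of file
            if len(im) != image_bytes:
                raise ValueError('broken tail of %d bytes' % len(im))
            with open(name_pattern % nfile, 'wb') as fw:
                fw.write(im)
            nfile += 1
    return nfile
```

The with blocks close both files even when something goes wrong, so a caller only needs a try/except around split_images('xray.seq') for OSError (missing file, full disk) and ValueError (truncated data).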
But then you might wonder about all those separate files, unless you want to
put them on a series of CDs where they wouldn't all fit on one. Whatever ;-)
You will probably lose in both speed and space if you try to make some kind
of ascii disk files. You aren't thinking XML are you??!! For this, definitely ick ;-)
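If text output really is required, here is a minimal sketch of how it could be written, along with a size comparison on a tiny made-up image. The function name write_ascii and the one-row-per-line layout are assumptions for illustration, not a format the detector software expects.

```python
import io
from array import array

def write_ascii(pixels, out, width=1024):
    """Write pixel values as decimal text, one image row per line."""
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        out.write(' '.join(map(str, row)) + '\n')

# A fake 4x2 image of 16-bit values, just to compare sizes
pixels = array('H', [0, 1000, 30000, 65535, 12, 345, 6789, 40000])
buf = io.StringIO()                  # stands in for a file opened with mode 'w'
write_ascii(pixels, buf, width=4)
text = buf.getvalue()
print(len(pixels) * pixels.itemsize, 'bytes as binary')
print(len(text), 'characters as text')
```

Each 2-byte pixel takes up to five digits plus a separator as text, so for typical detector values the text form comes out considerably larger than the binary one, and it has to be parsed again on every read.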
>
What you want to do will depend on the big picture, which is not apparent yet ;-)
>
>Please reply when you guys get a chance.
>Thanks,
Sorry to give nothing but untested suggestions, but I have to go, and I
will be mostly off line for a while.
Regards,
Bengt Richter