Dealing with binary data...

Thomas A. Bryan tbryan at python.net
Sat Mar 4 12:19:09 EST 2000


I'm trying to work with a data file format defined by Fortran programmers.
I'd like to write some Python to read and write the data.  I like 
Python's struct module because I can simply specify '<' at the beginning 
of the format string to get little-endian byte order and standard sizes, 
which guarantees a platform-independent reader/writer for this format.

I'm hitting one problem.  The format contains a fixed-size series of 
4-byte (little-endian) floats.  When there isn't enough data to fill 
every slot, each unused float is filled with the byte pattern (in hex)
ff7f ff7f
The file format definition explains it as the (little-endian) integer
32767 in each of the float's two 16-bit halves.

Here's the problem:

Python 1.5.2 (#1, Apr 18 1999, 16:03:16)  [GCC pgcc-2.91.60 19981201 
(egcs-1.1.1  on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import struct
>>> struct.pack('<hh',32767,32767)
'\377\177\377\177'
>>> struct.unpack('<f','\377\177\377\177')
(6.79235465281e+38,)
>>> floatNum = struct.unpack('<f','\377\177\377\177')[0]
>>> struct.pack('<f',floatNum)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
OverflowError: float too large to pack with f format

It seems odd to me that struct will unpack a value that it can't 
repack.  I haven't looked at the structmodule code yet to see 
what's going on.  
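
A possible explanation, not verified against the structmodule source: in
IEEE 754 single precision, the padding pattern has an all-ones exponent
field and a nonzero fraction, which makes it a NaN rather than an ordinary
number.  If the unpacking code treats that exponent as an ordinary one, it
arrives at the value shown above, which is larger than any finite 4-byte
float, so packing it back overflows.  The sketch below assumes a modern
Python 3 interpreter (not the 1.5.2 session above); float32_fields and the
constant names are made up for illustration.

```python
import struct

RAW = b'\xff\x7f\xff\x7f'  # the padding pattern from the file format

def float32_fields(raw):
    """Split 4 little-endian bytes into IEEE 754 sign, exponent, fraction."""
    (bits,) = struct.unpack('<I', raw)
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(RAW)

# Decode as if 255 were an ordinary biased exponent, ignoring the fact
# that IEEE 754 reserves it for infinities and NaNs.
naive_value = (1 + fraction / 2.0**23) * 2.0**(exponent - 127)

# Largest finite single-precision value, for comparison.
FLT_MAX = (2 - 2.0**-23) * 2.0**127

print(exponent, hex(fraction))
print(naive_value, naive_value > FLT_MAX)
```

The naive decoding lands near 6.79e+38, about twice the largest finite
single-precision value, which is consistent with the OverflowError on the
way back out.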

If this behavior is expected, then I suppose that I'll have to 
unpack each float twice: once into a pair of shorts (to check 
for the "no data" values) and then into a float (if the data is 
present), storing None when the data is missing.  Then, when I 
output the data, I'll have to check each value.  If it's None, 
pack the two shorts.  If it's not None, pack the float.
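
A minimal sketch of that workaround, assuming Python 3 bytes semantics
rather than the 1.5.2 strings shown above.  Instead of unpacking each slot
into two shorts, it compares the raw 4 bytes against the packed sentinel,
which amounts to the same test.  The names NO_DATA, unpack_floats, and
pack_floats are hypothetical, chosen for illustration.

```python
import struct

# The "no data" pattern: two little-endian shorts of 32767 each.
NO_DATA = struct.pack('<hh', 32767, 32767)

def unpack_floats(raw):
    """Decode consecutive little-endian 4-byte floats; padding becomes None."""
    if len(raw) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    values = []
    for offset in range(0, len(raw), 4):
        chunk = raw[offset:offset + 4]
        if chunk == NO_DATA:
            values.append(None)          # slot holds no data
        else:
            values.append(struct.unpack('<f', chunk)[0])
    return values

def pack_floats(values):
    """Encode floats (or None for missing data) back into bytes."""
    parts = []
    for value in values:
        if value is None:
            parts.append(NO_DATA)        # write the sentinel, not a float
        else:
            parts.append(struct.pack('<f', value))
    return b''.join(parts)
```

Because the sentinel never goes through the 'f' format in either
direction, the overflow on repacking is avoided.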

starting-to-consider-a-C-extension-instead-ly yours
---Tom


