Reading unformatted big-endian files

John Machin sjmachin at lexicon.net
Fri Aug 11 06:17:07 EDT 2006


Andrea Gavana wrote:

> "err=8" means that, if an error occurs in reading the file,
> it should go to the label "8 continue" and continue reading the file

Silently ignoring errors when reading a file doesn't sound like a good
idea to me at all, especially if different records have different
formats.
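
A minimal sketch of the alternative, in Python 3: complain loudly about a
truncated read instead of skipping it. The helper name read_exact is made
up for illustration, and the in-memory "file" is only a stand-in for a
real one.

```python
import io

def read_exact(f, size):
    """Return exactly `size` bytes, b'' at a clean EOF, else raise."""
    data = f.read(size)
    if data and len(data) != size:
        # A partial record means the file and our idea of its layout disagree.
        raise EOFError("truncated record: wanted %d bytes, got %d"
                       % (size, len(data)))
    return data

# Stand-in for a file holding one 4-byte record plus 2 stray bytes.
f = io.BytesIO(b"ABCDxy")
first = read_exact(f, 4)        # a complete record
try:
    read_exact(f, 4)            # only 2 bytes are left
    truncated = False
except EOFError:
    truncated = True
```

A clean end of file still comes back as an empty result, so a reading
loop can tell "finished" apart from "file is damaged or misdescribed".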

>
> Well, does anyone have some suggestion about which kind of
> material/tutorial on similar things I should read? How can I deal in
> Python with variables that must be 8-chars or 4-chars in order to read
> correctly the file?

(a) read the docs on the struct module
(b) eyeball this rough untested translation:
8<---
def filereader(filename):
    import struct
    f = open(filename, 'rb') # 'rb' is read binary, very similar to C stdio
    fmt = '>8si4s'
    # Assuming unformatted means binary,
    # and integer means integer*4, which is signed.
    # Also assuming that the 3-variable records are fixed-length.
    fmtsz = struct.calcsize(fmt)
    while True:
        buff = f.read(fmtsz)
        if not buff: # EOF
            break
        keyword, number, keytype = struct.unpack(fmt, buff)
        keyword = keyword.rstrip() # remove trailing spaces
        keytype = keytype.rstrip()
        if keyword == 'DIMENS':
            # 'dimens' is neither declared nor initialised in the FORTRAN
            # so I'm just guessing here ...
            buff2 = f.read(4)
            dimens = struct.unpack('>i', buff2)[0] # unpack returns a tuple
            break
        print keyword, number, keytype # or whatever
    # reached end of file (dimens *NOT* defined),
    # or gave up (dimens should have a value)
    f.close() # not absolutely necessary especially when only reading

if __name__ == "__main__":
    import sys
    filereader(sys.argv[1])
8<---
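
To see what a format like '>8si4s' does without needing a data file, here
is a small round trip, written for Python 3 (where struct works on bytes
rather than str). The field values are invented for illustration; they
are not taken from any real file.

```python
import struct

fmt = '>8si4s'   # big-endian: 8-byte string, signed 4-byte int, 4-byte string
record_size = struct.calcsize(fmt)   # 8 + 4 + 4; '>' means no padding

# Build a fake record, padded with spaces the way the keyword field
# is assumed to be in the file.
raw = struct.pack(fmt, b'DIMENS  ', 3, b'INTE')

keyword, number, keytype = struct.unpack(fmt, raw)
keyword = keyword.rstrip()   # bytes in Python 3, so compare with b'DIMENS'
```

The '>' prefix matters twice over: it selects big-endian byte order and
it switches off the native alignment padding that struct would otherwise
be free to insert between fields.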

If this doesn't work, and it's not obvious how to fix it, then when you
ask again it would help to supply a FORTRAN-independent layout of the
file, and/or a dump of a short test file that includes the DIMENS/dimens
caper. You can get such a dump readily with the od command on Unix-like
systems or, failing that, with Python:

#>>>repr(open('thetestfile', 'rb').read(100)) # yes, I said *short*
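
If repr output is too hard on the eyes, a hex dump is easy to roll by
hand. This is only a sketch in Python 3; the function name hexdump and
its line layout are my own choices and do not mimic any particular tool.

```python
def hexdump(data, width=16):
    """Format bytes as lines of: offset, hex bytes, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hexpart = ' '.join('%02x' % b for b in chunk)
        # Non-printable bytes are shown as '.' in the text column.
        text = ''.join(chr(b) if 32 <= b < 127 else '.' for b in chunk)
        # Pad the hex column so the text column lines up on a short last line.
        lines.append('%06x  %-*s  %s' % (offset, width * 3 - 1, hexpart, text))
    return '\n'.join(lines)

# Example use on the first 100 bytes of a file (file name is a placeholder):
# print(hexdump(open('thetestfile', 'rb').read(100)))
```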

HTH,
John



