dealing with binary files

Peter Otten __peter__ at web.de
Mon Jan 7 09:33:50 EST 2008


Gerardo Herzig wrote:

> Hi all. I'm trying to read binary data from a postgres WAL archive.
> If I do
> xfile = open('filename', 'rb').xreadlines()
> line = xfile.next()
> 
> I see this sort of thing:
> ']\xd0\x03\x00\x01\x00\x00\x00\r\x00\x00\x00\x00\x00\x00JM//DI+,D\x00\x00\x00\x01$\x00\x00\x00\x7f\x06\x00\x00y\r\t\x00\x02\x0f\t\x00\x00\x00\x10\x00)\x00\x01\x00\x12\x08 
> \x00^\xc2\x0c\x00\x08\x00\x00\x003001({\xe8\x10\r\x00\x00\x00\xe4\xff\xffI\x10?l\x01@\x00\x00\x00$\x00\x00\x00\x00\n'

file.readline() or similar functions are certainly the wrong approach if
you want to take a look into a binary file. What readline() interprets as
a newline character could actually be part of an integer:

>>> import struct
>>> struct.pack("I", 10)
'\n\x00\x00\x00'
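
To make the difference concrete, here is a minimal sketch (Python 3, with made-up sample values and an in-memory stream standing in for a real file) that reads the same bytes once line by line and once in fixed-size chunks. The helper name read_uint32s and the little-endian 32-bit layout are assumptions chosen for illustration only.

```python
import io
import struct

# Three unsigned 32-bit integers, little-endian. The value 10 packs to
# b"\n\x00\x00\x00", so the data contains a byte that looks like a newline.
values = (1, 10, 300)
data = struct.pack("<3I", *values)

# Line-oriented reading splits at the 0x0a byte, in the middle of a record.
lines = io.BytesIO(data).readlines()


def read_uint32s(stream):
    """Read consecutive 4-byte records until the stream is exhausted."""
    record = struct.Struct("<I")
    out = []
    while True:
        chunk = stream.read(record.size)
        if len(chunk) < record.size:
            break  # end of data (or a truncated trailing record)
        out.append(record.unpack(chunk)[0])
    return out


# Fixed-size reads keep every record intact and recover the original values.
decoded = read_uint32s(io.BytesIO(data))
```

The "lines" come out with lengths that have nothing to do with the record size, while the fixed-size reader returns the integers that were written.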

> This file is supposed to have some information about database activity, but
> at this point I can't do more than this, because I can't figure out what
> to do in order to get some 'readable' text.

Chances are high that the file does not store the information you want in
human-readable form.

> I'm guessing it is some C code. I'm reading about the struct module to see
> if it helps, but I'm not into C programming, and I'm lost at the start of my
> problem.
> 
> Can someone give me some advice? Thanks!

Your first and foremost hope is that postgres provides tools to extract
the information you want. You could then call these tools from Python
using the subprocess module.
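
Calling such a tool could look roughly like the sketch below (Python 3.7 or later). The helper name run_tool is made up, and the command being run is only a stand-in (the Python interpreter printing one line) so that the example runs anywhere; the name and arguments of an actual WAL tool would have to come from the postgres documentation.

```python
import subprocess
import sys


def run_tool(argv, timeout=60):
    """Run an external command and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        argv,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        timeout=timeout,
        check=True,
    )
    return result.stdout


# Stand-in command: replace with the real tool and its arguments.
output = run_tool([sys.executable, "-c", "print('record 1')"])
```

Passing the command as a list avoids shell quoting problems with file names.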

If there are no such tools, look for a library in postgres that can read
WAL files and use it via ctypes.
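
The ctypes mechanics are sketched below on a POSIX system. Since I don't know of a specific WAL library to name, the example loads the C standard library as a stand-in and calls strlen(); a real library would be loaded the same way under its own name, with its function signatures declared from its header files.

```python
import ctypes
import ctypes.util

# Stand-in: the C standard library (POSIX assumption). On systems where
# find_library() returns None, CDLL(None) still exposes the symbols that are
# already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"binary")
```

Declaring argtypes and restype up front lets ctypes check and convert the arguments instead of guessing.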

If all else fails, try to get hold of a detailed description of the file
format. Here "detailed" includes the meaning of each and every bit. If
you are out of luck, that description may come in the form of a few structs
in C source code...
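
Translating a C struct into a struct format string might look like the sketch below. The struct shown in the comment is invented for illustration and is NOT the WAL layout; the field names, the little-endian byte order and the absence of padding are all assumptions that would have to be checked against the real source.

```python
import struct
from collections import namedtuple

# Hypothetical C declaration (not taken from postgres):
#   struct record_header {
#       uint32_t magic;
#       uint16_t version;
#       uint16_t flags;
#       uint64_t offset;
#   };
RecordHeader = namedtuple("RecordHeader", "magic version flags offset")

# "<" selects little-endian with standard sizes and no padding:
# I = uint32, H = uint16, Q = uint64, 16 bytes in total.
HEADER = struct.Struct("<IHHQ")


def parse_header(buf):
    """Decode one header from the start of a bytes-like object."""
    return RecordHeader._make(HEADER.unpack_from(buf, 0))


# Round trip with made-up values to show the mapping.
sample = HEADER.pack(0xCAFE, 3, 1, 4096)
header = parse_header(sample)
```

With real data you would read HEADER.size bytes from the file opened in 'rb' mode and pass them to parse_header().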

Peter


