Reading a binary file

Thu Jun 26 17:52:50 EDT 2003

Sorin Marti <mas at semafor.ch> wrote in message news:<mailman.1056637113.19927.python-list at python.org>...
> Peter Hansen wrote:
> > Sorin Marti wrote:
> > 
> >>I am quite new to python and very new to this list.
> >>
> >>I've got the following problem. I have a binary file which contains
> >>information I should read. I can open the file with
> > 
> > [snip]
> > 
> > It would really be best if you could describe in more detail
> > what you are trying to do with this data.  Bytes are bytes,
> > and things like hex and binary are just different _representations_
> > of bytes, so whether you want binary, hex, decimal, or something
> > else depends entirely on the use to which you will put the info.
> > 
> 
> Hi Peter,
> 
> Ok I'll try to give more details. I have a Siemens SPS. With an SPS you
> can control machines such as pumps or motors or anything else. To
> control them you have to set variables. If you want to see which state these
> variables have you can get a file via ftp where these values are stored.
> This is what I have done. Now I have a file (called cpu1db2.dat) and
> this file has a length of 16 bytes.
> 
> Byte Number/Length   Type       Hex-Value
> ----------------------------------------------------------------
> Byte              1: Boolean:   01    (which is true, 00 would be false)
> Byte              2: Byte:      11    (This data type is called byte)
> Byte              3: Char:      50    (Which should be a "P")
> Byte            4,5: Word       00 00
> Byte            6,7: Integer    22 04
> Byte      8,9,10,11: DoubleWord D2 00 00 BB
> Byte 12,13,14,15,16: Real       BB 42 C8 00 00
> 
> 

As others have already described, the struct module should do the job:
>>> import struct
### This should be your data to read from a file into a string x.
>>> x='\x01\x11P\x00\x00\x22\x04\xd2\x00\x00\xbb\xbb\x42\xc8\x00\x00'
### The format to decode your data, except for the Real.
### The **>** is necessary because your data is big-endian.
>>> decode_fmt='>BBcHHI'
>>> (Boolean,Byte,Char,Word,Integer,DoubleWord)=struct.unpack(decode_fmt,x[:-5])
>>> format="""
...           Boolean   : %02X
...           Byte      : %02X
...           Char      : %s
...           Word      : %04X
...           Integer   : %04X
...           DoubleWord: %08X"""
>>> print format%(Boolean,Byte,Char,Word,Integer,DoubleWord)

          Boolean   : 01
          Byte      : 11
          Char      : P
          Word      : 0000
          Integer   : 2204
          DoubleWord: D20000BB
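
For readers on Python 3, the same decoding can be sketched with a bytes
object instead of a string. This is only a minimal sketch: the sample
bytes are copied from the table in the question, the variable names are
made up for illustration, and the 'H' code assumes the Integer is
unsigned as in the session above (a signed 16-bit value would use 'h').

```python
import struct

# Sample bytes copied from the question; in practice they would come from
# open("cpu1db2.dat", "rb").read().
data = bytes.fromhex("01 11 50 00 00 22 04 D2 00 00 BB BB 42 C8 00 00")

# '>' selects big-endian byte order and disables padding between fields.
# Fields: Boolean, Byte, Char, Word, Integer, DoubleWord (11 bytes in total).
HEADER = struct.Struct(">BBcHHI")

boolean, byte, char, word, integer, dword = HEADER.unpack(data[:HEADER.size])

print("Boolean   : %02X" % boolean)
print("Byte      : %02X" % byte)
print("Char      : %s" % char.decode("ascii"))
print("Word      : %04X" % word)
print("Integer   : %04X" % integer)
print("DoubleWord: %08X" % dword)

# For contrast: the same two bytes read as little-endian give 0x0422.
(swapped,) = struct.unpack("<H", data[5:7])
```

Slicing with HEADER.size avoids hard-coding how many trailing bytes
belong to the Real.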

> So I have written a python class which makes a connection to the 
> ftp-server (on the SPS) and gets the file.
> Then there is a function where you can call a value with a startbyte and 
> an endbyte. You also have to specify the type. That means you can call
> getValue('REAL',12,16) and you should get back 100 because if you have 

I guess your Real is a 4-byte real value and you meant '42 C8 00 00',
which is what the binary representation you give below corresponds to.

> the binary value of 'BB 42 C8 00 00' is 01000010110010000000000000000000 
> , first digit is the Sign (which is + or - ), next 8 digits are the 
> exponent, in this case 10000101 = 133dec. Now you take away 127 from 133 
> then you get six, that's the exponent. The rest 
> (110010000000000000000000) has a hex value of C80000 this is 13107200 
> decimal. Now you have to multiply 13107200 with 2^6 and 2^-23 and you 
> get (tataaaaaa!): 100!
> 
I'm not quite sure I understand the format you're describing above.
I dealt with IEEE and some AMD FPU formats some time ago, but you seem
to be describing a format where the exponent crosses byte boundaries.

A function could look something like the following:
>>> def str2real(s):
... 	# s is a 4-character string holding the raw bytes, high byte first
... 	sign     = ord(s[0])&0x80 and -1 or 1
... 	expo     = ((ord(s[0])&0x7f)<<1) + (ord(s[1])>>7) - 127 
... 	# set the implicit leading mantissa bit (bit 23) before shifting
... 	mantissa = ((ord(s[1])|0x80)<<16) + (ord(s[2])<<8) + ord(s[3])
... 	print 'sign=%d, expo=%d, mantissa=%06X'%(sign,expo,mantissa)
... 	return sign*2**(expo-23)*mantissa
... 
>>> str2real(x[-4:])
sign=1, expo=6, mantissa=C80000
100.0

Though I'm not sure I hit the target, because normally the exponent is
stored in 6, 7 or 8 bits as 2's complement, and then there would be a
- 64, - 128 or - 256 instead of - 127 in the algorithm. A 23-bit
mantissa also seems a bit strange to me. Even so, the 24th bit is **1**
by default, which is why I put **|0x80**.
With an exact description of your Real format the solution would
be trivial (a "Klacks").
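
One way to cross-check: the layout described in the question (1 sign
bit, 8 exponent bits with a bias of 127, 23 stored mantissa bits) is the
same as IEEE 754 single precision. Assuming that is really what the SPS
stores, and that the bytes are big-endian, struct can decode the value
directly. The Python 3 sketch below also repeats the decoding by hand;
it only covers normalized numbers (no zero, denormals, infinities or
NaN), and all names in it are made up for illustration.

```python
import struct

raw = bytes.fromhex("42 C8 00 00")   # the last four bytes of the sample file

# Assumption: big-endian IEEE 754 single precision.
(value,) = struct.unpack(">f", raw)

# The same decoding by hand, to show where each bit field sits.
bits = int.from_bytes(raw, "big")
sign = -1 if bits >> 31 else 1                 # bit 31
exponent = ((bits >> 23) & 0xFF) - 127         # bits 30..23, bias 127
mantissa = (bits & 0x7FFFFF) | 0x800000        # 23 stored bits plus implicit leading 1
by_hand = sign * mantissa * 2.0 ** (exponent - 23)

print(value, by_hand)   # both are 100.0 for these bytes
```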

> The different data types need different calculations, that's why I asked 
> a few things about changing the representation because I only can do 
> some things in binary mode or hex mode.
> 
> Cheers
>     Sorin Marti
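
The per-type calculations could also be collected in one lookup table,
so that a getValue-style function needs no special cases. The Python 3
sketch below is hypothetical: the function name get_value, the type
names and the struct format codes are assumptions, not your actual
class, and REAL is assumed to be a 4-byte IEEE 754 value in bytes
13 to 16.

```python
import struct

# Hypothetical lookup table: the type names follow the question, the struct
# format codes are assumptions about how the SPS stores each type.
FORMATS = {
    "BOOL":  ">?",   # 1 byte, 00 is False, anything else is True
    "BYTE":  ">B",
    "CHAR":  ">c",
    "WORD":  ">H",
    "INT":   ">h",   # assumed to be signed
    "DWORD": ">I",
    "REAL":  ">f",   # assumed IEEE 754 single precision
}

def get_value(data, type_name, start, end):
    """Decode bytes start..end (1-based, inclusive) of data as type_name."""
    fmt = FORMATS[type_name.upper()]
    chunk = data[start - 1:end]
    if len(chunk) != struct.calcsize(fmt):
        raise ValueError("%s needs %d bytes, got %d"
                         % (type_name, struct.calcsize(fmt), len(chunk)))
    return struct.unpack(fmt, chunk)[0]

data = bytes.fromhex("01 11 50 00 00 22 04 D2 00 00 BB BB 42 C8 00 00")
print(get_value(data, "REAL", 13, 16))   # 100.0 under the assumptions above
```

With this sketch a call such as get_value(data, 'REAL', 12, 16) raises
ValueError, because five bytes do not fit a 4-byte Real.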

Regards Peter