msbin to ieee

Tue May 8 18:30:01 EDT 2007

On May 8, 8:15 pm, revuesbio <revues... at gmail.com> wrote:
> On 7 mai, 23:38, John Machin <sjmac... at lexicon.net> wrote:
>
>
>
> > On May 7, 11:37 pm, revuesbio <revues... at gmail.com> wrote:
>
> > > On 7 mai, 14:56, John Machin <sjmac... at lexicon.net> wrote:
>
> > > > On May 7, 10:00 pm, revuesbio <revues... at gmail.com> wrote:
>
> > > > > On 7 mai, 13:21, John Machin <sjmac... at lexicon.net> wrote:
>
> > > > > > On May 7, 6:18 pm, revuesbio <revues... at gmail.com> wrote:
>
> > > > > > > On 7 mai, 03:52, John Machin <sjmac... at lexicon.net> wrote:
>
> > > > > > > > On May 7, 7:44 am, revuesbio <revues... at gmail.com> wrote:
>
> > > > > > > > > Hi
> > > > > > > > > Does anyone have the python version of the conversion from msbin to
> > > > > > > > > ieee?
> > > > > > > > > Thank u
>
> > > > > > > > Yes, Google has it. Google is your friend. Ask Google. It will lead
> > > > > > > > you to such as:
>
> > > > > > > > > http://mail.python.org/pipermail/python-list/2005-August/337817.html
>
> > > > > > > > HTH,
> > > > > > > > John
>
> > > > > > > Thank you,
>
> > > > > > > > I've already read it but the problem is still there. This script is
> > > > > > > > for double precision MBF format (8 bytes).
>
> > > > > > It would have been somewhat more helpful had you said what you had
> > > > > > done so far,  even posted your code ...
>
> > > > > > > > I tried to adapt this script for single precision MBF format (4 bytes)
> > > > > > > > but I don't get the right float value.
>
> > > > > > > for example : 'P\xad\x02\x95' will return '0.00024924660101532936'
>
> > > > > > If you know what the *correct* value is, you might like to consider
> > > > > > shifting left by log2(correct_value/erroneous_value) :-)
>
> > > > > > Do you have any known correct pairs of (mbf4 string, decimal_float
> > > > > > value)? My attempt is below -- this is based on a couple of
> > > > > > descriptive sources that my friend Google found, with no test data. I
> > > > > > believe the correct answer for the above input is 1070506.0 i.e. you
> > > > > > are out by a factor of 2 ** 32
>
> > > > > > def mbf4_as_float(s):
> > > > > >     m0, m1, m2, m3 = [ord(c) for c in s]
> > > > > >     exponent = m3
> > > > > >     if not exponent:
> > > > > >         return 0.0
> > > > > >     sign = m2 & 0x80
> > > > > >     m2 |= 0x80
> > > > > >     mant = (((m2 << 8) | m1) << 8) | m0
> > > > > >     adj = 24 + 128
> > > > > >     num = mant * 2.0 ** (exponent - adj)
> > > > > >     if sign:
> > > > > >         return -num
> > > > > >     return num
>
> > > > > > HTH,
> > > > > > John
>
> > > > > well done ! it's exactly what i'm waiting for !!
>
> > > > > > my code was:
>
> > > > > > >>> from struct import *
> > > > > > >>> x = list(unpack('BBBB','P\xad\x02\x95'))
> > > > > > >>> x
> > > > > > [80, 173, 2, 149]
> > > > > > >>> def conversion1(bytes):
> > > > > >     b = bytes[:]
> > > > > >     sign = bytes[-2] & 0x80
> > > > > >     b[-2] |= 0x80
> > > > > >     exp = bytes[-1] - 0x80 - 56
> > > > > >     acc = 0L
> > > > > >     for i, byte in enumerate(b[:-1]):
> > > > > >         acc |= (long(byte) << (i*8))
> > > > > >     return (float(acc)*2.0**exp)*((1.,-1.)[sign!=0])
>
> > > > Apart from the 2**32 problem, the above doesn't handle *any* of the
> > > > > 2**24 different representations of zero. Try feeding '\0\0\0\0' to it
> > > > and see what you get.
>
> > > > > >>> conversion1(x)
>
> > > > > 0.00024924660101532936
>
> > > > > > this script comes from google groups but i don't understand bit-string
> > > > > > manipulation (I'm a newbie). information about bit-string
> > > > > > manipulation with python is too scarce on the net.
>
> > > > The basic operations (and, or, exclusive-or, shift) are not specific
> > > > > to any language. Several languages share the same notation
> > > > > (& | ^ << >>), having inherited it from C.
>
> > > > > thank you very much for your script.
>
> > > > Don't thank me, publish some known correct pairs of values so that we
> > > > can verify that it's not just accidentally correct for 1 pair of
> > > > values.
>
> > > > pairs of values :
> > > > (bytes string, mbf4_as_float(s) result)        right float value
> > > > ('P\xad\x02\x95', 1070506.0)                   1070506.0
> > > > ('\x00\x00\x00\x02', 5.8774717541114375e-039)  0.0
>
> > > There is no way that '\x00\x00\x00\x02' could represent exactly zero.
> > What makes you think it does? Rounding?
>
> > > > ('\x00\x00\x00\x81', 1.0)                      1.0
> > > > ('\x00\x00\x00\x82', 2.0)                      2.0
> > > > ('\x00\x00@\x82', 3.0)                         3.0
> > > > ('\x00\x00\x00\x83', 4.0)                      4.0
> > > > ('\x00\x00 \x83', 5.0)                         5.0
> > > > ('\xcd\xcc\x0c\x81', 1.1000000238418579)       1.1
> > > > ('\xcd\xcc\x0c\x82', 2.2000000476837158)       2.2
> > > > ('33S\x82', 3.2999999523162842)                3.3
> > > > ('\xcd\xcc\x0c\x83', 4.4000000953674316)       4.4
>
> > It is not apparent whether you regard the output from the function as
> > correct or not.
>
> > 4.4 "converted" to mbf4 format is '\xcd\xcc\x0c\x83' which is
> > 4.4000000953674316 which is the closest possible mbf4 representation
> > of 4.4 (difference is 9.5e-008).
>
> > The next lower mbf4 value '\xcc\xcc\x0c\x83' is 4.3999996185302734
> > (difference is   -3.8e-007).
>
> > Note that floating-point representation of many decimal fractions is
> > inherently inexact. print repr(4.4) produces 4.4000000000000004
>
> > Have you read this:
> >    http://docs.python.org/tut/node16.html
> > ?
>
> > If you need decimal-fraction output that matches what somebody typed
> > into the original software, or saw on the screen, you will need to
> > know/guess the precision that was involved, and round the numbers
> > accordingly -- just like the author of the original software would
> > have needed to do.
>
> > >>> ['%.*f' % (decplaces, 4.4000000953674316) for decplaces in range(10)]
>
> > ['4', '4.4', '4.40', '4.400', '4.4000', '4.40000', '4.400000',
> > '4.4000001', '4.40000010', '4.400000095']
>
> > HTH,
> > John
>
> here are some more pairs, with the rounded number corresponding to the right value
>
> ('\x00\x00\x00\x02', 5.8774717541114375e-039, '0.000')
[snip]

> all is ok.
> thank u

I have not yet found a comprehensive let alone authoritative
description of the Microsoft binary floating format. However I've seen
enough to form a view that in general converting '\x00\x00\x00\x02' to
0.0 would be a mistake, and that 5.8774717541114375e-039 is the
correct answer.
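
As a rough check of that figure, here is a minimal sketch of the
mbf4_as_float function quoted above, reworked on the assumption of
Python 3 (where the input is a bytes object and iterating over it
yields integers). With an exponent byte of 2 and an all-zero mantissa
it gives 2 ** -127, which is the 5.877...e-039 value in question:

```python
def mbf4_as_float(s):
    """Decode a 4-byte MBF single (bytes, least significant byte first)."""
    m0, m1, m2, exponent = s  # Python 3 assumption: bytes unpack to ints
    if not exponent:
        return 0.0  # exponent byte 0 is treated as zero, whatever the mantissa
    sign = m2 & 0x80  # top bit of the third byte is the sign
    # Put the implied leading 1 bit where the sign bit was
    mant = ((m2 | 0x80) << 16) | (m1 << 8) | m0
    num = mant * 2.0 ** (exponent - (24 + 128))
    return -num if sign else num

print(mbf4_as_float(b'\x00\x00\x00\x02') == 2.0 ** -127)  # True
print(mbf4_as_float(b'P\xad\x02\x95'))                    # 1070506.0
```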

Why do I think so? There's a Borland/Inprise document on the
wotsit.org website that gives C functions for conversion both ways
between MBF and IEEE formats (both 32 bits and 64 bits). They are
supposed to mimic functions that were in the MS C runtime library at
one stage. The _fieeetomsbin (32 bits) function does NOT make a
special case of IEEE 0.0; it passes it through the normal "what is the
exponent, what is the mantissa" routine, and produces
'\x00\x00\x00\x02' (MS exponent field == 2). The converse routine
regards any MBF number with exponent 0 as being 0.0, and puts anything
else through the normal cycle -- which is nonsense with MBF exponent
== 1, by the way: because of the offset of 2, the result in IEEE
32-bit is an exponent of -1, which becomes 255, which tags the result
as infinity or NaN (not a number). The lack of round-trip sameness for
0.0 is so astonishing that if this were true one would have expected
it to be remarked on somewhere.
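
The exponent arithmetic described above can be sketched as follows.
This is only an illustration of the behaviour as described, not the
Borland/Inprise code: the function names are hypothetical, and it
assumes Python 3 and the struct module.

```python
import struct

def ieee_to_mbf4_naive(x):
    """IEEE single -> MBF single, with no special case for 0.0 (hypothetical)."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    sign = (bits >> 31) & 1
    ieee_exp = (bits >> 23) & 0xFF
    mant = bits & 0x7FFFFF
    mbf_exp = (ieee_exp + 2) & 0xFF  # the offset of 2 between the two formats
    return bytes([mant & 0xFF, (mant >> 8) & 0xFF,
                  (sign << 7) | (mant >> 16), mbf_exp])

def mbf_exp_to_ieee_exp_field(mbf_exp):
    """8-bit IEEE exponent field that results from a non-zero MBF exponent."""
    return (mbf_exp - 2) & 0xFF

print(ieee_to_mbf4_naive(0.0) == b'\x00\x00\x00\x02')  # True
print(ieee_to_mbf4_naive(4.4) == b'\xcd\xcc\x0c\x83')  # True
print(mbf_exp_to_ieee_exp_field(1))                    # 255
```

An IEEE exponent field of 255 is the one reserved for infinity and
NaN, which is why an MBF exponent of 1 cannot go through the normal
cycle.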

So: It is probably sufficient for your application to round everything
to 3 decimal places, but I thought I'd better leave this note to warn
anyone else who might want to use the function.
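
A minimal sketch of that rounding, assuming 3 decimal places is the
precision wanted and using some of the decoded values from earlier in
the thread (Python 3 print syntax assumed):

```python
values = [5.8774717541114375e-39, 1.1000000238418579,
          3.2999999523162842, 4.4000000953674316]
rounded = ['%.3f' % v for v in values]
print(rounded)  # ['0.000', '1.100', '3.300', '4.400']
```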

I am curious as to what software created your MBF-32-bit numbers ...
care to divulge?

Cheers,
John