Bug in floating point multiplication

Oscar Benjamin oscar.j.benjamin at gmail.com
Fri Jul 3 11:13:58 EDT 2015


On 2 July 2015 at 18:29, Jason Swails <jason.swails at gmail.com> wrote:
>
> As others have suggested, this is almost certainly a 32-bit vs. 64-bit
> issue.  Consider the following C program:
>
> // maths.c
> #include <math.h>
> #include <stdio.h>
>
> int main() {
>     double x;
>     int i;
>     x = 1-pow(0.5, 53);
>
>     for (i = 1; i < 1000000; i++) {
>         if ((int)(i*x) == i) {
>             printf("%d\n", i);
>             break;
>         }
>     }
>
>     return 0;
> }
>
> For the most part, this should be as close to an exact transliteration of
> your Python code as possible.
>
> Here's what I get when I try compiling and running it on my 64-bit (Gentoo)
> Linux machine with 32-bit compatible libs:
>
> swails@batman ~/test $ gcc maths.c
> swails@batman ~/test $ ./a.out
> swails@batman ~/test $ gcc -m32 maths.c
> swails@batman ~/test $ ./a.out
> 2049

I was unable to reproduce this on my system: in both cases the loop
runs to completion without printing anything. The assembly generated
by gcc does differ between the two builds, though.

The loop in the 64-bit build (in the main function) looks like:

$ objdump -d a.out | less
...
400555:  pxor   %xmm0,%xmm0
400559:  cvtsi2sdl -0xc(%rbp),%xmm0
40055e:  mulsd  -0x8(%rbp),%xmm0
400563:  cvttsd2si %xmm0,%eax
400567:  cmp    -0xc(%rbp),%eax
40056a:  jne    400582 <main+0x4c>
40056c:  mov    -0xc(%rbp),%eax
40056f:  mov    %eax,%esi
400571:  mov    $0x400624,%edi
400576:  mov    $0x0,%eax
40057b:  callq  400410 <printf@plt>
400580:  jmp    40058f <main+0x59>
400582:  addl   $0x1,-0xc(%rbp)
400586:  cmpl   $0xf423f,-0xc(%rbp)
40058d:  jle    400555 <main+0x1f>
...

Whereas the 32-bit one looks like:

$ objdump -d a.out.32 | less
...
 804843e:  fildl  -0x14(%ebp)
 8048441:  fmull  -0x10(%ebp)
 8048444:  fnstcw -0x1a(%ebp)
 8048447:  movzwl -0x1a(%ebp),%eax
 804844b:  mov    $0xc,%ah
 804844d:  mov    %ax,-0x1c(%ebp)
 8048451:  fldcw  -0x1c(%ebp)
 8048454:  fistpl -0x20(%ebp)
 8048457:  fldcw  -0x1a(%ebp)
 804845a:  mov    -0x20(%ebp),%eax
 804845d:  cmp    -0x14(%ebp),%eax
 8048460:  jne    8048477 <main+0x5c>
 8048462:  sub    $0x8,%esp
 8048465:  pushl  -0x14(%ebp)
 8048468:  push   $0x8048520
 804846d:  call   80482f0 <printf@plt>
 8048472:  add    $0x10,%esp
 8048475:  jmp    8048484 <main+0x69>
 8048477:  addl   $0x1,-0x14(%ebp)
 804847b:  cmpl   $0xf423f,-0x14(%ebp)
 8048482:  jle    804843e <main+0x23>
...

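So the 64-bit build uses SSE instructions (mulsd, cvttsd2si), which
work in 64-bit double precision, while the 32-bit build uses x87
(fmull, fistpl), whose registers hold 80-bit extended-precision
values. That could explain the difference you see at the C level,
but I don't see it on this CPU (/proc/cpuinfo says Intel(R) Core(TM)
i5-3427U CPU @ 1.80GHz).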

--
Oscar


