[Numpy-discussion] Re: A weekend floating point/compiler question

Sat Apr 29 09:09:11 EDT 2006

Hi,

I don't think this is a GCC bug, but it does seem to be related to
Intel's 80-bit floating-point architecture.

As of the Pentium 3, Intel and compatible processors have two sets of
instructions for performing floating-point operations: the original
8087 set, which do all computations at 80-bit precision, and SSE (and
their extension SSE2), which don't use extended precision.

GCC allows you to select either instruction set.  Unfortunately, in
the absence of an explicit choice, it uses a default target that
varies by platform: The i386 version defaults to 8087 instructions,
while the x86-64 version defaults to SSE.  See

http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/i386-and-x86_002d64- 
Options.html

for details.

I can make your test programs behave correctly on a Pentium 4 by
selecting SSE2:

devel12-35: g77 testbug.f
devel12-36: ./a.out
  Should be 99: 98
devel12-37: g77 -msse2 -mfpmath=sse testbug.f
devel12-38: ./a.out
  Should be 99: 99
devel12-39: gcc scanbug.c
devel12-40: ./a.out | head -1
ERROR at x=3.000000e-02!
devel12-41: gcc -msse2 -mfpmath=sse scanbug.c
devel12-42: ./a.out
devel12-43:

Interestingly, I expected to be able to induce incorrect results on an
Opteron by using 8087, but that wasn't the case (both instruction sets
produced the correct result).  I'll have to think about why that's
happening -- maybe casting between ints and doubles differs between 32
and 64-bit architectures?

I've never used the Intel or Lahey Fortran compilers, but I suspect
they must be generating SSE instructions by default.  Actually, it's
interesting that the 80-bit computations are causing problems here,
since it's easy to come up with examples where they give you better
results than computations done without the extra bits.

Hope that helps,
Chris