Signed zeros: is this a bug?

Alex Martelli aleax at mac.com
Sun Mar 11 14:26:01 EDT 2007


Mark Dickinson <dickinsm at gmail.com> wrote:

> On Mar 11, 1:26 pm, a... at mac.com (Alex Martelli) wrote:
> > [Long analysis of probable cause of the problem]
> 
> Thank you for this.  I was suspecting something along these lines,
> but I don't yet know my way around the source well enough to figure
> out where the problem was coming from.

The parser/compiler/etc are unfortunately some of the hardest parts of
the sources -- I'm not all that familiar with that part myself, which is
why it took me quite some digging.


> > In the meantime, I hope that some available workarounds for the bug are
> > clear from this discussion: avoid using multiple constants in a single
> > compilation unit where one is 0.0 and another is -0.0, or, if you really
> > can't avoid that, perhaps use compiler.compile to explicitly build the
> > bytecode you need.
> 
> Yup: the workaround seems to be as simple as replacing all occurrences
> of -0.0 with -(0.0).  I'm embarrassed that I didn't figure this out
> sooner.
> 
> >>> x, y = -(0.0), 0.0
> >>> x, y
> (-0.0, 0.0)

Glad it works for you, but it's the kind of workaround that could break
with any minor tweak/optimization to the compiler... very fragile:-(.

I think I found the cause of the bug, BTW.  The collection of constants
in a code object is built in Python/compile.c and it's built as a
dictionary, field u_consts in struct compiler_unit.  The "visitor" for
an expression that is a number is (in a case statement)

    case Num_kind:
        ADDOP_O(c, LOAD_CONST, e->v.Num.n, consts);
        break;

(line 2947 of compile.c in Python's current sources from svn).  ADDOP_O
just calls compiler_addop_o, which in turn calls compiler_add_o before
adding the opcode (LOAD_CONST).

compiler_add_o (at lines 903-933) is used for all of the temporary
dictionaries in compiler_unit; a Python equivalent, basically, would be:

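def eqv_cao(somedict, someobj):
    # key on (value, type) so that equal values of different types,
    # e.g. int and long, get separate entries
    t = someobj, type(someobj)
    if t in somedict:
        return somedict[t]               # seen before: reuse its index
    somedict[t] = index = len(somedict)  # new key: next free index
    return index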

a simple and fast way to provide a distinct numeric index (0 and up) to
each of a bunch of (hashable) objects.

Alas, here is the problem...: 0.0 and -0.0 are NOT separate as dict
keys!  They are == to each other.  So are 0, 0L, and 0+0j, but the
compiler smartly distinguishes these cases by using (obj, type) as the
key (the *types* are distinguished, even though the *values* aren't);
this doesn't help with 0.0 and -0.0 since both have type float.
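
A quick way to see the collision for yourself (just a sketch: it builds
-0.0 with math.copysign, which is my own choice so that the snippet does
not itself depend on how a -0.0 literal gets compiled, and it leaves out
0L so that it also runs on Pythons without a separate long type):

```python
import math

neg_zero = math.copysign(0.0, -1.0)  # -0.0, built without a literal
pos_zero = 0.0

# Equal values and equal hashes: a dict cannot tell the two apart.
same_value = (pos_zero == neg_zero)
same_hash = (hash(pos_zero) == hash(neg_zero))

# Keys of the form (obj, type), as in the compiler's constants dict.
table = {}
for obj in (0, pos_zero, neg_zero):
    table.setdefault((obj, type(obj)), len(table))

print(same_value, same_hash)  # True True
print(len(table))             # 2: int 0 and float 0.0 stay separate,
                              # but -0.0 lands on the 0.0 entry
```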

So, the first occurrence of either 0.0 or -0.0 in the compilation unit
ends up in the table of constants, and every other occurrence of either
value later in the unit is mapped to that one constant value:-(
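
To make "first occurrence wins" concrete, here is a small simulation
using the Python equivalent above.  build_consts is a hypothetical
helper of mine, not anything in compile.c; it only mimics how the keys
of the dictionary end up as the table of constants:

```python
import math

def eqv_cao(somedict, someobj):
    t = someobj, type(someobj)
    if t in somedict:
        return somedict[t]
    somedict[t] = index = len(somedict)
    return index

def build_consts(values):
    """Hypothetical helper: index each value, return (indices, table)."""
    d = {}
    indices = [eqv_cao(d, v) for v in values]
    # order the keys by their index and drop the type part of each key
    table = [obj for obj, _ in sorted(d, key=d.get)]
    return indices, table

neg_zero = math.copysign(0.0, -1.0)  # -0.0, built without a literal

# like "x, y = -0.0, 0.0": both loads share the one stored constant
indices_a, table_a = build_consts([neg_zero, 0.0])
# like "x, y = 0.0, -0.0": same sharing, but now 0.0 was seen first
indices_b, table_b = build_consts([0.0, neg_zero])

print(indices_a, math.copysign(1.0, table_a[0]))  # [0, 0] -1.0
print(indices_b, math.copysign(1.0, table_b[0]))  # [0, 0] 1.0
```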

This is not trivial to fix cleanly...:-(.  compiler_add_o would have to
test "is the object I'm storing -0.0" (I don't even know how to do that
in portable C...) and then do some kludge -- e.g. use as the key into
the dict (-0.0, 0) instead of (-0.0, float) for this one special case.
(I think the table of constants would still be emitted OK, since the
type part of the key is elided anyway in that table as placed in the
bytecode).  Or maybe we should give up ever storing -0.0 in the table
of constants and ALWAYS have "0.0, unary-minus" wherever it appears (that
would presumably require working on the AST-to-bytecode visitors that
currently work ever-so-slightly-differently for this specific
troublespot in the C-coded version vs the Python-coded one...).
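
In terms of the Python equivalent, the first idea might look roughly
like this.  It is only a sketch of the kludge, not a patch:
sign_aware_key and add_const are names I made up, and math.copysign
stands in for whatever sign test the C code would end up using:

```python
import math

def sign_aware_key(obj):
    """Hypothetical key: like (obj, type), but -0.0 gets its own key."""
    if (isinstance(obj, float) and obj == 0.0
            and math.copysign(1.0, obj) < 0):
        return (obj, 0)          # the "(-0.0, 0)" special case
    return (obj, type(obj))

def add_const(somedict, someobj):
    t = sign_aware_key(someobj)
    if t in somedict:
        return somedict[t]
    somedict[t] = index = len(somedict)
    return index

neg_zero = math.copysign(0.0, -1.0)  # -0.0, built without a literal
consts = {}
idx_pos = add_const(consts, 0.0)
idx_neg = add_const(consts, neg_zero)
idx_neg_again = add_const(consts, neg_zero)
print(idx_pos, idx_neg, idx_neg_again)  # 0 1 1
```

The second element of the special key only has to differ from every
type object, so any non-type marker would do as well as 0.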

If you know the proper way to test for -0.0 in portable C code (or some
feature macro to use in a #if to protect nonportable code) I could try
proposing the first of these two solutions as a patch (I'm not going to
keep delving into that AST and visitors much longer...:-), but I suspect
it would be rejected as "too tricky [and minusculely slowing down every
compilation] for something that's too much of a special case [and Python
does not undertake to support in general anyway]".  Still, we can't be
sure unless we try (and maybe somebody can think of a cleaner
workaround...).
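
For what it's worth, at the Python level one way to test for -0.0 is to
look at the bit pattern.  This sketch assumes IEEE 754 doubles, which is
what the struct module's standard ">d" format packs; is_negative_zero is
a made-up name, and none of this answers the portable-C question:

```python
import struct

# IEEE 754 binary64 encoding of -0.0: sign bit set, everything else zero
NEG_ZERO_BITS = b"\x80" + b"\x00" * 7

def is_negative_zero(x):
    """True only for the float -0.0, judged by its packed bit pattern."""
    return isinstance(x, float) and struct.pack(">d", x) == NEG_ZERO_BITS

neg_zero = struct.unpack(">d", NEG_ZERO_BITS)[0]  # -0.0 from raw bits
print(is_negative_zero(neg_zero), is_negative_zero(0.0))  # True False
```

In C the analogous test would probably be the C99 signbit macro from
<math.h>, where the compiler provides it.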


Alex


