[Python-Dev] test_gzip/test_tarfile failure om AMD64

Mon May 29 22:31:08 CEST 2006

[Guido]
>>> I think we should do as Thomas proposes: plan to make it an error in
>>> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
>>> it with a warning in 2.5.

[Tim]
>> That's what I arrived at, although 2.4.3's checking behavior is
>> actually so inconsistent that "it" needs some defining (what exactly
>> are we trying to still accept?  e.g., that -1 doesn't trigger "I"
>> complaints but that -1L does above?  that one's surely a bug).

[Guido]
> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
> 0xffffffffL was 4294967295L.

But is it "a bug" _now_?  That's what I mean by "a bug".  To me, this
is simply inexplicable in 2.4 (and 2.5+):

>>> struct.pack("I", -1)  # neither std nor native modes complain
'\xff\xff\xff\xff'
>>> struct.pack(">I", -1)
'\xff\xff\xff\xff'

>>> struct.pack("I", -1L)  # but both complain if the input is long
Traceback (most recent call last):
 File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long
>>> struct.pack(">I", -1L)
Traceback (most recent call last):
 File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long

Particulary for the standard modes, the behavior also varies across
platforms (i.e., it wasn't true in any version of Python that
0xffffffff == -1 on most 64-bit boxes, and to have "standard mode"
behavior vary according to platform isn't particularly standard :-)).

>> To be clear, Thomas proposed "accepting it" (whatever that means) until 3.0.

> Fine with me.

So who has a definition for what "it" means?  I don't.  Does every
glitch have to be reproduced across the cross product of platform X
format-code X input-type X native-vs-standard X
direction-of-out-of-range?

Or would people be happier if struct simply never checked for
out-of-bounds?  At least the latter is doable with finite effort, and
is also explainable ("for an integral code of N bytes, pack() stores
the least-significant N bytes of the input viewed as a 2's-complement
integer").  I'd be happier with that than with something that can't be
explained short of exhaustive listing of cases across 5 dimensions.

Ditto with saying that for an integral type using N bytes, pack()
accepts any integer in -(2**(8*N-1)) through 2**(8*N)-1, complains
outside that range, and makes no complaining distinctions based on
platform, input type, standard-vs-native mode, signed-or-unsigned
format code, or direction of out-of-range.  That's also explainable.