[Python-Dev] test_gzip/test_tarfile failure on AMD64

Bob Ippolito bob at redivi.com
Mon May 29 17:45:55 CEST 2006


On May 29, 2006, at 3:14 AM, Thomas Wouters wrote:

>
>
> On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
> Well, the behavior change is in response to a bug
> <http://python.org/sf/1229380>. If nothing else, we should at least fix the
> standard library such that it doesn't depend on struct bugs. This
> is the only way to find them :)
>
> Feel free to comment how the zlib.crc32/gzip co-operation should be  
> fixed. I don't see an obviously correct fix. The trunk is currently  
> failing tests it shouldn't fail. Also note that the error isn't  
> with feeding signed values to unsigned formats (which is what the  
> bug is about) but the other way 'round, although I do believe both  
> should be accepted for the time being, while generating a warning.

Well, first I'm going to just correct the modules that are broken  
(zlib, gzip, tarfile, binhex and probably one or two others).
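
For the zlib.crc32/gzip case, one possible shape of such a correction is
to mask the checksum to 32 bits before packing it with an unsigned
format. This is only a minimal sketch, not the actual patch; the helper
name pack_crc32 is made up for illustration.

```python
import struct
import zlib


def pack_crc32(data):
    """Return the CRC-32 of data as 4 little-endian bytes (hypothetical helper)."""
    # Masking keeps the value in 0 .. 2**32 - 1, so it is always valid for
    # the unsigned "<L" format, even where zlib.crc32 returns a negative int.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return struct.pack("<L", crc)
```

The mask is a no-op when the checksum is already non-negative, so the
same code works whether or not crc32 returns signed values.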

> Basically the struct module previously only checked for errors if
> you didn't specify a byte order. That's really strange and leads to
> very confusing results. The only code that really should be broken  
> by this additional check is code that existed before Python had a  
> long type and only signed values were available.
>
> Alas, reality is different. The fundamental difference between  
> types in Python and in C causes this, and code using struct is  
> usually meant specifically to bridge those two worlds. Furthermore,  
> struct is often used to *fix* that issue, by flipping sign bits if
> necessary:

Well, in C you get a compiler warning for stuff like this.

> >>> struct.unpack("<l", struct.pack("<l", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<l", -1073741824))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", -1073741824))
> (-1073741824,)
>
> Before this change, you didn't have to check whether the value is  
> negative before the struct.unpack/pack dance, regardless of which  
> format character you used. This misfeature is in use (and many would
> consider it convenient, even Pythonic, for struct to DWIM), so
> breaking it suddenly is bad.

struct doesn't really DWIM anyway, since integers are up-converted to  
longs and will overflow past what the (old or new) struct module will  
accept. Before there was a long type or automatic up-converting, the  
sign agnosticism worked, but it doesn't really work correctly these
days.
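
Code that wants the sign flip can do it explicitly instead of leaning on
struct being lenient. A minimal sketch, assuming 32-bit values; the
helper names to_signed32 and to_unsigned32 are made up for illustration.

```python
def to_unsigned32(value):
    """Reinterpret an int as an unsigned 32-bit value (hypothetical helper)."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an int as a signed 32-bit two's complement value."""
    value &= 0xFFFFFFFF
    # If the sign bit is set, shift the value down into the negative range.
    return value - 0x100000000 if value & 0x80000000 else value
```

With these, to_signed32(3221225472) gives -1073741824 and
to_unsigned32(-1073741824) gives 3221225472, matching the values in the
quoted session, and the result is always in range for the matching
format character.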

We have two choices: either fix it to be consistently broken
everywhere for numbers of every size (modulo every number that comes
in so that it fits), or have it do proper range checking. A  
compromise is to do proper range checking as a warning, and do the  
modulo math anyway... but is that what we really want?
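
To make the options concrete, here is a rough sketch of the three
behaviors for an unsigned 32-bit field, written as wrappers around
struct.pack. The function names are invented for illustration, and none
of this is the struct module's actual implementation.

```python
import struct
import warnings

U32_MAX = 0xFFFFFFFF


def pack_u32_wrap(value):
    """Option 1: silently reduce every value modulo 2**32."""
    return struct.pack("<L", value & U32_MAX)


def pack_u32_strict(value):
    """Option 2: proper range checking; reject out-of-range values."""
    if not 0 <= value <= U32_MAX:
        raise OverflowError("value out of range for unsigned 32-bit field")
    return struct.pack("<L", value)


def pack_u32_warn(value):
    """Compromise: warn about out-of-range values, then wrap anyway."""
    if not 0 <= value <= U32_MAX:
        warnings.warn("value out of range for unsigned 32-bit field; wrapping",
                      DeprecationWarning, stacklevel=2)
    return struct.pack("<L", value & U32_MAX)
```

For in-range values all three produce identical bytes; they only differ
in what happens to something like -1073741824 or 2**32.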

-bob
