classification
Title: zipfile when sizeof(long) == 8
Type: Stage:
Components: Library (Lib) Versions: Python 2.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: tim.peters Nosy List: tim.peters, tww-china
Priority: normal Keywords: patch

Created on 2002-07-02 11:11 by tww-china, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
aa tww-china, 2002-07-02 11:13 Lib/zipfile.py patch when sizeof(long) == 8
Messages (13)
msg40472 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 11:11
This bug also applies to Python 2.0.x and 2.1.x (most
likely every version).

When sizeof (long) == 8, like on Tru64 UNIX,
zipfile.testzip () fails due to a CRC error. The
problem is that in Lib/zipfile.py:
  crc = binascii.crc32(bytes)
converts the 32-bit binascii.crc32() return value to a
64-bit value (crc). We need to force crc to remain a
32-bit value. Attached is a patch though maybe someone
else can think of something better.
msg40473 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-07-02 14:44
Logged In: YES 
user_id=31435

I believe you're having a problem, but I can't tell what it is.  
Exactly how does zipfile.testzip() fail?  What did it get and 
what did it expect?

It's not possible to "force crc to remain a 32-bit value" on a 64-
bit box with sizeof(long)==8 -- Python doesn't have any 32-bit 
type on such a box.  So it seems most likely that some 32-
bit value either is or isn't getting sign-extended when this 
fails, but I can't tell from the report which of the disagreeing 
values that may be, or which it *should* be.

IOW, we need more info about how this fails.  If you're 
hacking the result of binascii.crc32() and calling that "a fix", 
chances seem high that the correct fix lies in changing what 
crc32() returns.  But not yet enough info here to say.
msg40474 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 15:06
Logged In: YES 
user_id=119770

Do you have access to a machine where sizeof (long) == 8?
Here's what I'm getting:

$ uname -a
OSF1 duh V4.0 878 alpha
$ python
>>> import zipfile
>>> zip = zipfile.ZipFile ('/tmp/a.zip', 'w')
>>> zip.write ('/vmunix', 'vmunix')
>>> zip.close ()
>>> zip = zipfile.ZipFile ('/tmp/a.zip', 'r')
>>> zip.testzip()
2226205591 -2068761705

I added some debugging statements to ZipFile.read(). The
first number is the output of binascii.crc32() while the
second is the output of zinfo.CRC (the CRC value in the
zipfile header for 'vmunix' in /tmp/a.zip).

Would binascii.crc32() *ever* return a negative number or
does it return an unsigned type? Looking at the source to
Modules/binascii.c, crc is an unsigned long but the value
returned is signed long.
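
For readers following along: the two numbers printed above are the same 32-bit pattern, read once as unsigned and once as signed. A minimal sketch in modern Python, where the helper name `same_crc` is made up for illustration and is not part of zipfile:

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits

computed = 2226205591   # value reported for binascii.crc32()
stored = -2068761705    # value reported for zinfo.CRC

def same_crc(a, b):
    """Compare two CRC-32 values by their low 32 bits, ignoring sign."""
    return (a & MASK32) == (b & MASK32)

print(computed == stored)          # False: the plain integers differ
print(same_crc(computed, stored))  # True: the 32-bit patterns agree
print(computed - 2**32 == stored)  # True: they differ by exactly 2**32
```

This only illustrates the mismatch; it is not the patch attached to this report. In Python 3, binascii.crc32() is documented to always return an unsigned value.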
msg40475 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 15:42
Bug #453208 indicates a similar problem.
msg40476 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 15:47
From zipfile.py:
  ...
  structCentralDir = "<4s4B4H3l5H2l"
  ...
  def _RealGetContents(self):
    ...
            centdir = fp.read(46)
            total = total + 46
            if centdir[0:4] != stringCentralDir:
                raise BadZipfile, "Bad magic number for central directory"
            centdir = struct.unpack(structCentralDir, centdir)

When a zipfile is created, the CRC is written with:
  def write(self, filename, arcname=None, compress_type=None):
    ...
        self.fp.write(struct.pack("<lll", zinfo.CRC, zinfo.compress_size,
              zinfo.file_size))

Changing the "3l" to "3L" or "3I" in structCentralDir is
another workaround but as we wrote with "l", we should also
read with "l" (maybe this is the real problem).
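
To make the "l" versus "L" point concrete, here is a small sketch using the struct module in modern Python. With the "<" prefix, both codes use the standard 4-byte size, so they differ only in how the same bytes are interpreted. The variable names are illustrative and the CRC value is the one from the earlier session:

```python
import struct

# Central directory format string quoted from zipfile.py above.
struct_central_dir = "<4s4B4H3l5H2l"
print(struct.calcsize(struct_central_dir))  # 46, matching fp.read(46)

crc = 2226205591                 # unsigned CRC from the session above
raw = struct.pack("<L", crc)     # 4 bytes, little-endian, unsigned

as_signed = struct.unpack("<l", raw)[0]    # same bytes read as signed
as_unsigned = struct.unpack("<L", raw)[0]  # same bytes read as unsigned

print(as_signed)    # -2068761705
print(as_unsigned)  # 2226205591
```

So a value written and read back through "l" comes out negative whenever the top bit of the CRC is set, regardless of sizeof(long) on the host.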
msg40477 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-07-02 20:20
No, I don't have access to a 64-bit box.

Do you have access to CVS Python?  If so, please try again.  
I patched it to try to make binascii.crc32() return the same 
result across platforms.

Modules/binascii.c; new revision: 2.35
msg40478 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 21:01
Tested the new Modules/binascii.c against 2.2.1 on Tru64
4.0D, 5.1, and HP-UX 11i and it works. Thanks!
msg40479 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 21:41
Ok, well, testing worked fine on the test file I created but
running against Lib/test/test_zipfile.py gives:
Traceback (most recent call last):
  File "test_zipfile.py", line 35, in ?
    zipTest(file, zipfile.ZIP_STORED, writtenData)
  File "test_zipfile.py", line 16, in zipTest
    readData2 = zip.read(srcname)
  File "/opt/TWWfsw/python221/lib/python2.2/zipfile.py", line 351, in read
    raise BadZipfile, "Bad CRC-32 for file %s" % name
zipfile.BadZipfile: Bad CRC-32 for file junk9630.tmp
msg40480 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-07-02 21:54
So what did it get, and what did it expect?  I.e., same stuff all 
over again.
msg40481 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-07-02 22:25
Please try again.  New patch tries to force the entry 
conditions in crc32(), as well as the return value.

Modules/binascii.c; new revision: 2.36
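
The idea of normalizing both the incoming start value and the return value can be sketched in Python. This is not the C change in Modules/binascii.c, whose contents are not shown in this report; `to_signed32` and `crc32_signed` are hypothetical helpers, and the choice of a signed 32-bit result is an assumption made to match what a 32-bit box printed at the time:

```python
import binascii

def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def crc32_signed(data, start=0):
    """Hypothetical wrapper: mask the start value on entry and
    sign-normalize the result on return."""
    return to_signed32(binascii.crc32(data, start & 0xFFFFFFFF))

data = b"hello, zipfile"

# CRC of the whole buffer in one call.
whole = crc32_signed(data)

# Same CRC computed in two steps; the running value may be negative,
# which is why the wrapper masks it before passing it back in.
running = crc32_signed(data[:5])
running = crc32_signed(data[5:], running)

print(running == whole)  # True
```

With both ends normalized, the result no longer depends on whether the intermediate value was carried as a signed or an unsigned number.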
msg40482 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-02 22:41
Ok, hang on. I'm doing a clean build to make sure I wasn't
using anything from an old install.
msg40483 - (view) Author: The Written Word (Albert Chin) (tww-china) Date: 2002-07-03 01:30
Ok, Modules/binascii.c v2.36 works good!
msg40484 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-07-03 01:58
Thanks for your help, Albert!  While I started my ill-spent 
computer career on 64-bit Crays, you're the only 64-bit 
platform I have anymore <wink>.

This report is Closed.
History
Date User Action Args
2022-04-10 16:05:28 admin set github: 36837
2002-07-02 11:11:54 tww-china create