Unable to read large files from zip

Kevin Ar18 kevinar18 at hotmail.com
Tue Aug 28 21:10:59 EDT 2007


I posted this on the forum, but nobody seems to know the solution: http://python-forum.org/py/viewtopic.php?t=5230

I have a zip file that is several GB in size, and one of the files inside it is also several GB.  When I try to read that 5+ GB file from the zip, it fails with the following error:
  File "...\zipfile.py", line 491, in read
    bytes = self.fp.read(zinfo.compress_size)
OverflowError: long int too large to convert to int
Note: all the other smaller files up to that point come out just fine.
Here's the code:
------------------
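import zipfile
import re

dataObj = zipfile.ZipFile("zip.zip", "r")
for i in dataObj.namelist():
    # Show each member's compressed size, rounded down to whole MB
    print i + " -- >=" + str(dataObj.getinfo(i).compress_size / 1024 / 1024) + "MB"
    if i[-1] == "/":
        # Names ending in "/" are directory entries
        print "Directory -- won't extract"
    else:
        # Strip the directory part of the name (assumes the name contains a "/")
        fileName = re.split(r".*/", i, 0)[1]
        # Reads the whole member into memory; this call fails on the 5+GB file
        fileData = dataObj.read(i)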
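
For comparison, here is a minimal sketch of copying each member out in fixed-size chunks instead of pulling it into memory with a single read() call. It assumes a Python version whose zipfile module provides ZipFile.open() and ZIP64 support, which may not be true of the version that produced the traceback above. The helper names extract_member and extract_all_files, the destination directory and the chunk size are illustrative and not part of the original script. Whether this avoids the OverflowError on a given installation has not been tested here.

```python
import os
import shutil
import zipfile

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


def extract_member(archive, name, dest_dir, chunk_size=CHUNK_SIZE):
    """Copy one archive member into dest_dir in chunks; return the output path."""
    # Keep only the part after the last "/" so the file lands directly in dest_dir.
    file_name = name.rsplit("/", 1)[-1]
    out_path = os.path.join(dest_dir, file_name)
    src = archive.open(name)  # file-like object that decompresses as it is read
    try:
        with open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst, chunk_size)
    finally:
        src.close()
    return out_path


def extract_all_files(zip_path, dest_dir):
    """Extract every non-directory member of zip_path into dest_dir."""
    archive = zipfile.ZipFile(zip_path, "r")
    try:
        return [extract_member(archive, name, dest_dir)
                for name in archive.namelist()
                if not name.endswith("/")]
    finally:
        archive.close()
```

With this approach only one chunk is held in memory at a time, so the size of the largest member no longer determines how much memory the script needs.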


I have seen posts about 2GB limits in the zipfile module, as well as this bug report: http://bugs.python.org/issue1189216  Also, older zip formats have a 4GB limit.  However, I can't say for sure which of these, if any, is the problem.  Does anyone know if my code is wrong or if there is a problem with Python itself?
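
To narrow down which of those limits might be involved, a small check along the following lines could list the archive members whose sizes cross them. This is only a sketch: it assumes the OverflowError comes from a size that no longer fits in a signed 32-bit integer, which the traceback suggests but does not prove, and the names size_notes and report_large_members are made up for illustration.

```python
import zipfile

INT32_MAX = 2 ** 31 - 1  # largest signed 32-bit integer (just under 2 GB)
ZIP32_MAX = 2 ** 32 - 1  # largest size a 32-bit zip header field can hold (just under 4 GB)


def size_notes(size):
    """Return a list naming the limits that a byte count exceeds."""
    notes = []
    if size > INT32_MAX:
        notes.append("over 2 GB")
    if size > ZIP32_MAX:
        notes.append("over 4 GB, needs ZIP64")
    return notes


def report_large_members(zip_path):
    """Return (name, compressed size, uncompressed size, notes) for oversized members."""
    archive = zipfile.ZipFile(zip_path, "r")
    try:
        report = []
        for info in archive.infolist():
            # Check whichever of the two sizes is larger.
            notes = size_notes(max(info.compress_size, info.file_size))
            if notes:
                report.append((info.filename, info.compress_size,
                               info.file_size, notes))
        return report
    finally:
        archive.close()
```

Calling report_large_members("zip.zip") only reads the archive's directory, not the member data, so it should not need to load the large file itself.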
If Python has a bug in it, is there an alternative library that I can use?  It must be permissively licensed (BSD, MIT, public domain, Python license), not copyleft/*GPL.  Failing that, is there any similarly licensed code in another language (C++, Lisp, etc.) that I can use?

