Problem with zipfile and newlines
John Machin
sjmachin at lexicon.net
Mon Mar 10 17:37:18 EDT 2008
On Mar 10, 11:14 pm, Duncan Booth <duncan.bo... at invalid.invalid>
wrote:
> "Neil Crighton" <neilcrigh... at gmail.com> wrote:
> > I'm using the zipfile library to read a zip file in Windows, and it
> > seems to be adding too many newlines to extracted files. I've found
> > that for extracted text-encoded files, removing all instances of '\r'
> > in the extracted file seems to fix the problem, but I can't find an
> > easy solution for binary files.
>
> > The code I'm using is something like:
>
> > from zipfile import Zipfile
> > z = Zipfile(open('zippedfile.zip'))
> > extractedfile = z.read('filename_in_zippedfile')
>
> > I'm using Python version 2.5. Has anyone else had this problem
> > before, or know how to fix it?
>
> > Thanks,
>
> Zip files aren't text. Try opening the zipfile file in binary mode:
>
> open('zippedfile.zip', 'rb')
Good pickup, but that indicates that the OP may have *TWO* problems,
the first of which is not posting the code that was actually executed.
If the OP actually executed the code that he posted, it is highly
likely to have died in a hole long before it got to the z.read()
stage, e.g.
>>> import zipfile
>>> z = zipfile.ZipFile(open('foo.zip'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python25\lib\zipfile.py", line 346, in __init__
    self._GetContents()
  File "C:\python25\lib\zipfile.py", line 366, in _GetContents
    self._RealGetContents()
  File "C:\python25\lib\zipfile.py", line 404, in _RealGetContents
    centdir = struct.unpack(structCentralDir, centdir)
  File "C:\python25\lib\struct.py", line 87, in unpack
    return o.unpack(s)
struct.error: unpack requires a string argument of length 46
>>> z = zipfile.ZipFile(open('foo.zip', 'rb')) # OK
>>> z = zipfile.ZipFile('foo.zip', 'r') # OK
If it somehow made it through the open stage, it surely would have
blown up at the read stage, when trying to decompress a contained
file.
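
For anyone wanting to try the two working forms end to end, here is a minimal sketch. It assumes a reasonably modern Python (ZipFile used as a context manager, which 2.5 does not support), and the archive name, member name and payload are made up for illustration. It builds a small archive, reads the member back both ways, and then writes the extracted bytes out with 'wb', on the assumption that the extracted data is being saved to disk and should not go through newline translation either.

```python
import os
import tempfile
import zipfile

# Hypothetical payload with bytes that Windows text mode would mangle
# (CR LF pairs and a Ctrl-Z).
payload = b"line one\r\nline two\n\x1a\x00binary tail"

tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "foo.zip")
out_path = os.path.join(tmpdir, "member.bin")

# Build a small archive so the example is self-contained.
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("member.bin", payload)

# Form 1: pass the filename and let zipfile open it in binary mode.
with zipfile.ZipFile(zip_path, "r") as zf:
    by_name = zf.read("member.bin")

# Form 2: pass a file object that was explicitly opened with 'rb'.
with open(zip_path, "rb") as fh:
    with zipfile.ZipFile(fh) as zf:
        by_handle = zf.read("member.bin")

# Write the extracted bytes in binary mode too, so no '\r' gets added.
with open(out_path, "wb") as out:
    out.write(by_name)

# Read the saved file back in binary mode to compare with the original.
with open(out_path, "rb") as saved:
    round_trip = saved.read()

print(by_name == payload, by_handle == payload, round_trip == payload)
```

If all three comparisons come out true, the bytes survived unchanged; any text-mode open() in the chain on Windows is where I would look first for stray '\r' characters.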
Cheers,
John