zipfile stupidly broken

Nick Craig-Wood nick at craig-wood.com
Fri May 18 06:30:04 EDT 2007


Martin Maney <maney at two14.net> wrote:
>  To quote from zipfile.py (2.4 library):
> 
>      # Search the last END_BLOCK bytes of the file for the record signature.
>      # The comment is appended to the ZIP file and has a 16 bit length.
>      # So the comment may be up to 64K long.  We limit the search for the
>      # signature to a few Kbytes at the end of the file for efficiency.
>      # also, the signature must not appear in the comment.
>      END_BLOCK = min(filesize, 1024 * 4)
> 
>  So the author knows that there's a hard limit of 64K on the comment
>  size, but feels it's more important to fail a little more quickly when
>  fed something that's not a zipfile - or a perfectly legitimate zipfile
>  that doesn't observe his ad-hoc 4K limitation.  I don't have time to
>  find a gentler way to say it because I have to find a work around for
>  this arbitrary limit (1): this is stupid.

Searching the last 64k of every zip file would slow down opening all
of them, even though most zipfiles don't have comments.

The code in _EndRecData should probably read 1k first, and then retry
with 64k.
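
Something along these lines, as a rough sketch rather than a drop-in
patch. It is written for a current Python 3, not the 2.4 module, and
the names find_end_record and first_try are made up for illustration.
It only locates the record and does not parse it the way _EndRecData
does. The check that the comment length runs exactly to the end of the
file is my own addition, so that a signature inside the comment is not
mistaken for the record.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_SIZE = 22             # fixed part of the record, without the comment
MAX_COMMENT = 0xFFFF       # the comment length field is 16 bits


def find_end_record(fp, first_try=1024):
    """Return the offset of the end-of-central-directory record, or None.

    Hypothetical helper: tries a small tail first, then the largest
    tail that could still contain the record.
    """
    fp.seek(0, io.SEEK_END)
    filesize = fp.tell()
    for window in (first_try, EOCD_SIZE + MAX_COMMENT):
        size = min(filesize, window)
        fp.seek(filesize - size)
        tail = fp.read(size)
        pos = tail.rfind(EOCD_SIG)
        while pos >= 0:
            if pos + EOCD_SIZE <= size:
                (comment_len,) = struct.unpack_from("<H", tail, pos + 20)
                # Accept only a record whose comment ends exactly at EOF.
                if pos + EOCD_SIZE + comment_len == size:
                    return filesize - size + pos
            # Otherwise look for an earlier occurrence of the signature.
            pos = tail.rfind(EOCD_SIG, 0, pos)
        if size == filesize:
            break  # the whole file has already been searched
    return None


# Build an archive in memory with a comment longer than 4k.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.comment = b"x" * 5000
offset = find_end_record(buf)
```

The first pass misses the record here because the last 1k is all
comment; the second pass finds it. Files without a comment never pay
for the larger read.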

>  (1) the leading candidate is to copy and paste the whole frigging
>  zipfile module so I can patch it, but that's even uglier than it is
>  stupid.  "This battery is pining for the fjords!"

You don't need to do that; you can just "monkey patch" the _EndRecData
function in the zipfile module.

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


