[Python-checkins] cpython (3.4): Fix issue #14315: The zipfile module now ignores extra fields in the central

gregory.p.smith python-checkins at python.org
Fri May 30 08:43:01 CEST 2014


http://hg.python.org/cpython/rev/33843896ce4e
changeset:   90897:33843896ce4e
branch:      3.4
parent:      90894:baa7b5555656
user:        Gregory P. Smith <greg at krypto.org>
date:        Thu May 29 23:42:14 2014 -0700
summary:
  Fix issue #14315: The zipfile module now ignores extra fields in the central
directory that are too short to be parsed instead of letting a struct.unpack
error bubble up as this "bad data" appears in many real world zip files in the
wild and is ignored by other zip tools.

files:
  Lib/test/test_zipfile.py |  15 +++++++++++++++
  Lib/zipfile.py           |   2 +-
  Misc/NEWS                |   5 +++++
  3 files changed, 21 insertions(+), 1 deletions(-)


diff --git a/Lib/test/test_zipfile.py b/Lib/test/test_zipfile.py
--- a/Lib/test/test_zipfile.py
+++ b/Lib/test/test_zipfile.py
@@ -1290,6 +1290,21 @@
         self.assertRaises(ValueError,
                           zipfile.ZipInfo, 'seventies', (1979, 1, 1, 0, 0, 0))
 
+    def test_zipfile_with_short_extra_field(self):
+        """If an extra field in the header is less than 4 bytes, skip it."""
+        zipdata = (
+            b'PK\x03\x04\x14\x00\x00\x00\x00\x00\x93\x9b\xad@\x8b\x9e'
+            b'\xd9\xd3\x01\x00\x00\x00\x01\x00\x00\x00\x03\x00\x03\x00ab'
+            b'c\x00\x00\x00APK\x01\x02\x14\x03\x14\x00\x00\x00\x00'
+            b'\x00\x93\x9b\xad@\x8b\x9e\xd9\xd3\x01\x00\x00\x00\x01\x00\x00'
+            b'\x00\x03\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa4\x81\x00'
+            b'\x00\x00\x00abc\x00\x00PK\x05\x06\x00\x00\x00\x00'
+            b'\x01\x00\x01\x003\x00\x00\x00%\x00\x00\x00\x00\x00'
+        )
+        with zipfile.ZipFile(io.BytesIO(zipdata), 'r') as zipf:
+            # testzip returns the name of the first corrupt file, or None
+            self.assertIsNone(zipf.testzip())
+
     def tearDown(self):
         unlink(TESTFN)
         unlink(TESTFN2)
diff --git a/Lib/zipfile.py b/Lib/zipfile.py
--- a/Lib/zipfile.py
+++ b/Lib/zipfile.py
@@ -411,7 +411,7 @@
         # Try to decode the extra field.
         extra = self.extra
         unpack = struct.unpack
-        while extra:
+        while len(extra) >= 4:
             tp, ln = unpack('<HH', extra[:4])
             if tp == 1:
                 if ln >= 24:
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -18,6 +18,11 @@
 Library
 -------
 
+- Issue #14315: The zipfile module now ignores extra fields in the central
+  directory that are too short to be parsed instead of letting a struct.unpack
+  error bubble up as this "bad data" appears in many real world zip files in
+  the wild and is ignored by other zip tools.
+
 - Issue #21402: tkinter.ttk now works when default root window is not set.
 
 - Issue #10203: sqlite3.Row now truly supports sequence protocol.  In particulr

-- 
Repository URL: http://hg.python.org/cpython


More information about the Python-checkins mailing list