Script to make Windows XP-readable ZIP file

pac pac at fernside.com
Thu May 18 22:22:59 EDT 2006


I'm preparing to distribute a Windows XP Python program and some
ancillary files,
and I wanted to put everything in a .ZIP archive.  It proved to be
inordinately
difficult and I thought I would post my solution here.   Is there a
better one?

Suppose you have a set of files in a directory c:\a\b and some
additional
files in c:\a\b\subdir.  Using a Python script, you would
like to make a Windows-readable archive (.zip) that preserves this
directory structure, and where the root directory of the archive is
c:\a\b.  In other words, all the files from c:\a\b appear in the
archive
without a path prefix, and all the files in c:\a\b\subdir have a path
prefix of \subdir.

This looks like it should be easy with module zipfile and the handy
function os.walk.  Create a zip file, call os.walk, and add the files
to
the archive like so:

import os
import zipfile

z =
zipfile.ZipFile(r"c:\a\b\myzip.zip",mode="w",compression=zipfile.ZIP_DEFLATED)

for dirpath,dirs,files in os.walk(r"c:\a\b"):
    for a_file in files:
        a_path = os.path.join(dirpath,a_file)
	z.write(a_path)  # Change, see below
z.close()

This creates an archive that can be read by WinZip or by another Python
script
that uses zipfile.  But when you try to view it with the Windows
compressed folder
viewer it will appear empty.  If you try to extract the files anyway
(because
you know they are really there), you get a Windows Security Warning and
XP
refuses to decompress the folder - XP is apparently afraid it might be
bird flu
or something.

If you change the line marked #Change to "z.write(a_path,file)",
explicitly naming
each file, now the compressed folder viewer will show all the files in
the archive.
XP will not treat it like a virus and it will extract the files.
However, the
archive does not contain a subdirectory; all the files are in a single
directory.

Some experimentation suggests that Windows does not like any filename
in the
archive that begins with either a drive designator like c:, or has a
path containing
a leading slash like "\a\b\afile.txt".  Relative paths like
"subdir\afile.txt" are
okay, and cause the desired behavior when the archive is extracted,
e.g., a new directory subdir is created and afile.txt is placed in it.

Since the method ZipFile.write needs a valid pathname for each file,
the correct
solution to the original problem entails messing around with the OS's
current
working directory.  Position the CWD in the desired base directory of
the archive,
add the files to the archive using their relative pathnames, and put
the CWD back
where it was when you started:

import os
import zipfile

z =
zipfile.ZipFile(r"c:\a\b\myzip.zip",mode="w",compression=zipfile.ZIP_DEFLATED)

cwd = os.getcwd()
os.chdir(base_dir)
try:
    for dirpath,dirs,files in os.walk(''):   # This starts the walk at
the CWD
        for a_file in files:
            a_path = os.path.join(dirpath,a_file)
            z.write(a_path,a_path)   # Can the second argument be
omitted?
    z.close()
finally:
    os.chdir(cwd)

This produces an archive that can be extracted by Windows XP using its
built-in
capability, by WinZip, or by another Python script.  Now that I have
the solution it
seems to make sense, but it wasn't at all obvious when I started.

Paul Cornelius




More information about the Python-list mailing list