[Tutor] Unzipping a Zip of folders that have zips within them that I'd like to unzip all at once.

Peter Otten __peter__ at web.de
Wed Sep 19 12:58:11 CEST 2012


Gregory Lund wrote:

> I teach at a college and I'm trying to use Python (2.6 because I'm
> running my tool in ArcGIS) to unzip a .zip file that contains one
> folder full of student folders, each with 1 or more submissions
> (zipfiles) that represent student submissions to weekly lab
> assignments.

Your lack of response in the previous thread

http://mail.python.org/pipermail/tutor/2012-August/090742.html

is not a big motivation to answer this one.
 
> It all starts with an originalzip.zip (for example) that has a single
> folder (originalfolder)
> Within the 'originalfolder' folder there are anywhere from 1 - 40
> folders (that are not zipped). (these are the students userid folders)
> Within each of the (1-40) student userid folders is anywhere from 1-10
> zipfiles and perhaps a .pdf or .docx (if for example they submitted
> more than one revision of the assignment, there are more than 1)
> 
> Folder Structure
> 
> originalzip.zip
> --originalfolder
>   --folder1 (w/ two zip folders)
>     --internalzip1_a.zip
>     --internalfolder1_a
>       --files
>       --folders
>     --internalzip1_b.zip
>     --internalfolder1_b
>       --files
>       --folders
>   --folder2 (w/1 zip folder)
>     --internalzip2.zip
>     --internalfolder2
>       --files
>       --folders
>   --etc....
> 
> My goal is to:
> a) Unzip the 'originalzip.zip'
> b) go to the 'originalfolder' (the unzipped version of the
> originalzip.zip)
> c) go into the first folder (folder1) in the original
> folder and unzip any and all zipfiles within it
> d) go to the second folder (folder2) in the original folder and unzip
> any and all zipfiles within it
> e) continue until all folders within originalfolders have been checked
> for internalzips
> 
> 
> ### Note, I am a beginner both with this tutor environment and in python.
> I apologize in advance if my code below is 'not up to par' but I am
> trying to keep it simple in nature and use ample comments to keep
> track of what I am doing. I also don't know if I should post sample
> data (zipfile of a folder of folders with zipfiles), and if so, where?
> 
> I have some code that works to extract the 'originalzip.zip', to an
> 'originalfolder' but it won't go to the folders (folder1, folder2,
> etc.) to unzip the zipfiles within them.
> It gets hung up on access to the first student folder and won't unzip it.

Hm, I would have expected an exception. Perhaps you should omit the ArcGIS
integration until everything else works.
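
One way to do that, as a rough sketch: read the zip file's path from the
command line instead of calling arcpy.GetParameterAsText(0), so the script
runs in a plain Python interpreter and any traceback shows up in the console.
The names get_zip_path and unzip_labs.py are made up for illustration.

```python
import sys

def get_zip_path(argv):
    """Return the zip file path given as the only command-line argument."""
    if len(argv) != 2:
        # unzip_labs.py is a hypothetical script name
        raise SystemExit("usage: python unzip_labs.py originalzip.zip")
    return argv[1]

# In the script itself one would then write:
#     in_zip = get_zip_path(sys.argv)
```

Once the unzipping works this way, the arcpy call can be put back in.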

> I think it's a simple fix, but I've been messing with it for quite a
> while and can't figure it out.
> 
> Code below:
> 
> #1 Required imports.

Excessive comments impair readability. Comments stating the obvious are 
particularly bad.

> import os, os.path, zipfile, arcpy
> 
> #2 I'm Utilizing 'GetParameterAsText' so that this code can be run as
> # a tool in ArcGIS
> 
> #2.1 in_zip is a variable for "What zipfile (LAB) do you want to extract?"
> in_Zip = arcpy.GetParameterAsText(0)
> cZ = in_Zip

Why two names for one value?
 
> #2.2 outDir is a variable for "What is your output Directory?"
> outDir = os.getcwd()
> 
> #3 Extracting the initial zipfolder:
> #3.1 Opening the original zipfile
> z = zipfile.ZipFile(cZ)
> 
> #4 Extracting the cZ (original zipfile)into the output directory.
> z.extractall(outDir)
> 
> #5 Getting a list of contents of the original zipfile
> zipContents = z.namelist()
> 
> #6 Unzipping the Inner Zips:
> 
> #6.1 Looping through the items that were in the original zipfile, and
> # now in a folder...
> #   ...For each item in the zipContents....
> for item in zipContents:

You make no attempt to filter out the contents that are not zipfiles. That 
will cause an exception further down where you try to unzip.
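
A minimal sketch of such a filter, assuming that the inner archives can be
recognised by their ".zip" extension. The helper name only_zip_names and the
sample entries are invented for illustration; note that namelist() also
returns directories and files like PDFs.

```python
def only_zip_names(names):
    """Return the entries whose names end in .zip (case-insensitive)."""
    return [name for name in names if name.lower().endswith(".zip")]

# Made-up sample of what namelist() might return for the outer archive
zip_contents = [
    "originalfolder/",
    "originalfolder/folder1/internalzip1_a.zip",
    "originalfolder/folder1/report.pdf",
    "originalfolder/folder2/INTERNALZIP2.ZIP",
]
inner_zips = only_zip_names(zip_contents)
```

If the extension cannot be trusted, zipfile.is_zipfile(path) checks the
extracted file's content rather than its name.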
 
>     #6.2 Get the location (note the location is the 'outDir' plus what
>     #namelist() gave me)
>     #(have never used 'os.sep', had to look it up, when someone suggested
>     #it)
>     itemLoc = outDir + os.sep + item

The standard way (which is also more robust) is to use os.path.join():

      itemLoc = os.path.join(outDir, item)
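
A small illustration of one case where the two approaches differ, assuming
the directory name already ends with a separator (the names "labs" and
"originalfolder" are just examples):

```python
import os

out_dir = "labs" + os.sep   # a directory name that already ends with a separator
item = "originalfolder"

concatenated = out_dir + os.sep + item   # ends up with a doubled separator
joined = os.path.join(out_dir, item)     # join() adds a separator only where needed
```

The doubled separator is usually tolerated by the operating system, but the
joined path is the cleaner one to print, compare or store.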
 
>     #6.3 Opens the first (2nd, 3rd, etc files) of the internal zip file
>     #(*.zip)
>     z = zipfile.ZipFile(itemLoc)
> 
>     #6.4 Extract all files in *.zip in the same folder as the zipfile
>     z.extractall(os.path.split(itemLoc)[0])

The zip file's contents will probably end up in the same folder as the
zipfile. Is that what you want?
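
If it is not, one alternative is to give every archive a folder of its own.
A sketch, assuming the folder should sit next to the archive and be named
after it; extract_to_own_folder is a hypothetical helper, and the explicit
close() is there because ZipFile cannot be used in a with statement in
Python 2.6.

```python
import os
import zipfile

def extract_to_own_folder(zip_path):
    """Extract zip_path into a sibling folder named after the archive."""
    target = os.path.splitext(zip_path)[0]   # "x/lab.zip" -> "x/lab"
    z = zipfile.ZipFile(zip_path)
    try:
        # missing folders are created for the extracted members
        z.extractall(target)
    finally:
        z.close()
    return target
```

With this, internalzip1_a.zip and internalzip1_b.zip in the same student
folder cannot overwrite each other's files.
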
> 
>     #6.5 determining the list of items in each students' folder
>     student_lab = z.namelist()

Unused variable alert.

> 
> #7 THE END.
> 
> Thank you for any and all suggestions/ fixes, constructive criticism
> and assistance with my beginner code!

Have you considered the simpler code I gave in 

http://mail.python.org/pipermail/tutor/2012-August/090743.html

before plodding on?
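
For readers without that post at hand, here is a minimal sketch of the
overall approach. It is not necessarily the code from the linked message;
the function names extract and unzip_all are invented here. It assumes that
out_dir does not contain the outer zip itself and that zips nested inside
the inner zips need not be unpacked.

```python
import os
import zipfile

def extract(zip_path, target_dir):
    """Extract one archive into target_dir (works without a with statement)."""
    z = zipfile.ZipFile(zip_path)
    try:
        z.extractall(target_dir)
    finally:
        z.close()

def unzip_all(outer_zip, out_dir):
    """Extract outer_zip into out_dir, then every *.zip found below out_dir.

    Returns the paths of the inner zip files that were extracted.
    """
    extract(outer_zip, out_dir)
    extracted = []
    for folder, _subfolders, filenames in os.walk(out_dir):
        for filename in filenames:
            if filename.lower().endswith(".zip"):
                inner = os.path.join(folder, filename)
                # put the contents next to the inner zip file
                extract(inner, folder)
                extracted.append(inner)
    return extracted
```

os.walk() visits every student folder, so no explicit bookkeeping of
folder1, folder2, ... is necessary, and files that are not zips are skipped.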


