[Tutor] Unzipping a Zip of folders that have zips within them that I'd like to unzip all at once.
Peter Otten
__peter__ at web.de
Wed Sep 19 12:58:11 CEST 2012
Gregory Lund wrote:
> I teach at a college and I'm trying to use Python (2.6 because I'm
> running my tool in ArcGIS) to unzip a .zip file that contains one
> folder full of student folders, each with 1 or more submissions
> (zipfiles) that represent student submissions to weekly lab
> assignments.
Your lack of response in the previous thread
http://mail.python.org/pipermail/tutor/2012-August/090742.html
is not a big motivation to answer this one.
> It all starts with an originalzip.zip (for example) that has a single
> folder (originalfolder)
> Within the 'originalfolder' folder there are anywhere from 1 - 40
> folders (that are not zipped). (these are the students userid folders)
> Within each of the (1-40) student userid folders is anywhere from 1-10
> zipfiles and perhaps a .pdf or .docx (if for example they submitted
> more than one revision of the assignment, there are more than 1)
>
> Folder Structure
>
> originalzip.zip
>   --originalfolder
>     --folder1 (w/ two zip folders)
>       --internalzip1_a.zip
>         --internalfolder1_a
>           --files
>           --folders
>       --internalzip1_b.zip
>         --internalfolder1_b
>           --files
>           --folders
>     --folder2 (w/1 zip folder)
>       --internalzip2.zip
>         --internalfolder2
>           --files
>           --folders
>     --etc....
>
> My goal is to:
> a) Unzip the 'originalzip.zip'
> b) go to the 'originalfolder' (the unzipped version of the
> originalzip.zip)
> c) go into the first folder (folder1) in the original folder and
> unzip any and all zipfiles within it
> d) go to the second folder (folder2) in the original folder and unzip
> any and all zipfiles within it
> e) continue until all folders within originalfolders have been checked
> for internalzips
>
>
> ### Note, I am a beginner both with this tutor environment and in python.
> I apologize in advance if my code below is 'not up to par' but I am
> trying to keep it simple in nature and use ample comments to keep
> track of what I am doing. I also don't know if I should post sample
> data (zipfile of a folder of folders with zipfiles), and if so, where?
>
> I have some code that works to extract the 'originalzip.zip', to an
> 'originalfolder' but it won't go to the folders (folder1, folder2,
> etc.) to unzip the zipfiles within them.
> It gets hung up on access to the first student folder and won't unzip it.
Hm, I would have expected an exception. Perhaps you should omit the ArcGIS
integration until everything else works.
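One way to take ArcGIS out of the picture while debugging is to read the
zipfile path from the command line instead of arcpy.GetParameterAsText(0).
A minimal sketch; the function name zip_path_from_argv is made up for this
example:

```python
import sys


def zip_path_from_argv(argv):
    """Return the zip file path given as the only command-line argument."""
    if len(argv) != 2:
        # Exit with a usage message instead of failing later on a bad path.
        raise SystemExit("usage: %s ZIPFILE" % argv[0])
    return argv[1]
```

In the script you would then write in_Zip = zip_path_from_argv(sys.argv) and
run it from a plain console, where any traceback is printed in full.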
> I think it's a simple fix, but I've been messing with it for quite a
> while and can't figure it out.
>
> Code below:
>
> #1 Required imports.
Excessive comments impair readability. Comments stating the obvious are
particularly bad.
> import os, os.path, zipfile, arcpy
>
> #2 I'm Utilizing 'GetParameterAsText' so that this code can be run as
> a tool in ArcGIS
>
> #2.1 in_zip is a variable for "What zipfile (LAB) do you want to extract?"
> in_Zip = arcpy.GetParameterAsText(0)
> cZ = in_Zip
Why two names for one value?
> #2.2 outDir is a variable for "What is your output Directory?"
> outDir = os.getcwd()
>
> #3 Extracting the initial zipfolder:
> #3.1 Opening the original zipfile
> z = zipfile.ZipFile(cZ)
>
> #4 Extracting the cZ (original zipfile)into the output directory.
> z.extractall(outDir)
>
> #5 Getting a list of contents of the original zipfile
> zipContents = z.namelist()
>
> #6 Unzipping the Inner Zips:
>
> #6.1 Looping through the items that were in the original zipfile, and
> now in a folder...
> # ...For each item in the zipContents....
> for item in zipContents:
You make no attempt to filter out the contents that are not zipfiles. That
will cause an exception further down where you try to unzip.
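A filter along the following lines would skip the folder entries and the
.pdf/.docx files. This is only a sketch: the function name inner_zip_paths is
invented here, and it assumes the outer archive has already been extracted
into out_dir.

```python
import os
import zipfile


def inner_zip_paths(names, out_dir):
    """Return paths of extracted members that look like real zip archives."""
    paths = []
    for name in names:
        path = os.path.join(out_dir, name)
        # Check the extension first: .docx files are zip containers too,
        # so zipfile.is_zipfile() alone would let them through.
        if name.lower().endswith(".zip") and zipfile.is_zipfile(path):
            paths.append(path)
    return paths
```

Called as inner_zip_paths(z.namelist(), outDir), it returns only the paths
that are safe to hand to zipfile.ZipFile().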
>     #6.2 Get the location (note the location is the 'outDir' plus what
>     #namelist() gave me)
>     #(have never used 'os.sep', had to look it up, when someone suggested
>     #it)
>     itemLoc = outDir + os.sep + item
The standard way (which is also more robust) is to use os.path.join():
    itemLoc = os.path.join(outDir, item)
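To illustrate what "more robust" means here: os.path.join() only inserts a
separator where one is missing, so it does not matter whether the directory
already ends with one. The directory name "extracted" below is a placeholder,
not something from your setup.

```python
import os

out_dir = "extracted"  # hypothetical output directory
item = "originalfolder/folder1/internalzip1_a.zip"  # as namelist() reports it

# join() adds a separator only where one is needed, so a trailing
# separator on out_dir does not produce a doubled one.
item_loc = os.path.normpath(os.path.join(out_dir, item))
same_loc = os.path.normpath(os.path.join(out_dir + os.sep, item))
```

Both names end up referring to the same path. The normpath() call is
optional; it just rewrites the forward slashes from namelist() into the
platform's own separator.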
>     #6.3 Opens the first (2nd, 3rd, etc files) of the internal zip file
>     #(*.zip)
>     z = zipfile.ZipFile(itemLoc)
>
>     #6.4 Extract all files in *.zip in the same folder as the zipfile
>     z.extractall(os.path.split(itemLoc)[0])
The zip file's contents will probably end up in the same folder as the
zipfile. Is that what you want?
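If it is not, one alternative is to give every inner zipfile a folder of its
own, named after the archive. A sketch only; extract_to_own_folder is a
made-up name, and contextlib.closing() is used because ZipFile cannot be used
directly in a with statement on Python 2.6.

```python
import contextlib
import os
import zipfile


def extract_to_own_folder(zip_path):
    """Extract zip_path into a sibling folder named after the archive."""
    # "folder1/internalzip1_a.zip" is extracted into "folder1/internalzip1_a".
    target = os.path.splitext(zip_path)[0]
    with contextlib.closing(zipfile.ZipFile(zip_path)) as archive:
        archive.extractall(target)
    return target
```

That keeps two revisions submitted by the same student from overwriting each
other's files.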
>
>     #6.5 determining the list of items in each students' folder
>     student_lab = z.namelist()
Unused variable alert.
>
> #7 THE END.
>
> Thank you for any and all suggestions/ fixes, constructive criticism
> and assistance with my beginner code!
Have you considered the simpler code I gave in
http://mail.python.org/pipermail/tutor/2012-August/090743.html
before plodding on?
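For reference, steps a) to e) can be sketched without ArcGIS roughly as
follows. This is not the code from the linked post, just one possible
approach; the function names are invented here, and out_dir is assumed to be
a directory that holds nothing but the extracted lab.

```python
import contextlib
import os
import zipfile


def unzip(zip_path, target_dir):
    """Extract one archive into target_dir (works on Python 2.6 as well)."""
    with contextlib.closing(zipfile.ZipFile(zip_path)) as archive:
        archive.extractall(target_dir)


def unzip_nested(outer_zip, out_dir):
    """Extract outer_zip into out_dir, then every .zip found beneath it."""
    unzip(outer_zip, out_dir)
    outer = os.path.abspath(outer_zip)
    # Collect the paths first so that archives extracted by the loop
    # below are not picked up and extracted in turn.
    inner = []
    for folder, _subdirs, files in os.walk(out_dir):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.abspath(path) == outer:
                continue  # the outer archive may itself live in out_dir
            if name.lower().endswith(".zip") and zipfile.is_zipfile(path):
                inner.append(path)
    for path in inner:
        # Same destination as in the original code: next to the zipfile.
        unzip(path, os.path.dirname(path))
    return inner
```

os.walk() visits every student folder, so there is no need to build the
paths from namelist() by hand.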