Replacing files in a zip archive

Scott David Daniels Scott.Daniels at Acm.Org
Thu Apr 30 20:27:45 EDT 2009


Дамјан Георгиевски wrote:
>> I'm writing a script that should modify ODF files. ODF files are just
>> .zip archives with some .xml files, images etc.
>>
>> So far I open the zip file and play with the xml with lxml.etree, but
>> I can't replace the files in it.
>>
>> Is there some recipe that does this ?
> 
> I ended up writing this pretty specific subclass of ZipFile....

Careful, you might get surprised.  Suppose you have this archive:
     import zipfile
     z = zipfile.ZipFile('bumble.zip', 'w')
     z.writestr('one.xml', '<text>Frankly, my dear,</text>')
     z.writestr('two.xml', "<text>I don't give a damn.</text>")
     z.writestr('one.xml', '<text>Frankly, Scarlett, </text>')
     z.close()

Note what you get if, after executing the above, you execute:
     rz = zipfile.ZipFile('bumble.zip', 'r')
     print rz.read('one.xml'), rz.read('two.xml')
     rz.close()
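
For anyone who wants to try this without touching the disk, here is a
minimal sketch of the same experiment for Python 3.  The in-memory
buffer and the warning filter are assumptions made for illustration
and are not part of the example above.  It shows that namelist()
reports both copies of 'one.xml', while .read returns the last one:

```python
import io
import warnings
import zipfile

buf = io.BytesIO()  # stands in for 'bumble.zip'
with warnings.catch_warnings():
    # zipfile emits a UserWarning when a name is written twice.
    warnings.simplefilter('ignore')
    with zipfile.ZipFile(buf, 'w') as z:
        z.writestr('one.xml', '<text>Frankly, my dear,</text>')
        z.writestr('two.xml', "<text>I don't give a damn.</text>")
        z.writestr('one.xml', '<text>Frankly, Scarlett, </text>')

with zipfile.ZipFile(buf, 'r') as rz:
    names = rz.namelist()
    one = rz.read('one.xml')

print(names)  # ['one.xml', 'two.xml', 'one.xml']
print(one)    # b'<text>Frankly, Scarlett, </text>'
```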

If you use your code to replace 'one.xml', because of the .pop
you'll wind up with the equivalent of:
     nz = zipfile.ZipFile('other.zip', 'w')
     nz.writestr('one.xml', '<text>new tree output</text>')
     nz.writestr('two.xml', "<text>I don't give a damn.</text>")
     nz.writestr('one.xml', '<text>Frankly, Scarlett, </text>')
     nz.close()

This will produce the same output as the original, because the old
trailing copy of 'one.xml' is still the last entry, confounding
your user.  You could just write the new values out, since .read
picks the last entry (as I believe it should).  Alternatively, if
you want to replace it "in place", you'll need a bit more smarts
when there is more than one copy of a file in the archive (when
z.namelist().count(filename) > 1).
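
One possible way to handle the duplicate case is to copy the archive
and keep a single entry per name.  The sketch below is for Python 3
and is only an illustration under stated assumptions: the helper name
replace_members and its arguments are hypothetical, it is not the
subclass discussed above, and it does not preserve timestamps or
other per-entry metadata beyond the compression method.

```python
import io
import warnings
import zipfile


def replace_members(src, dst, replacements):
    """Copy archive src to dst with exactly one entry per name.

    replacements maps member names to new contents (bytes or str).
    (Hypothetical helper, for illustration only.)
    """
    pending = dict(replacements)
    with zipfile.ZipFile(src, 'r') as zin, zipfile.ZipFile(dst, 'w') as zout:
        # dict.fromkeys keeps first-seen order and drops repeated names.
        for name in dict.fromkeys(zin.namelist()):
            if name in pending:
                data = pending.pop(name)
            else:
                data = zin.read(name)  # .read returns the last copy
            # Keep each member's original compression method.
            zout.writestr(name, data,
                          compress_type=zin.getinfo(name).compress_type)
        # Names that were not in the source are appended as new members.
        for name, data in pending.items():
            zout.writestr(name, data)


# Usage: rebuild the 'bumble.zip' example in memory, then replace 'one.xml'.
original = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # zipfile warns about duplicate names
    with zipfile.ZipFile(original, 'w') as z:
        z.writestr('one.xml', '<text>Frankly, my dear,</text>')
        z.writestr('two.xml', "<text>I don't give a damn.</text>")
        z.writestr('one.xml', '<text>Frankly, Scarlett, </text>')

updated = io.BytesIO()
replace_members(original, updated, {'one.xml': '<text>new tree output</text>'})

with zipfile.ZipFile(updated, 'r') as nz:
    new_names = nz.namelist()
    new_one = nz.read('one.xml')
    new_two = nz.read('two.xml')
print(new_names, new_one, new_two)
```

Keeping the first-seen order matters for ODF in particular, where the
'mimetype' entry is expected to stay first in the archive.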

--Scott David Daniels
Scott.Daniels at Acm.Org


