Why does StringIO discard its initial value?

David Fraser davidf at sjsoft.com
Fri Apr 15 11:53:18 EDT 2005


Raymond Hettinger wrote:
> [David Fraser]
> 
>>Others may find this helpful; it's a pure Python wrapper for cStringIO
>>that makes it behave like StringIO in not having initialized objects
>>readonly. Would it be an idea to extend cStringIO like this in the
>>standard library? It shouldn't lose performance if used like a standard
>>cStringIO, but it prevents frustration :-)
> 
> 
> IMO, that would be a step backwards.  Initializing the object and then
> writing to it is not a good practice.  The cStringIO API needs to be as
> file-like as possible.  With files, we create an empty object and then
> start writing (the append mode for existing files is a different story).
> Good code ought to maintain that parallelism so that it is easier to
> substitute a real file for a writeable cStringIO object.
> 
> This whole thread (except for the documentation issue which has been
> fixed) is about fighting the API rather than letting it be a guide to good
> code.
> 
> If there were something wrong with the API, Guido would have long
> since fired up the time machine and changed the timeline so that all
> would be as right as rain ;-)

But surely the whole point of files is that you can do more than create 
a new file or append to an existing one? You can also seek to an 
arbitrary position and write there.
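
To illustrate the point about seek and write, here is a minimal sketch 
of overwriting part of an existing real file in place. The temporary 
file and its contents are made up for the example.

```python
import os
import tempfile

# Create a throwaway file with some initial contents (illustrative data).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello world")

# "r+b" opens an existing file for reading and writing without truncating it.
with open(path, "r+b") as f:
    f.seek(6)            # move to the start of "world"
    f.write(b"WORLD")    # overwrite five bytes in place
    f.seek(0)
    result = f.read()

os.remove(path)
print(result)
```

This is neither "create a new file" nor "append": the existing bytes are 
kept and only the targeted range changes.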

The reason I wrote this was to enable manipulating zip files inside zip 
files, in memory. This is on translate.sourceforge.net - I wanted to 
manipulate Mozilla XPI files, and replace file contents etc. within the 
XPI. The XPI files are zip format that contains jars inside (also zip 
format). I needed to alter the contents of files within the inner zip files.
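
As a rough sketch of the dual-layer layout described above, the 
following builds a zip inside a zip entirely in memory and reads a file 
from the inner archive. It assumes a modern Python, where io.BytesIO 
takes the place of cStringIO; the member names and the make_zip helper 
are hypothetical, not taken from the actual XPI code.

```python
import io
import zipfile

def make_zip(members):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()

# Hypothetical layout: an outer "XPI" holding one inner jar.
inner_jar = make_zip({"locale/en-US/hello.dtd": b"original text"})
xpi_bytes = make_zip({"chrome/en-US.jar": inner_jar})

# Open the outer archive, then open the inner archive from the bytes
# that were read out of it. Nothing touches the disk.
with zipfile.ZipFile(io.BytesIO(xpi_bytes)) as outer:
    jar_bytes = outer.read("chrome/en-US.jar")
with zipfile.ZipFile(io.BytesIO(jar_bytes)) as inner:
    inner_names = inner.namelist()
    inner_text = inner.read("locale/en-US/hello.dtd")

print(inner_names, inner_text)
```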

Python's zipfile classes can add files to an archive but not replace 
existing ones. cStringIO, as described above, is read-only once it has 
been initialized with a value.

So I created extensions to the zipfile.ZipFile class that allow it to 
delete existing files, and add them again with new contents (thus 
replacing them).
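
The ZipFile extensions themselves are not shown in this message. As a 
stand-in, here is a sketch of a different and simpler technique: 
replacing a member by copying every other member into a fresh in-memory 
archive. The function name replace_member is hypothetical, and the 
sketch assumes a modern Python with io.BytesIO.

```python
import io
import zipfile

def replace_member(zip_bytes, name, new_data):
    """Return new archive bytes in which member `name` holds `new_data`.

    This rewrites the whole archive rather than deleting in place.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
         zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename == name:
                continue  # skip the old copy of the member being replaced
            dst.writestr(info, src.read(info.filename))
        dst.writestr(name, new_data)
    return out.getvalue()

# Usage with made-up member names.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"old")
    zf.writestr("b.txt", b"keep")

updated = replace_member(buf.getvalue(), "a.txt", b"new")
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    names = sorted(zf.namelist())
    a_data = zf.read("a.txt")
    b_data = zf.read("b.txt")
```

Rewriting costs a full copy of the archive, which is the price of not 
touching the existing central directory.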

And I created wStringIO so that I could do this all in place on the 
existing zip files.

This all required some extra hacking because of the dual-layer zip files.

But as far as I can see, all this would have been really tricky using 
the existing zipfile and cStringIO classes, which both assume 
(conceptually) that files are either read-only, newly created, or (for 
zipfile) merely appendable.

The problem for me was not that cStringIO classes are too similar to 
files, it was that they are too dissimilar. All of this would work with 
either StringIO (but that is too slow) or real files (but I needed it in 
memory because the zip files are inside other zip files).
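
For comparison, here is a small sketch of the file-like behaviour being 
asked for: an in-memory buffer that starts with a value and can still be 
overwritten and extended. cStringIO exists only in Python 2, so this 
uses io.BytesIO from Python 3 as an assumed stand-in; it is not the 
wStringIO wrapper itself.

```python
import io

buf = io.BytesIO(b"hello world")   # initial value; position starts at 0
buf.seek(6)
buf.write(b"WORLD")                # overwrite in place, like a file opened "r+b"
buf.seek(0, io.SEEK_END)
buf.write(b"!")                    # then append at the end
value = buf.getvalue()
print(value)
```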

Am I missing something?

David


