[Python-Dev] Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

Guido van Rossum guido at python.org
Mon Feb 23 20:47:28 CET 2015


So is the PEP ready for pronouncement or should there be more discussion?
Also, do you have a BDFL-delegate or do you want me to review it?

On Mon, Feb 23, 2015 at 11:41 AM, Thomas Wouters <thomas at python.org> wrote:

>
>
> On Mon, Feb 23, 2015 at 8:22 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>
>> On 02/23/2015 11:01 AM, Daniel Holth wrote:
>> > On Mon, Feb 23, 2015 at 1:49 PM, Paul Moore wrote:
>> >> On 23 February 2015 at 18:40, Brett Cannon wrote:
>> >>>
>> >>> Couldn't you just keep it in memory as bytes and then write directly over
>> >>> the file? I realize that's a bit wasteful memory-wise but it is possible.
>> >>> The docs could mention the memory cost is something to watch out for when
>> >>> doing an in-place replacement. Heck, the code could even make it an
>> >>> io.BytesIO instance so the rest of the code doesn't have to care about
>> >>> this special case.
>> >>
>> >> I did consider this option, and I still quite like it. In fact,
>> >> originally I wrote the API to *only* be in-place, until I realised
>> >> that wouldn't work for things bigger than memory (but who has a Python
>> >> app that's bigger than RAM?)
>> >>
>> >> I'm happy to modify the API along these lines (details to be thrashed
>> >> out) if people think it's worthwhile.
>> >
>> > Sounds reasonable. It could be done by just reading the entire file
>> > contents after the shebang and re-writing them with the necessary
>> > offset all in RAM, truncating the file if necessary, without involving
>> > the zipfile module very much; the shebang could have some amount of
>> > padding by default; the file could just be re-compressed in memory
>> > depending on your appetite for complexity.
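The in-memory rewrite described above could look roughly like the sketch
below. The names replace_shebang and rewrite_in_place are hypothetical and
are not part of the PEP or of any stdlib API. The sketch does not touch any
offsets stored inside the archive; it assumes ZIP readers locate the central
directory from the end of the file and tolerate leading data.

```python
import io
import zipfile


def replace_shebang(data: bytes, interpreter: str) -> bytes:
    """Return data with its leading shebang line replaced, or added if absent."""
    if data.startswith(b"#!"):
        # Drop everything up to and including the first newline.
        _, _, data = data.partition(b"\n")
    return b"#!" + interpreter.encode("utf-8") + b"\n" + data


def rewrite_in_place(path: str, interpreter: str) -> None:
    """Read the whole file into RAM, then write it back with a new shebang."""
    with open(path, "rb") as f:
        data = f.read()
    new_data = replace_shebang(data, interpreter)
    # Opening with "wb" truncates the file, so a shorter shebang
    # leaves no stale bytes behind.
    with open(path, "wb") as f:
        f.write(new_data)
```

Note that a crash between the truncation and the end of the write would
leave a damaged file, which is one reason to document the trade-off of an
in-place replacement.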
>>
>> This could be a completely stupid question, but how does the zip file
>> know where the individual files are?  More to the point, does the index
>> work via relative or absolute offset?  If absolute, wouldn't the index
>> have to be rewritten if the zip portion of the file moves?
>>
>
> Yes and no. The ZIP format uses a 'central directory', which is a record of
> each file in the archive. The offsets are relative (although the
> specification is a little vague on what they're relative *to* when using a
> .zip file; the wording talks about disk numbers, ZIP being from the era of
> floppy disks). You find the central directory by searching from the end (or
> reading a specific spot at the end, if you don't support archive comments;
> zipimport, for example, doesn't), and it turns out you can find the central
> directory from just that information (and as far as I know, all tools do).
> However, there are still some offsets that would change if you add stuff to
> the front of the ZIP file (or remove it), and some zip tools will complain
> (usually just in verbose mode, though).
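To make the lookup described above concrete, here is a minimal sketch that
finds the end-of-central-directory record by searching backwards, then
compares the directory offset stored in that record with where the
directory actually sits once a shebang line has been prepended. The helper
names are hypothetical, and the sketch assumes a small, single-disk,
non-ZIP64 archive in which the central directory immediately precedes the
end record.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22      # record size without the trailing comment
MAX_COMMENT_SIZE = 0xFFFF


def find_eocd(data: bytes) -> int:
    """Return the position of the end-of-central-directory record."""
    # Only an optional comment may follow the record, so search
    # backwards from the end of the data for its signature.
    start = max(0, len(data) - EOCD_FIXED_SIZE - MAX_COMMENT_SIZE)
    pos = data.rfind(EOCD_SIGNATURE, start)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    return pos


def central_directory_info(data: bytes):
    """Return (eocd position, entry count, directory size, stored offset)."""
    eocd = find_eocd(data)
    # Fields: signature, disk number, disk holding the directory, entries
    # on this disk, total entries, directory size, directory offset,
    # comment length.
    fields = struct.unpack("<4s4H2LH", data[eocd:eocd + EOCD_FIXED_SIZE])
    _sig, _disk, _cd_disk, _n_disk, n_total, cd_size, cd_offset, _clen = fields
    return eocd, n_total, cd_size, cd_offset


# Build a tiny archive in memory, then put a shebang line in front of it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("__main__.py", "print('hello')\n")
prefix = b"#!/usr/bin/env python3\n"
shifted = prefix + buf.getvalue()

eocd, n_entries, cd_size, cd_offset = central_directory_info(shifted)
actual_cd_start = eocd - cd_size          # where the directory really is
prefix_len = actual_cd_start - cd_offset  # bytes added in front of the ZIP
```

Here prefix_len comes out equal to len(prefix): the stored offset still
describes the archive as it was before the shebang was added, and a reader
that locates the directory from the end can correct for the difference.
As far as I know, Python's zipfile module makes a similar adjustment, which
is why it can still open the prefixed data.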
>
> --
> Thomas Wouters <thomas at python.org>
>
> Hi! I'm an email virus! Think twice before sending your email to help me
> spread!
>
>


-- 
--Guido van Rossum (python.org/~guido)