[Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

Guido van Rossum guido at python.org
Tue Oct 3 11:47:05 EDT 2017


I'm fine with adding an API, though I don't think that an API that knows
about all current (historic) and future formats belongs in importlib.util
-- that module only concerns itself with the *current* format.

In terms of the API design I'd make it take an IO[bytes] and just read and
parse the header, so after that you can use marshal.load() straight from
the file object. File size, mtime and bitfield should be represented as
ints (the parser should take care of endianness). The hash should be a
bytes object.
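
A minimal sketch of what such a header reader might look like, assuming the
PEP 552 layout as I understand it: a 4-byte magic number, a 4-byte
little-endian flags word, then either a 4-byte mtime plus 4-byte source size
or an 8-byte source hash when the low flag bit is set. The names PycHeader
and read_pyc_header are hypothetical, not an existing importlib API, and the
sketch does not handle the older 8- or 12-byte headers.

```python
import importlib.util
import io
import marshal
import struct
from typing import IO, NamedTuple, Optional


class PycHeader(NamedTuple):
    """Hypothetical container for the parsed header fields."""
    magic: bytes
    flags: int                    # the PEP 552 bit field, as an int
    mtime: Optional[int]          # set only for timestamp-based pycs
    source_size: Optional[int]    # set only for timestamp-based pycs
    source_hash: Optional[bytes]  # set only for hash-based pycs


def read_pyc_header(f: IO[bytes]) -> PycHeader:
    """Read the 16-byte header, leaving f positioned at the marshalled code."""
    raw = f.read(16)
    if len(raw) != 16:
        raise EOFError("truncated pyc header")
    magic = raw[:4]
    # All integer fields are little-endian, regardless of the host platform.
    (flags,) = struct.unpack("<I", raw[4:8])
    if flags & 0x1:
        # Assumed: low bit set means the last 8 bytes are a source hash.
        return PycHeader(magic, flags, None, None, raw[8:16])
    mtime, source_size = struct.unpack("<II", raw[8:16])
    return PycHeader(magic, flags, mtime, source_size, None)


# Usage: build a timestamp-style pyc in memory, then parse it back.
code = compile("x = 1", "<demo>", "exec")
data = (importlib.util.MAGIC_NUMBER
        + struct.pack("<III", 0, 1507045625, 6)
        + marshal.dumps(code))
stream = io.BytesIO(data)
header = read_pyc_header(stream)
code_object = marshal.load(stream)  # the stream now sits right after the header
```

Because the function only consumes the header, the caller decides whether to
unmarshal the code object at all.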

On Tue, Oct 3, 2017 at 8:24 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Tue, 3 Oct 2017 08:15:04 -0700
> Guido van Rossum <gvanrossum at gmail.com> wrote:
> > It's really not that hard. You just check the magic number and if it's
> > the new one, skip 4 words. No need to understand the internals of the
> > header.
>
> Still, I agree with Barry that an API would be nice.
>
> Regards
>
> Antoine.
>
> >
> > On Oct 3, 2017 08:06, "Barry Warsaw" <barry at python.org> wrote:
> >
> > > Guido van Rossum wrote:
> > > > There have been no further comments. PEP 552 is now accepted.
> > > >
> > > > > Congrats, Benjamin! Go ahead and send your implementation for
> > > > > review. Oops.
> > > > > Let me try that again.
> > >
> > > While I'm very glad PEP 552 has been accepted, it occurs to me that it
> > > will now be more difficult to parse the various pyc file formats from
> > > Python.  E.g. I used to be able to just open the pyc in binary mode,
> > > read all the bytes, and then lop off the first 8 bytes to get to the
> > > > code object.  With the addition of the source file size, I now have to
> > > > (maybe, if I have to also read old-style pyc files) lop off the front
> > > > 12 bytes, but okay.
> > >
> > > With PEP 552, I have to do a lot more work to just get at the code
> > > > object.  How many bytes at the front of the file do I need to skip
> > > > past?  What about all the metadata at the front of the pyc, how do I
> > > > interpret that if I want to get at it from Python code?
> > >
> > > Should the PEP 552 implementation add an API, probably to
> > > importlib.util, that would understand all current and future formats?
> > > Something like this perhaps?
> > >
> > > > from typing import Optional
> > > >
> > > > class PycFileSpec:
> > > >     magic_number: bytes
> > > >     timestamp: Optional[bytes] # maybe an int? datetime?
> > > >     source_size: Optional[bytes]
> > > >     bit_field: Optional[bytes]
> > > >     code_object: bytes
> > > >
> > > > def parse_pyc(path: str) -> PycFileSpec:
> > > >     ...  # body not specified in this proposal
> > >
> > > Cheers,
> > > -Barry
> > >
> > > _______________________________________________
> > > Python-Dev mailing list
> > > Python-Dev at python.org
> > > https://mail.python.org/mailman/listinfo/python-dev
> > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> > > guido%40python.org
> > >
> >
>



-- 
--Guido van Rossum (python.org/~guido)