[Python-ideas] Add a cryptographic hash (e.g SHA1) of source to Python Compiled objects?

Brett Cannon brett at python.org
Fri Feb 6 20:39:59 CET 2009


On Fri, Feb 6, 2009 at 10:58,  <rocky at gnu.org> wrote:
> Brett Cannon writes:
>  > On Thu, Feb 5, 2009 at 19:38,  <rocky at gnu.org> wrote:
>  > > Brett Cannon writes:
>  > >  > On Wed, Feb 4, 2009 at 02:18, Arnaud Delobelle <arnodel at googlemail.com> wrote:
>  > >  > > 2009/2/4  <rocky at gnu.org>:
>  > >  > >
>  > >  > >> There's also the mtime that needs to be ignored mentioned in prior
>  > >  > >> posts. And is there a table which converts a magic number version back
>  > >  > >> into a string with the Python version number? Thanks.
>  > >  > >
>  > >  > > You can look at Python/import.c, near the top of the file.
>  > >  >
>  > >  > The other option to see how all of this works is importlib as found in
>  > >  > the py3k branch. That's in pure Python so it's easier to follow.
>  > >  >
>  > >  > -Brett
>  > >  >
>  > >
>  > > Sorry for the delayed response - I finally had a chance to check out the
>  > > py3k code and look.
>  > >
>  > > Perhaps I'm missing something. Although there is some really cool,
>  > > well-written and neat Python code there (and some of the private
>  > > methods there seem to me like they should be public and somewhere else,
>  >
>  > Still working on exposing the API.
>  >
>  > > perhaps in os or os.path),
>  >
>  > Nothing in that module belongs in os.
>
> There's probably some confusion as to what I was referring to or what
> I took you to mean when you mentioned importlib. I took that to mean
> the files in that directory "importlib".

No, that's right.

> At any rate that's what I looked at.
> One of the files is _bootstrap.py which has:
>
> def _path_join(*args):
>    """Replacement for os.path.join."""
>    return path_sep.join(x[:-len(path_sep)] if x.endswith(path_sep) else x
>                            for x in args)
>
> def _path_exists(path):
>    """Replacement for os.path.exists."""
>    try:
>        _os.stat(path)
>    except OSError:
>        return False
>    else:
>        return True
>
> def _path_is_mode_type(path, mode):
>    """Test whether the path is the specified mode type."""
>    try:
>        stat_info = _os.stat(path)
>    except OSError:
>        return False
>    return (stat_info.st_mode & 0o170000) == mode
>
> For _path_join, posixpath.py has something similar and perhaps even the same
> functionality although it's different code.
>
> _path_is_mode_type doesn't exist in posixpath.py
>
> _path_exists seems to be almost a duplicate of lexists, which
> uses lstat instead of _os.stat.
>

All of that code is duplicated, most of it by copy-and-paste, from the
os module or its helper modules. It is there only for bootstrapping:
when that code is used as the implementation of import, the module
can't rely on any non-builtin modules.
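To make the duplication concrete, here is a minimal sketch of the same
file-type test written against the ordinary stdlib instead of the
bootstrap-only names. The helper names path_is_mode_type, path_isfile and
path_isdir are hypothetical and chosen for illustration; the point is only
that the 0o170000 mask in the quoted code selects the same file-type bits
that stat.S_IFMT() extracts, which a bootstrapped module cannot import.

```python
import os
import stat
import tempfile


def path_is_mode_type(path, mode):
    """Return True if path exists and its file-type bits equal mode."""
    try:
        stat_info = os.stat(path)
    except OSError:
        return False
    # 0o170000 masks off the permission bits and keeps the file-type bits,
    # which is what stat.S_IFMT() does as well.
    return (stat_info.st_mode & 0o170000) == mode


def path_isfile(path):
    """Hypothetical helper: True for an existing regular file."""
    return path_is_mode_type(path, stat.S_IFREG)


def path_isdir(path):
    """Hypothetical helper: True for an existing directory."""
    return path_is_mode_type(path, stat.S_IFDIR)


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    print(path_isdir(directory), path_isfile(directory))
```

With the stat module available the constants can be named; without it,
the octal literals have to be written out by hand, which is why the
bootstrap code looks like a copy.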

>
>  >
>  > > As Arnaud mentioned, Python/import.c has this magic-number mapping in
>  > > comments near the top of the file. Of course one could take those
>  > > comments and turn them into a dictionary, but I was hoping Python had
>  > > such a dictionary/function built in already since it needs to be
>  > > maintained along with changes to the magic number.
>  >
>  > It actually doesn't need to be maintained.
>
> I meant the mapping between magic number and version that it
> represents. For a use case, recall again what the problem is: you are
> given compiled Python code and a file that purports to be its source, and
> want to verify that. The current proposal (with its current weaknesses)
> requires the compiler version to be the same. When it's not, one could say
> "sorry, Python compiler version mismatch -- go figure it out", but
> more helpful would be to indicate that you compiled with version X (as
> a string rather than a magic number) and the python code was compiled
> with version Y. This means the source might be the same, we just don't
> really know.

I still don't see the benefit of knowing which version of Python a
magic number corresponds to. So I know some bytecode was compiled by
Python 2.5 while I am running Python 2.6. What benefit do I derive
from knowing that compared to just knowing that it was not compiled by
Python 2.6? I mean, are you ultimately planning on launching a
different interpreter based on what generated the bytecode?
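
The two options being compared can be sketched as follows. This is only an
illustration, assuming a modern Python where importlib.util.MAGIC_NUMBER is
available; the KNOWN_MAGIC table and the function names magic_word and
describe are hypothetical, and the table is deliberately incomplete (the
full list lives in the comments in the CPython source). A plain equality
check needs no table at all; the table only matters for the friendlier
message.

```python
import importlib.util

# Hypothetical, hand-maintained and incomplete lookup table.
KNOWN_MAGIC = {
    62131: "Python 2.5",
    62161: "Python 2.6",
}


def magic_word(header):
    """Decode the 16-bit number in the first two bytes of a .pyc header."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        raise ValueError("not a .pyc header")
    return int.from_bytes(header[:2], "little")


def describe(header):
    """Say whether bytecode with this header matches the running interpreter."""
    number = magic_word(header)
    running = magic_word(importlib.util.MAGIC_NUMBER)
    if number == running:
        return "compiled by this interpreter's bytecode version"
    version = KNOWN_MAGIC.get(number, "an unknown Python version")
    return "compiled by {} (magic {}); running magic is {}".format(
        version, number, running)


if __name__ == "__main__":
    # In practice the header would be the first four bytes of a .pyc file.
    print(describe(b"\xb3\xf2\r\n"))
```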

-Brett


