[Python-Dev] How to interpret get_code from PEP 302?

Brett Cannon brett at python.org
Tue Aug 21 20:22:01 CEST 2007


[Thanks to Guido, Paul, and Nick for replying; going to just reply to
Paul since everyone said the same thing and it's his fault I didn't
understand the wording.  =)]

On 8/21/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 21/08/07, Brett Cannon <brett at python.org> wrote:
> > PEP 302 ("New Import Hooks") has an optional extensions section so
> > that tools like py2exe and py2app have an easier time.  Part of the
> > optional extensions is the method get_code that is to return the code
> > object for the specified module (if the loader can handle it).
> >
> > But there is a lack in the definition of how get_code is supposed to
> > be implemented.  The definition says that the "method should return
> > the code object associated with the module", which is fine.  But then
> > it goes on to state that "If the loader doesn't have the code object
> > but it _does_ have the source code, it should return the compiled
> > source code".  This throws me as this makes it sound like bytecode
> > does not need to be used if the loader does not already have a code
> > object and there is no source to be had; any bytecode can be ignored.
> >
> > Now I doubt this is how it is supposed to be read.  Does anyone
> > disagree with that?  If not, I will change the wording to mention that
> > bytecode must be used if no source is available (and that the magic
> > number must be verified).
>
> Hmm, yes, that's muddled. Maybe it made sense to me when I wrote it
> :-) (I think it was my wording rather than Just's)
>
> get_code must *always* return the same code object that
> loader.load_module is using - whether that be bytecode or compiled
> source (and it must respect things like file timestamps where
> appropriate just like load_module does).

Damn, I was afraid you were going to say something like that. =)  I
was hoping for an answer that would allow me to use the source if
available and then use bytecode as a backup.  That might not have the
best performance, but it is the simplest to implement as that skips
the bytecode timestamp check.  Doing it this way means I need to refactor
some things so that one function only creates the code object and
everything that deals with modules directly lives in another function;
nothing big, but it would have been nice to avoid.  =)

This also means pkgutil is possibly non-compliant as its get_code
implementation does the above suggestion (tries for source to avoid
timestamp check, and uses bytecode as backup).
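
For concreteness, here is a rough sketch of that "source first, bytecode
as backup" strategy.  It is not pkgutil's actual code: the function name
and the idea of handing in the source text and raw bytecode as arguments
are made up for illustration, and it assumes the simple bytecode layout
of a 4-byte magic number, a 4-byte timestamp, then the marshalled code
object (current CPython writes a longer header).

```python
import importlib.util
import marshal

MAGIC = importlib.util.MAGIC_NUMBER  # magic number of the running interpreter


def get_code_source_first(name, source, bytecode):
    """Hypothetical helper: compile source if present, else fall back to bytecode.

    source   -- module source as a str, or None if unavailable
    bytecode -- raw bytes laid out as magic(4) + mtime(4) + marshalled code
                (an assumed layout), or None if unavailable
    """
    if source is not None:
        # The timestamp in the bytecode is never consulted on this path.
        return compile(source, name, "exec")
    if bytecode is not None:
        if bytecode[:4] != MAGIC:
            raise ImportError("bad magic number in bytecode for %s" % name)
        return marshal.loads(bytecode[8:])
    return None
```

The catch is visible in the first branch: whenever source exists the
bytecode is ignored, so the result can differ from what load_module
would actually run.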

> What the sentence you quote
> is trying to say is that if there's a need to compile source, the
> get_code method must do this on behalf of the caller - it can't return
> None and expect the caller to try get_source and compile it manually.
> Someone who only wants a code object should never need to call
> get_source.
>
> I'm not sure that's any clearer! If you need further clarification,
> let me know (either on or off list). I'd appreciate it if you can
> clear the PEP's wording up.

Basically I will reword it as what you said: the code object returned by
get_code needs to be equivalent to the one load_module would use if
requested to load the same module (e.g., if load_module would use the
bytecode, then get_code must use the bytecode too, as it might subtly
differ somehow, like with bytecode optimizations or something).  I will
also mention that writing out new bytecode is not required (but should it
be optional or not allowed at all?).
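
A minimal sketch of what that rule might look like, under the same
assumptions as any toy example: the function name and arguments are
hypothetical, the bytecode layout is assumed to be magic(4) + little-
endian mtime(4) + marshalled code (current CPython uses a longer
header), and the timestamp rule is only my reading of "respect file
timestamps just like load_module does".  It never writes bytecode back
out.

```python
import importlib.util
import marshal
import struct

MAGIC = importlib.util.MAGIC_NUMBER  # magic number of the running interpreter


def get_code_like_load_module(name, source=None, source_mtime=None, bytecode=None):
    """Hypothetical helper: pick the code object the way an importer would.

    Bytecode is used when its magic number matches and either there is no
    source or its stored timestamp equals source_mtime; otherwise the
    source is compiled.  Nothing is written back to disk.
    """
    if bytecode is not None and len(bytecode) >= 8 and bytecode[:4] == MAGIC:
        (stamp,) = struct.unpack("<I", bytecode[4:8])
        if source is None or stamp == source_mtime:
            return marshal.loads(bytecode[8:])
    if source is not None:
        # Bytecode missing, stale, or from another interpreter version.
        return compile(source, name, "exec")
    raise ImportError("no usable source or bytecode for %s" % name)
```

With a matching timestamp the bytecode wins even though source is
available, which is exactly where this differs from the source-first
approach.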

-Brett

