[Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...)

Thu Nov 24 14:10:07 CET 2005

Phillip J. Eby wrote:
> This isn't hard to implement per se; setuptools for example has a 
> 'get_importer' function, and going from importer to loader is simple:

Thanks, I think I'll definitely be able to build something out of that.

> So with the above function you could do something like:
> 
> def get_loader(fullname, path):
>     for path_item in path:
>         try:
>             loader = get_importer(path_item).find_module(fullname)
>             if loader is not None:
>                 return loader
>         except ImportError:
>             continue
>     else:
>         return None
> 
> in order to implement the rest.

I think sys.meta_path needs to figure into that before digging through 
sys.path, but otherwise the concept seems basically correct.
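
A rough sketch of that ordering, written against the PEP 302 finder protocol (find_module returns a loader or None). The find_loader name and its explicit meta_path, path and get_importer parameters are assumptions made purely for illustration, so the lookup can be exercised without touching real import state; this is not the proposed imp.get_loader itself:

```python
def find_loader(fullname, meta_path, path, get_importer):
    """Return a PEP 302 loader for fullname, or None if nothing claims it.

    meta_path    -- finder objects, consulted first (stand-in for sys.meta_path)
    path         -- path entries searched next (stand-in for sys.path)
    get_importer -- callable mapping a path entry to an importer, or None
    """
    # Meta path finders get the first chance at every import.
    for finder in meta_path:
        loader = finder.find_module(fullname, None)
        if loader is not None:
            return loader
    # Only then fall back to the per-path-entry importers.
    for path_item in path:
        try:
            importer = get_importer(path_item)
            if importer is None:
                continue
            loader = importer.find_module(fullname)
        except ImportError:
            continue
        if loader is not None:
            return loader
    return None
```

Passing sys.meta_path, sys.path and a get_importer such as the setuptools one mentioned above would presumably give the behaviour under discussion, on an interpreter whose finders still implement find_module.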

[NickC]
>> ** I'm open to suggestions on how to deal with argv[0] and __file__. They
>> should be set to whatever __file__ would be set to by the module loader, but
>> the Importer Protocol in PEP 302 doesn't seem to expose that information. The
>> current proposal is a compromise that matches the existing behaviour of -m
>> (which supports scripts like regrtest.py) while still giving a meaningful
>> value for scripts which are not part of the normal filesystem.

[PJE]
> Ugh.  Those are tricky, no question.  I can think of several simple 
> answers for each, all of which are wrong in some way.  :)

Indeed. I tried turning to "exec co in d" and "execfile(name, d)" for 
guidance, and didn't find any real help there. The only thing they 
automatically add to the supplied dictionary is __builtins__.

The consequence is that any code executed using "exec" or "execfile" sees its 
name as being "__builtin__" because the lookup for '__name__' falls back to 
the builtin namespace.

Further, "__file__" and "__loader__" won't be set at all when using these 
functions, which may be something of a surprise for some modules (to say the 
least).
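
The fallback is easy to see in a few lines. This is a small illustration in Python 3 spelling, which is an assumption on my part so that it runs on a current interpreter: there exec is a function rather than a statement, and the builtin module is named "builtins" instead of "__builtin__". The variable names are made up for the example:

```python
# Execute code in an empty dictionary and see what it observes.
namespace = {}
exec("seen_name = __name__", namespace)

# exec itself added only __builtins__; seen_name was assigned by the code.
added_keys = sorted(namespace)
# __name__ was not in the dictionary, so the lookup fell through to builtins.
seen_name = namespace["seen_name"]
# Nothing set __file__ in the supplied dictionary.
has_file = "__file__" in namespace
```

On CPython 3, seen_name ends up as "builtins" and has_file is False, which is the same effect described above with the module renamed.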

My current thinking is to actually try to distance the runpy module from 
"exec" and "execfile" significantly more than I'd originally intended. That 
way, I can explicitly focus on making it look like the item was invoked from 
the command line, without worrying about behaviour differences between this 
and the exec statement. It also means runpy can avoid the "implicitly modify 
the current namespace" behaviour that exec and execfile currently have.

The basic function runpy.run_code would look like:

   def run_code(code, init_globals=None,
                      mod_name=None, mod_file=None, mod_loader=None):
       """Executes a string of source code or a code object
          Returns the resulting top level namespace dictionary
       """
       # Handle omitted arguments
       if mod_name is None:
           mod_name = "<run>"
       if mod_file is None:
           mod_file = "<run>"
       if mod_loader is None:
           mod_loader = StandardImportLoader(".")
       # Set up the top level namespace dictionary
       run_globals = {}
       if init_globals is not None:
           run_globals.update(init_globals)
       run_globals.update(__name__ = mod_name,
                          __file__ = mod_file,
                          __loader__ = mod_loader)
       # Run it!
       exec code in run_globals
       return run_globals

Note that run_code always creates a new execution dictionary and returns it, 
in contrast to exec and execfile. This is so that naively doing:

   run_code("print 'Hi there!'", globals())

or:

   run_code("print 'Hi there!'", locals())

doesn't trash __name__, __file__ or __loader__ in the current module (which 
would be bad).
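
To make the copy-then-execute behaviour concrete, here is a trimmed port of run_code to Python 3 syntax. The port is an assumption for illustration only: exec becomes a function call, and mod_loader is simply left as None, because the StandardImportLoader default above is not defined anywhere in this message:

```python
def run_code(code, init_globals=None,
             mod_name=None, mod_file=None, mod_loader=None):
    """Run source text or a code object in a fresh namespace and return it."""
    if mod_name is None:
        mod_name = "<run>"
    if mod_file is None:
        mod_file = "<run>"
    # Copy the caller's dictionary instead of executing in it directly.
    run_globals = dict(init_globals or {})
    run_globals.update(__name__=mod_name,
                       __file__=mod_file,
                       __loader__=mod_loader)
    exec(code, run_globals)
    return run_globals


# The caller's namespace is read but never written to.
caller_ns = {"__name__": "caller", "x": 1}
result = run_code("y = x + 1", caller_ns)
```

After this runs, caller_ns still holds only its original two entries, while result carries y together with the "<run>" values for __name__ and __file__.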

And runpy.run_module would look something like:

   import sys

   def run_module(mod_name, run_globals=None, run_name=None, as_script=False):
       loader = _get_loader(mod_name) # Handle lack of imp.get_loader
       code = loader.get_code(mod_name)
       filename = _get_filename(loader, mod_name) # Handle lack of protocol
       if run_name is None:
           run_name = mod_name
       if as_script:
           sys.argv[0] = filename
       return run_code(code, run_globals, run_name, filename, loader)
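
For anyone reading this on a current Python: the standard library's runpy module has a run_module function of much the same shape, so the intended usage can be tried directly. The example below is only a sketch; the hello_mod module, the demo function and the temporary directory are invented for the illustration, and the stdlib signature (init_globals, run_name, alter_sys) differs in detail from the draft above:

```python
import os
import runpy
import sys
import tempfile


def demo():
    """Write a throwaway module, run it as __main__, return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "hello_mod.py"), "w") as f:
            f.write("greeting = 'hi from ' + __name__\n")
        # Make the temporary directory importable for the duration of the run.
        sys.path.insert(0, tmp)
        try:
            return runpy.run_module("hello_mod", run_name="__main__")
        finally:
            sys.path.remove(tmp)


result = demo()
```

The returned dictionary plays the role of run_code's result: it contains the module's own names plus the __name__ and __file__ values that were set up before execution.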

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org