[Python-Dev] PEP 511 (code transformers) rejected

Nick Coghlan ncoghlan at gmail.com
Fri Nov 3 01:36:14 EDT 2017


On 3 November 2017 at 03:19, Brett Cannon <brett at python.org> wrote:
> On Wed, 1 Nov 2017 at 16:17 Lukasz Langa <lukasz at langa.pl> wrote:
>>
>> I find this sad. In the JavaScript community the existence of Babel is
>> very important for the long-term evolution of the language independently
>> from the runtime. With Babel, JavaScript programmers can utilize new
>> language syntax while being able to deploy on dated browsers. While there's
>> always some experimentation, I doubt our community would abuse the new
>> syntactic freedom that the PEP provided.
>>
>> Then again, maybe we should do what Babel did, e.g. release a tool like it
>> totally separately from the runtime.
>
> I think the trick here would be getting people more comfortable with
> ahead-of-time compilation and then adding the appropriate support to
> bytecode files to load other "optimization" levels/tags. Then you load the
> .pyc files and rely on co_lnotab as Victor pointed out to get your source
> mapping by compiling your source code explicitly instead of as a side-effect
> of import. And since this approach would then just be about generalizing how
> to specify different tags to match against in .pyc file names it's easier to
> get accepted.

I'm not sure it's quite that simple, as you still need to define:

- how does the import system know that a given input file is a
"cache-only" import?
- how do linecache and similar tools know what source file the pyc maps back to?

Right now, the source-file/cache-file relationship is hardcoded in
two functions:

* https://docs.python.org/3/library/importlib.html#importlib.util.cache_from_source
* https://docs.python.org/3/library/importlib.html#importlib.util.source_from_cache

If we look at the code from hylang's custom importer for ".hy" files
[1] we can see that the "cache_from_source" implementation has a
convenient property: it ignores the source extension entirely, which
means it works for input paths with arbitrary file extensions, not
just Python source files.
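
As a quick illustration (a minimal sketch that is not taken from hy's
importer; the "pkg/foo" paths and the cache_basename helper are
made-up names for the example), the cache filename comes out the same
whatever the source extension is:

```python
import importlib.util
import os.path


def cache_basename(source_path):
    """Return just the filename part of the cache path for source_path."""
    # optimization="" pins the result, so running under -O does not add
    # an ".opt-N" element to the name.
    cache_path = importlib.util.cache_from_source(source_path, optimization="")
    return os.path.basename(cache_path)


# Hypothetical source files that differ only in their extension.
names = {ext: cache_basename("pkg/foo" + ext) for ext in (".py", ".hy", ".pyw")}
for ext, name in names.items():
    print(ext, "->", name)
```

All three inputs map to the same "foo.<cache-tag>.pyc" name, so the
source extension cannot be recovered from the cache filename alone.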

This means that hy's import system integration can use that helper,
but if you have a "foo.hy" source file and a
"__pycache__/foo.<cache-tag>.pyc" output file, the regular import
machinery will *ignore* the latter file, and you have to register Hy's
custom importer in order for Python to acknowledge that the cached
file exists.

The reverse lookup, by contrast, always assumes that the source file
has a ".py" suffix (which is already broken for ".pyw" source files on
Windows). Correcting for that at the standard library level would
require changing the cache filename format to include an optional
additional element: the source file extension (source_from_cache doesn't
assume it has access to the pyc file itself - only the filename).
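
A small round-trip sketch shows the asymmetry (again, the "pkg/foo"
paths and the round_trip helper are hypothetical, and this assumes
sys.pycache_prefix is not set):

```python
import importlib.util
import os.path


def round_trip(source_path):
    """Map a source path to its cache path and back again."""
    cache_path = importlib.util.cache_from_source(source_path, optimization="")
    # source_from_cache only inspects the filename; the pyc file does not
    # need to exist on disk.
    return importlib.util.source_from_cache(cache_path)


for original in ("pkg/foo.py", "pkg/foo.pyw", "pkg/foo.hy"):
    print(original, "->", round_trip(original))
```

Only the ".py" input survives the round trip; the ".pyw" and ".hy"
inputs both come back as "foo.py".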

So if we went down that path, then the import system level additions
we'd want would probably be along the lines of:

- an enhancement to the cache file naming scheme to allow source file
extensions to be saved in PYC filenames
- an update to the SourceFileLoader to use that new naming scheme when
implicitly compiling source files with the pyw extension
- a new "CacheOnlyLoader" together with a new CACHE_ONLY_SUFFIXES list
- a new ".pyb" suffix (for "Python backport") as the sole default
entry in CACHE_ONLY_SUFFIXES (awful pun alert: you could also argue
that this suffix makes sense because "pyb files come before pyc
files")

To make this syntactic polyfill approach usable with older Python
versions (including 2.7), importlib2 could be resynced to the first
importlib version that supported this (importlib2 is currently up to
date with Python 3.5's multi-phase initialisation support, since that
was the last major functional change in importlib).

Cheers,
Nick.


[1] https://github.com/hylang/hy/blob/master/hy/importer.py

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

