[Python-ideas] Incorporating something like byteplay into the stdlib

Yury Selivanov yselivanov.ml at gmail.com
Fri Feb 12 17:58:37 EST 2016


On 2016-02-12 5:27 PM, Andrew Barnert wrote:
> On Feb 12, 2016, at 13:45, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>> On 2016-02-12 4:13 PM, Andrew Barnert wrote:
>>> On Feb 12, 2016, at 12:36, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>>>> On 2016-02-11 10:58 PM, Andrew Barnert via Python-ideas wrote:
>>>>> tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
>>>> Big -1 on the idea, sorry.
>>>>
>>>> CPython's bytecode is the implementation detail of CPython.  PyPy has some opcodes that CPython doesn't have, for example.  Who knows, maybe in CPython 4.0 we won't have code objects at all :)
>>>>
>>>> Adding something to the standard library means that it will be supported for years to come.  It means that the code is safe to use.  Which, in turn, guarantees that there will be plenty of code that depends on this new functionality.  At first some of that code will be bytecode optimizers, later someone implements a LINQ-like extension, and in no time we lose our freedom to work with opcodes.
>>>>
>>>> If this "new functionality" is something that depends on CPython's internals, it will only fracture the ecosystem.  PyPy, or Pyston, or IronPython developers will either have to support byteplay-like stuff (which might be impossible), or explain to their users why some libraries don't work on their platform.
>>> This sounds like potentially an argument against adding bytecode processors in PEP 511.[^1] But if PEP 511 *does* add bytecode processors, I don't see how my proposal makes things any worse.
>> The main (and only?) motivation behind PEP 511 is the optimization of CPython.  Maybe the new APIs will only be exposed at C level.
> Have you read PEP 511? It exposes an API for adding bytecode processors, explicitly explains that the reason for this API is to allow people to write new bytecode optimizers in Python, and includes a toy example of a bytecode transformer. I'm not imagining some far-fetched idea that someone might suggest in the future, I'm responding to what's actually written in the PEP.

I guess I read an earlier version that focused only on AST 
transformations.  Maybe PEP 511 should focus on just one thing.

>
>>> Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
>> You don't need mutability for introspection.
> Of course. When you split an analogy in half and only reply to the first half of it like this, the half-analogy has no content. So what?

Sorry, somehow I failed to read that paragraph in one piece.  My bad.

>
>>> In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
>> PEP 492 added a bunch of new opcodes.  Serhiy is exploring the possibility of adding a few more LOAD_CONST_N opcodes.  How would a mutable byteplay-code-like object in the dis module help with that?
> If someone had written a bytecode processor on top of the dis module, and wanted to update it to take advantage of LOAD_CONST_N, it would be easy to do so--even on a local copy of CPython patched with Serhiy's changes. If they'd instead written it on top of a third-party module, they'd have to wait for that module to be updated (probably after the next major version of Python comes out), or update it locally. Which one of those sounds easiest to you?

My point (which is the *key* point) is that if we decide to have only 
LOAD_CONST_N opcodes and remove plain old LOAD_CONST -- all optimizers 
will break, no matter what library they use.  That's just a sad reality 
of working on the bytecode level.

For instance, PEP 492 split the WITH_CLEANUP opcode into 
WITH_CLEANUP_START and WITH_CLEANUP_FINISH.  *Any* bytecode manipulation 
code that expected to see WITH_CLEANUP after SETUP_WITH *was* broken.
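
A minimal sketch of how such breakage shows up, assuming only the stdlib 
dis module.  The list of names and the helper missing_opcodes() are made 
up for illustration; which names come back depends entirely on the 
interpreter running the code.

```python
import dis

# Opcode names a hypothetical bytecode tool was written against.
# Which of these exist depends on the CPython version in use.
EXPECTED = ["LOAD_CONST", "SETUP_WITH", "WITH_CLEANUP",
            "WITH_CLEANUP_START", "WITH_CLEANUP_FINISH"]


def missing_opcodes(names):
    """Return the names that the running interpreter does not define."""
    return [name for name in names if name not in dis.opmap]


print(missing_opcodes(EXPECTED))
```

A tool can detect the mismatch this way, but it still has to be rewritten 
for the new opcode set.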

In short: I don't want to add more stuff to CPython that can make it 
harder for us to modify its low-level internals.

>> Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough.
> That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?

function.__code__ exists and is mutable regardless of PEP 511 and 
byteplay :)  Let's not add it to the mix.
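
To be concrete about what "mutable" means here, a small sketch with two 
throwaway functions f and g (names chosen only for this example): the 
code object itself is immutable, but the function's __code__ attribute 
can be rebound.

```python
def f():
    return 1


def g():
    return 2


# Rebind f's code object to g's.  This works here because neither
# function has free variables, so the two code objects are compatible.
f.__code__ = g.__code__

print(f())  # prints 2
```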

You're right, I guess this argument applies to both PEP 511's 
code_transformer and a byteplay-like module in the stdlib.

>
>>>    [^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...
>> Import hooks (and even AST/parse/compile) are a much higher-level API.  I'm not sure we can compare them to byteplay.
> You're responding selectively here. Your argument is that people shouldn't mess with bytecode. If we don't want people to mess with bytecode, we shouldn't expose bytecode to be messed with. But you can write a decorator that sets f.__code__ = types.CodeType(...) with a replaced bytecode string, and all of the details on how to do that are fully documented in the dis and inspect modules. Making it tedious and error-prone is not a good way to discourage something.
>
> Meanwhile, the "low-level" part of this already exists: the dis module lists all the opcodes, disassembles bytecode, represents that disassembled form, etc.

Although function.__code__ is mutable, almost nobody actually mutates 
it.  We have the dis module primarily for introspection and research 
purposes; view it as a handy tool to see how CPython really works.
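
This is the kind of read-only use meant here, sketched with a toy 
function add() that exists only for the example.  The exact instructions 
printed vary between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# Read-only view of the compiled function; nothing is modified.
for instr in dis.get_instructions(add):
    print(instr.offset, instr.opname, instr.argrepr)
```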

I'm OK if PEP 511 adds some AST transformation hooks (because AST is a 
higher-level abstraction).  Adding code-object transformation hooks and 
a library to mutate (or produce new) code objects seems very wrong to me.
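
For comparison, a rough sketch of an AST-level transformation using only 
the stdlib ast module.  This is not the PEP 511 API; the FoldDebug class 
and the DEBUG name are invented for illustration.

```python
import ast


class FoldDebug(ast.NodeTransformer):
    """Replace every read of the name DEBUG with the constant False."""

    def visit_Name(self, node):
        if node.id == "DEBUG" and isinstance(node.ctx, ast.Load):
            return ast.copy_location(ast.Constant(value=False), node)
        return node


source = "result = 'debug' if DEBUG else 'release'"
tree = FoldDebug().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<ast-example>", "exec"), namespace)
print(namespace["result"])  # prints release
```

The transformer works with names and expressions and never refers to an 
opcode, so it does not depend on the interpreter's instruction set.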

Yury

