[Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)

Tue Jul 1 02:39:14 CEST 2014

On Tue, Jul 1, 2014 at 9:48 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> First, two quick side notes:
>
> It might be nice if the compiler were as easy to hook as the importer. Alternatively, it might be nice if there were a way to do "inline bytecode assembly" in CPython, similar to the way you do inline assembly in many C compilers, so the answer to random's question is just "asm [('BUILD_SET', 0)]" or something similar. Either of those would make this problem trivial.
>

That would be interesting, but it raises the possibility of mucking up
the stack. (Imagine if you put BUILD_SET 1 in there instead. What's it
going to make a set of? What's going to happen to the rest of the
stack? Do you REALLY want to debug that?)
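
For a rough sense of the stack bookkeeping at stake, the standard dis module can report the net stack effect of an opcode. This is only a sketch with illustrative variable names; it says nothing about what a hypothetical inline assembler would actually verify:

```python
import dis

# Look up the opcode number for BUILD_SET in the running interpreter.
BUILD_SET = dis.opmap["BUILD_SET"]

# BUILD_SET 0 pops nothing and pushes the new (empty) set: net effect +1.
effect_empty = dis.stack_effect(BUILD_SET, 0)

# BUILD_SET 1 pops one item that must already be on the stack, then
# pushes the set: net effect 0. If nothing was pushed for it, it would
# consume whatever happened to be on top.
effect_one = dis.stack_effect(BUILD_SET, 1)

print(effect_empty, effect_one)
```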

Back when I did a lot of C and C++ programming, I used to make good
use of a "drop to assembly" feature. There were two broad areas where
I'd use it: either to access a CPU feature that the compiler and
library didn't offer me (like CPUID, in its early days), or to
hand-optimize some code. Then compilers got better and better, and the
first set of cases got replaced with library functions... and the
second lot ended up being no better than the compiler's output, and
potentially a lot worse - particularly because they're non-portable.
Allowing a "drop to bytecode" in CPython would have the exact same
effects, I think. Some people would use it to create an empty set,
others would use it to replace variable swapping with a marginally
faster and *almost* identical stack-based swap:

x,y = y,x
LOAD_GLOBAL y
LOAD_GLOBAL x
ROT_TWO
STORE_GLOBAL x
STORE_GLOBAL y

becomes

LOAD_GLOBAL x
LOAD_GLOBAL y
STORE_GLOBAL x
STORE_GLOBAL y

Seems fine, right? But it's a subtle change to semantics (evaluation
order), and not much benefit anyway. Plus, if it's decided that this
semantic change is safe (if it's provably not going to have any
significance), a future version of CPython would be able to make the
exact same optimization, while leaving the code readable, and portable
to other Python implementations.
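
One way to observe the evaluation-order difference from plain Python, sketched under the assumption that neither name is defined in the namespace: the right-hand side is evaluated left to right, so the lookup that fails first is the one for y. With the reordered loads it would be x. The helper name here is made up for illustration:

```python
def first_missing_name(source):
    """Run source in a fresh namespace and return the NameError text, if any."""
    try:
        exec(source, {})
    except NameError as exc:
        return str(exc)
    return None

# Both x and y are undefined, so whichever is looked up first is reported.
message = first_missing_name("x, y = y, x")
print(message)
```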

So while an inline bytecode assembler might have some uses, I suspect
it'd be an attractive nuisance more than anything else.

> On Monday, June 30, 2014 3:12 PM, Chris Angelico <rosuav at gmail.com> wrote:
>>On Tue, Jul 1, 2014 at 3:18 AM,  <random832 at fastmail.us> wrote:
>
>>> On Sat, Jun 28, 2014, at 01:28, Chris Angelico wrote:
>>>> empty_set_literal =
>>>> type(lambda:0)(type((lambda:0).__code__)(0,0,0,3,67,b't\x00\x00d\x01\x00h\x00\x00\x83\x02\x00\x01d\x00\x00S',(None,"I'm
>
> I think it makes more sense to use types.FunctionType and types.CodeType here than to generate two extra functions for each function, even if that means you have to put an import types at the top of every munged source file.

Sure. This is just a proof-of-concept anyway, and it's not meant to be
good code. Either way works, I just tried to minimize name usage (and
potential name collisions).
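
The two spellings name the same classes, which a short check can confirm. This is a minimal sketch; the function `answer` is just a stand-in:

```python
import types

# The type of any plain function is types.FunctionType, and the type of
# its code object is types.CodeType.
same_function_type = type(lambda: 0) is types.FunctionType
same_code_type = type((lambda: 0).__code__) is types.CodeType

def answer():
    return 42

# Build a second function object from an existing code object,
# using the named type instead of type(lambda: 0).
rebuilt = types.FunctionType(answer.__code__, globals())
print(same_function_type, same_code_type, rebuilt())
```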

> But I think what he was suggesting is something like this: Let py_compile.compile generate the .pyc file as normal, then munge the bytecode in that file, instead of compiling each function, munging its bytecode, and emitting source that creates the munged functions.
>
>
> Besides being a lot less work, his version works for ∅ at top level, in class definitions, in lambda expressions, etc., not just for def statements. And it doesn't require finding and identifying all of the things to munge in a source file (which I assume you'd do bottom-up based on the ast.parse tree or something).
>

Sure. But all I was doing was responding to the implied statement that
it's not possible to write a .py file that makes a function with
BUILD_SET 0 in it. Translating a .pyu directly into a .pyc is still
possible, but was not the proposal.

> But either way, this still doesn't solve the big problem. Compiling a function by hand and then tweaking the bytecode is easy; doing it programmatically is more painful. You obviously need the function to compile, so you have to replace the ∅ with something else whose bytecode you can search-and-replace. But what? That something else has to be valid in an expression context (so it compiles), has to compile to a 3-byte opcode (otherwise, replacing it will screw up any jump targets that point after it), can't add any globals/constants/etc. to the list (otherwise, removing it will screw up any LOAD_FOO statements that refer to a higher-numbered foo), and can't appear anywhere in the code being compiled.
>

What I did was put in a literal string.

https://github.com/Rosuav/shed/blob/master/empty_set.py

It uses "∅ is set()" as a marker, depending on that string not
existing in the source. (I could compile the function twice, once with
that string, and then a second time with another string; the first
compilation would show what consts it uses, and the program could then
generate an arbitrary constant which doesn't exist.) The opcode is the
right length (assuming it doesn't need EXTENDED_ARG, which I hadn't
come across before; it seems to be required only when a function has
more than 64K consts/globals/locals), and the resulting function
has an unnecessary const in it. It wouldn't be hard to drop it (the
code already parses through everything; it could just go "if it's
LOAD_CONST, three options - if it's the marker, switch in a BUILD_SET,
if it's less than the marker, no change, if it's more than the marker,
decrement"), but it doesn't seem to be a problem to have an extra
const in there.
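
The marker idea can be sketched roughly as follows. This is not the code from the linked file; it assumes a recent CPython (3.8 or later, for code.replace and two-byte instructions), and the names `template` and `replace_marker` are hypothetical. Bytecode layout is an implementation detail, so treat it as an illustration only:

```python
import dis
import types

MARKER = "∅ is set()"  # the marker string described above

def template():
    s = "∅ is set()"   # placeholder constant to be swapped out
    s.add(1)
    return s

def replace_marker(func, marker=MARKER):
    """Return a copy of func with LOAD_CONST <marker> turned into BUILD_SET 0."""
    code = func.__code__
    raw = bytearray(code.co_code)
    for instr in dis.get_instructions(code):
        if instr.opname == "LOAD_CONST" and instr.argval == marker:
            if instr.arg > 255:
                # An EXTENDED_ARG prefix would also apply to BUILD_SET.
                raise ValueError("marker needs EXTENDED_ARG; not handled")
            # Same instruction length, same net stack effect (+1).
            raw[instr.offset] = dis.opmap["BUILD_SET"]
            raw[instr.offset + 1] = 0
    new_code = code.replace(co_code=bytes(raw))
    return types.FunctionType(new_code, func.__globals__, func.__name__)

patched = replace_marker(template)
print(patched())
```

The marker stays in co_consts of the patched function, which matches the "unnecessary const" mentioned above.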

> One more thing that I'm sure you thought of, but may not have thought through all the way: To make this generally useful, you can't just hardcode creating a zero-arg top-level function; you need to copy all of the code and function constructor arguments from the compiled function.
>

It handles arguments and stuff. All the attributes of the original
function object get passed through unchanged to the resulting
function, with the exception of the bytecode, obviously.
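
Passing the attributes through might look something like this sketch. The helper name `clone_with_code` and the choice of which attributes to copy are assumptions for illustration, not a description of the linked script:

```python
import types

def clone_with_code(func, code):
    """Build a new function around code, copying func's other attributes."""
    new = types.FunctionType(code, func.__globals__, func.__name__,
                             func.__defaults__, func.__closure__)
    new.__kwdefaults__ = func.__kwdefaults__
    new.__qualname__ = func.__qualname__
    new.__dict__.update(func.__dict__)
    return new

def greet(name, greeting="hello", *, punct="!"):
    return f"{greeting} {name}{punct}"

# The original code object is passed here; a munged one would go in its place.
clone = clone_with_code(greet, greet.__code__)
print(clone("world"), clone("world", "hi", punct="?"))
```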

> So, if the function is a closure, how do you do that? You need to pass a list of closure cell objects that bind to the appropriate co_cellvars from the current frame, and I don't think there's a way to do that from Python. So, you need to do that by bytecode-hacking the outer function in the same way, just so it can build the inner function. And, even if you could build closure cells, once you've replaced the inner function definition with a function constructor from bytecode, when the resulting code gets compiled, it won't have any cellvars anymore.
>

Ah, that part I've no idea about. But it wouldn't be impossible for
someone to develop that a bit further.
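
For what it's worth, CPython 3.8 and later (well after this thread) expose types.CellType, so a cell can at least be constructed from Python. The sketch below covers only that half of the problem; it does not address the outer function losing its cellvars when the source is regenerated. The counter functions are made up for the example:

```python
import types

def make_counter():
    count = 0
    def inc():
        nonlocal count
        count += 1
        return count
    return inc

inc = make_counter()

# Create a fresh cell by hand and bind it to inc's single free variable.
new_cell = types.CellType(10)
rebuilt = types.FunctionType(inc.__code__, inc.__globals__, inc.__name__,
                             inc.__defaults__, (new_cell,))

# rebuilt uses the hand-made cell; inc keeps using its original one.
first_rebuilt = rebuilt()
first_original = inc()
print(first_rebuilt, first_original, new_cell.cell_contents)
```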

> And going back to the top, all of these problems are why I think random's solution would be a lot easier than yours, but why my solution (first build compiler hooks or inline assembly, then use that to implement the empty set trivially) would be no harder than either (and a lot more generally useful), and also why I think this really isn't worth doing.
>

Right. I absolutely agree with your conclusion (not worth doing), and
have always held that view. This is proof that it's kinda possible, but
still a bad idea. Now, if someone comes up with a really compelling
use-case for an empty set literal, then maybe it'd be more important;
but if that happens, CPython will probably grow an empty set literal
in ASCII somehow, and then the .pyu translation can just turn ∅ into
that.

ChrisA