[Python-ideas] History on proposals for Macros?

Ron Adam ron3200 at gmail.com
Mon Mar 30 06:12:03 CEST 2015



On 03/29/2015 08:36 PM, Andrew Barnert wrote:
>>> Something related to this that I've wanted to experiment with, but
>>> is hard to do in Python, is to be able to split a function signature
>>> and body, and be able to use them independently. (But in a well
>>> defined way.)

> Almost everything you're asking for is already there.

Yes, I have looked into most of what you mention here.

> A function object contains, among other things, a sequence of closure
> cells, a local and global environment, default parameter values, and a
> code object.
>
> A code object contains, among other things, parameter names, a count of
> locals, and a bytecode string.
>
> You can see the attributes of these objects at runtime, and the inspect
> module docs describe what they mean. You can also construct these
> objects at runtime by using their constructors (you have to use
> types.FunctionType and types.CodeType; the built-in help can show you
> the parameters).
>
> You can also compile source code (or an AST) to a code object with the
> compile function.
>
> You can call a code object with the exec function, which takes a
> namespace (or, optionally, separate local and global namespaces--and, in
> a slightly hacky way, you can also override the builtin namespace).
>
> There are also Signature objects in the inspect module, but they're not
> "live" usable objects, they're nicely-organized-for-human-use
> representations of the signature of a function. So practically you'd use
> a dummy function object or just a dict or something, and create a new
> function from its attributes/members and the new code.
>
> So, except for minor spelling differences, that's exactly what you're
> asking for, and it's already there.

I think it's more than minor spelling differences.  :-)

I've played around with deconstructing functions and using the
constructors to put them back together again, enough to know it's actually
quite hard to get everything right with anything other than the original
parts.
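
The simple round-trip case does work; something like this (just a sketch,
reusing the original pieces unchanged):

    import types

    def add(a, b=1):
        return a + b

    # Rebuild a function from its own parts.  Easy with the original
    # pieces; much harder once you start substituting new ones.
    clone = types.FunctionType(
        add.__code__,       # code object
        add.__globals__,    # global namespace
        "add_clone",        # name
        add.__defaults__,   # default argument values
        add.__closure__,    # closure cells (None here)
    )
    print(clone(2))         # -> 3

The trouble starts when the new code object expects different locals,
constants, or closure cells than the other parts provide.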

The exec function may be a good start to experimenting with this.  I
haven't used it with code objects enough to be familiar with what limits
it has.  It may not be that difficult to copy the C source for exec and
create a new function more specific to this idea.  (As a test.)

Even if that's slow, it may be good enough as a start.
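
For the simplest cases exec already behaves roughly the way I want, as
long as the block is compiled as top-level code so name lookups go through
the dict you pass in.  A minimal sketch:

    # Compile a block as top-level code and apply it to a namespace.
    block = compile("x += 1", "<inc_x>", "exec")
    ns = {"x": 0}
    exec(block, ns)
    print(ns["x"])      # -> 1

What I'm less sure about is how far that gets you once the block comes
from a function body instead of a string.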

> But if I've guessed right about _why_ you want this, it doesn't do what
> you'd want it to, and I don't think there's any way it could.
>
> Bytecode accesses locals (including arguments), constants, and closure
> cells by index from the frame object, not by name from a locals dict
> (although the frame has one of those as well, in case you want to debug
> or introspect, or call locals()). So, when you call a function, Python
> sets up the frame object, matching positional and keyword arguments (and
> default values in the function object) up to parameters and building up
> the sequence of locals. The frame is also essential for returning and
> uncaught exceptions (it has a back pointer to the calling frame).

This isn't a problem if the callable_code object creates a new frame.

It is an issue when running a code block in the current frame.  But I think 
there may be a way to get around that.


> The big thing you can't do directly is to create new closure cells
> programmatically from Python. The compiler has to know which of your
> locals will be used as closure variables by any embedded functions; it
> then stores these specially within your code object, so the MAKE_CLOSURE
> bytecode that creates a function object out of each embedded function
> can create matching closure cells to store in the embedded function
> object. This is the part that you need to add into what Python already
> has, and I'm not sure there's a clean way to do it.


> But you really should learn how all the existing stuff works (the
> inspect docs, the dis module, and the help for the constructors in the
> types module are actually sufficient for this, without having to read
> the C source code, in 3.4 and later) and find out for yourself, because
> if I'm wrong, you may come up with something cool. (Plus, it's fun and
> useful to learn about.)

I'm familiar with most of how Python works, and have even hacked a bit on
ceval.c (for fun).  I haven't played much with the AST side of things, but
I do know generally how Python is put together and works.
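
On the closure cell point, one pure Python workaround I know of is to let
the compiler make the cell and then borrow it, along these lines:

    def make_cell(value):
        # The compiler creates the cell because 'value' is a free
        # variable of the inner function; we just take it from there.
        def inner():
            return value
        return inner.__closure__[0]

    cell = make_cell(10)
    print(cell.cell_contents)    # -> 10

That gives you a cell object, but it doesn't help with wiring new cells
into an already existing frame.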


> That only gets you through the first half of your message--enough to
> make inc_x work as a local function (well, almost--without a "nonlocal
> x" statement it's going to compile x as a local variable rather than a
> closure variable, and however you execute it, you're just going to get
> UnboundLocalError).
>
> What about the second part, where you execute code in an existing
> frame?
>
> That's even trickier.

Yes, I definitely agree.  I think one of the tests of a good idea is that
it makes something that is normally hard (or tricky) simple and easy.  But
actually doing that may be quite hard, or even not possible.

> A frame already has its complete lists of locals,
> cells, and constants at construction time. If your code objects never
> used any new locals or constants, and never touched any nonlocal
> variables that weren't already touched by the calling function, all you
> need is some kind of "relocation" step that remaps the indices compiled
> into the bytecode into the indices in the calling frame (there's enough
> info in the code objects to do the mapping; for actually creating the
> relocated bytecode from the original you'll want something like the
> byteplay module, which unfortunately doesn't exist for 3.4--although I
> have an incomplete port that might be good enough to play with if you're
> interested).

> You can almost get away with the "no new locals or nonlocal cells" part,
> but "no new constants" is pretty restrictive. For example, if you
> compile inc_x into a fragment that can be executed inline, the number 1
> is going to be constant #0 in its code object. And now, you try to
> "relocate" it to run in a frame with a different code object, and
> (unless that different code object happened to refer to 1 as a constant
> as well) there's nothing to match it up to.
>
> And again, I don't see a way around this without an even more drastic
> rearchitecting of how Python frames work--but again, I think it's worth
> looking for yourself in hopes that I'm wrong.

Think of these things as non_local_blocks.  The difference is they would 
use dynamic scope instead of static scope.  Or to put it another way, they 
would inherit the scope they are executed in.
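
A crude way to fake that today is to exec the block against the caller's
namespaces.  (sys._getframe is an implementation detail, and in CPython
writes to f_locals inside a function don't stick, so this is only a sketch
of the idea.)

    import sys

    def run_block(block):
        # "Inherit the scope you are executed in": run the code object
        # against the caller's globals and locals.
        caller = sys._getframe(1)
        exec(block, caller.f_globals, caller.f_locals)

    block = compile("y = x + 1", "<block>", "exec")

    x = 10
    run_block(block)
    print(y)    # -> 11 at module level; inside a function the new
                # binding to y would simply be lost.

That lost-binding problem is part of what would need solving for the
current-frame case.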

> There's another problem: every function body compiled to code ends with
> a return bytecode. If you didn't write one explicitly, you get the
> equivalent of "return None". That means that, unless you solve that in
> some way, executing a fragment inline is always going to return from the
> calling function. And what do you want to do about explicit return? Or
> break and continue?

As a non_local_block, a return might even be a requirement.  It would
return the value to the current location in the current frame.

Break and continue are harder.  They should probably give the same error
as they would if used outside a loop.  A break or continue would need to
be local to the loop, so you can't have a break in a non_local_block
unless the block also has the loop in it.  That keeps the associated
parts local to each other.

> Of the two approaches, I think the first one seems cleaner. If you can
> make a closure cell out of x and then wrap inc_x's code in a normal
> closure that references it, that still feels like Python. (And having to
> make "nonlocal x" explicit seems like a good thing, not a limitation.)

Agree... I picture the first approach as a needed step to get to the second 
part.
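
With nonlocal and an enclosing function, the first approach already works
today; it's just that the pieces are glued together at compile time
instead of being separate objects:

    def make_counter(x=0):
        def inc_x():
            nonlocal x      # explicit, as suggested above
            x += 1
            return x
        return inc_x

    inc = make_counter()
    print(inc(), inc())     # -> 1 2

The missing piece is being able to build that same arrangement from a
separately created signature and body.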

> Fixing up fragments to run in a different frame, and modifying frames at
> runtime to allow them to be fixed up, seems a lot hackier. And the whole
> return issue is pretty serious, too.
>
> One last possibility to consider is something between the two: a
> different kind of object, defined differently, like a proc in Ruby
> (which is defined with a block rather than a def) might solve some of
> the problems with either approach. And stealing from Ruby again, procs
> have their own "mini-frames"; there's a two-level stack where every
> function stack frame has a proc stack frame, which allows a solution to
> the return-value problem that wouldn't be available with either closures
> or fragments.

This may be closer to how I am thinking it would work.  :-)


> (However, note that the return-value problem is much more
> serious in Ruby, where everything is supposed to be an expression, with
> a value; in Python you can just say "fragment calls are statements, so
> they don't have values" if that's what you want.)

It seems to me they can be either.  Python ignores None when it's returned
by a function and not assigned to anything.  And if a value is returned,
then it's returned to the current position in the current frame.  The
return in this case is a non_local_block return.  So I think it wouldn't
be an issue.

>
> One last note inline:
>
>>> A signature object could have a default body that returns the
>>> closure.
>>>
>>> And a body (or code) could have a default signature that *takes* a
>>> namespace.
>>>
>>>
>>> Then a function becomes ...
>>>
>>> code(sig(...))    <--->   function(...)
>>>
>>>
>>>
>>> The separate parts could be created with a decorator.
>>>
>>> @signature
>>> def sig_x(x): pass
>>>
>>> @code
>>> def inc_x(): x += 1
>>>
>>> @code
>>> def dec_x(): x -= 1
>>>
>>>
>>> In most cases it's best to think of applying code bodies to
>>> namespaces.
>>>
>>> names = sig_x(0)
>>> inc_x(names)
>>> dec_x(names)
>>>
>>> That is nicer than continuations as each code block is a well
>>> defined unit that executes to completion and doesn't require
>>> suspending the frame.
>>>
>>> (Yes, it can be done with dictionaries, but that wouldn't give the
>>> macro like functionality (see below) this would.  And there may be
>>> other benefits to having it at a lower, more efficient level.)
>>>
>>>
>>> To allow macro like ability a code block needs to be executable in
>>> the current scope.  That can be done just by doing...
>>>
>>> code(locals())      #  Dependable?
>>>
>>>
>>> And sugar to do that could be...
>>>
>>> if x < 10:
>>>     ^^ inc_x    # just an example syntax
>>> else:
>>>     ^^ dec_x    # Note the ^^ looks like the M in Macro. ;-)
>>>
>>>
>>> Possibly the decorators could be used with lambda directly to get
>>> inline functionality.
>>>
>>> code(lambda : x + 1)

> This is a very different thing from what you were doing above. A
> function that modifies a closure cell's value, like inc_x, can't be
> written as a lambda (because assignments are statements). And this
> lambda is completely pointless if you're going to use it in a context
> where you ignore its return value (like the way you used inc_x above).
> So, I'm not sure what you're trying to do here, but I think you may have
> another problem to solve on top of the ones I already mentioned.

It was an incomplete example.  It should have been...

     add_1_to_x = code(lambda: x + 1)

and then later you could use it in the same way as above.

     x = ^^ add_1_to_x

This is just an example to show how the first option above connects to the
examples below, with "^^: x + 1" being equivalent to "code(lambda: x + 1)".

Which would also be equivalent to ...

    @code
    def add_1_to_x():
        return x + 1

    x = ^^ add_1_to_x



>>> And a bit of sugar to shorten the common uses if needed.
>>>
>>> spam(x + 1, code(lambda : x + 1))
>>>
>>> spam(x + 1, ^^: x + 1)



