[Python-ideas] Access to function objects

Sun Aug 7 20:21:14 CEST 2011

On Aug 7, 2011 6:11 AM, "Guido van Rossum" <guido at python.org> wrote:
>
> On Sun, Aug 7, 2011 at 3:46 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> > On Sat, Aug 6, 2011 at 11:36 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> >> Eric Snow wrote:
> >>> Why not bind the called function-object to the frame locals, rather
> >>> than the one for which the code object was created, perhaps as
> >>> "__function__"?
> >>
> >> I'm afraid I can't interpret this (which may be my ignorance rather
> >> than your fault).
> >
> > No, the fault is likely mine.  But it seems so clear to me. :)
> >
> >> The only guess I can make is based on what you say later:
> >>
> >> "One new implicit name in locals()."
> >>
> >> so I presume you mean that the function should see a local variable
> >> (perhaps called "me", or "this"?) that is bound to itself.
> >
> > One called __function__ (or the like).  A "dunder" name is used to
> > indicate its special nature and limit conflict with existing code.
>
> Without thinking too much about this I like it.
>
> > The function that was called would be bound to that name at function
> > execution time.  Keep in mind that I am talking about the frame
> > locals, not anything stored on the code object nor on the function
> > object.  Not to overdramatize it, but it would happen at the beginning
> > of every call of every function.  I don't know what that overhead
> > would be.
>
> It could be made into a "cell", which is the same way all locals are
> normally represented. This is very fast. Further the code for it could
> be triggered by the appearance of __function__ (if that's the keyword
> we choose) in the function body. I don't really care what happens if
> people use locals() -- that's inefficient and outmoded anyway. (Note
> that!)
>
> >> Presumably if a function wants to use that same name as a local,
> >> nothing bad will happen, since the local assignment will just override
> >> the implicit assignment. But what about code that expects to see a
> >> nonlocal or global with the same name?
>
> That's why a __dunder__ name is used.
>
> >> What happens when two functions, sharing the same code object, get
> >> called from two threads at the same time? Are their locals independent?
> >
> > I'm afraid I don't know.  I expect that each would get executed in
> > separate execution frames, and so have separate frame locals.
>
> The frames are completely independent. They all point to the same code
> object and under the proposal they will all point to the same function
> object. I see no problems here except self-inflicted, like using
> __function__ to hold state that can't be accessed concurrently safely;
> note that recursive invocations have the same issue. I see it as a
> non-problem.
>
> >> For most uses, standard recursion via the name is good enough, it's
> >> only a few corner cases where self-reflection (as I call it) is needed.
>
> Right. If it were expected that people would start writing recursive
> calls using __function__ routinely, in situations where a name
> reference works, I'd be very unhappy with the new feature. (And if
> someone wants to make the argument that recursive calls using
> __function__ are actually better in some way I am willing to
> filibuster.)
>
> >> And I say that as somebody who does want a way for functions to know
> >> themselves. I don't think that use-case is so important that it should
> >> be implicitly added to every function, on the off-chance it is needed,
> >> rather than explicitly on demand.
> >
> > For me the use case involves determining what function called my
> > function.  Currently you can tell in which execution frame a function
> > was called, and thereby which code object, but reliably matching that
> > to a function is not so simple.  I recognize that my case is likely
> > not a general one.
>
> But it is a nice one. It solves some issues that pdb currently solves
> by just using a file/line reference.
>
> >>> To finish things off, bind to every new code object
> >>> the function for which it was created, perhaps as "co_func".  That way
> >>> you will always know what function object was called and which one the
> >>> code object came from originally.
> >>
> >> What benefit will this give? Have you ever looked at a code object and
> >> said, "I need a way of knowing which function this is from?" If so, I'd
> >> love to know what problem you were trying to solve at the time!
> >
> > You caught me! :)  I don't already have a use case for this part.  I
> > had only considered that without this you could not determine where a
> > code object came from, or if a function had borrowed another's code
> > object.  This is certainly only useful in the case that one function
> > is using the code object of another, which we have all agreed is not
> > that common.  However, with a co_func I felt that all the bases would
> > be covered.
>
> Ah, but you can't do that! There are many situations where a single
> code object is used to create many different function objects. E.g.
> every time you have a nested function. Also the code object is
> immutable. This part is all carefully considered and should be left
> alone.

From all the responses it's apparent I have not communicated the idea well.
The idea is for the *called* function object to be bound to __function__.
Here's an example:

def g():
    def f():
        return
    return f

f1 = g()
f1()
f2 = g()
f2()

(Thanks for pointing out that f1 and f2 share a code object.)
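
For anyone following along, the sharing is easy to check in the interpreter.
This is only a sketch of current behaviour, using the __code__ attribute
that function objects already have; nothing here depends on the proposal:

```python
def g():
    def f():
        return
    return f

f1 = g()
f2 = g()

# Each call to g builds a new function object ...
print(f1 is f2)                    # False
# ... but both wrap the single code object compiled for the body of f.
print(f1.__code__ is f2.__code__)  # True
```

So a code object alone cannot say whether f1 or f2 was the function called.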

In the call to g, __function__ would be bound to g in the function body.  In
the call to f1, __function__ would be f1 (in f_locals).  And in the call to
f2, __function__ would be f2.  Also, while __function__ is not used in this
example, it would still be there in the frame locals.  Otherwise a function
called by f would be unable to reach it (via inspect.stack() and the like).
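
To make the gap concrete, here is a rough sketch of what a callee can see
today.  It assumes CPython, since sys._getframe is an implementation detail,
and who_called_me is a name made up for the illustration.  The caller's frame
leads to the code object, but nothing in it says whether f1 or f2 was called,
and there is no __function__ in its locals because the proposal is not
implemented:

```python
import sys

def who_called_me():
    # Frame of the immediate caller (CPython-specific helper).
    caller = sys._getframe(1)
    # Copy the locals so the snapshot outlives the frame.
    return caller.f_code, dict(caller.f_locals)

def g():
    def f():
        return who_called_me()
    return f

f1 = g()
f2 = g()
code1, locals1 = f1()
code2, locals2 = f2()

# The callee sees the same code object for both callers.
print(code1 is code2)             # True
# No implicit name exists today; this is what the proposal would add.
print("__function__" in locals1)  # False
```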

Also, at definition time the original function would be bound as co_func.
It would not be associated with __function__, except indirectly.  This is so
that you can tell the difference between the original function and one that
is using the code object of the first.

-eric

>
> >> Code objects don't always get created as part of a function. They can
> >> be returned by compile. What should co_func be set to then?
> >
> > None, since there was no function object created along with the code
> > object.  Same with generator expressions.
>
> Just forget this part.
>
> >> Finally, if the function has a reference to the code object, and the
> >> code object has a reference to the function, you have a reference
> >> cycle. That's not the end of the world now as it used to be, in early
> >> Python before the garbage collector was added, but still, there better
> >> be a really good use-case to justify it.
> >>
> >> (Perhaps a weak reference might be more appropriate?)
> >
> > Good point.
> >
> > Mostly I am trying to look for an angle that works without a lot of
> > trouble.  Can't fault me for trying in my own incoherent way. :)
>
> On the rest of that rejected PEP:
>
> - I'm not actually sure how easy it is to implement the setting of
> __function__ when the frame is created. IIRC the frame creation is
> rather far removed from the function object, as there are various
> cases where there is no function object (class bodies, module-level
> code) and in other cases the function is called via a bound method.
> Someone should write a working patch to figure out if this is a
> problem in practice.
>
> - The primary use case for __function__ seems to me to be accessing
> function attributes, but I'm not sure what's wrong with referencing
> these via the function name. Maybe it's when there's a method involved,
> since then you'd have to write <classname>.<methodname>.<attrname>.
>
> - It seems that the "current class" question has already been solved
> for super(). If more is needed I'd be okay with extending the
> machinery used by super() so that you can access the magic "current
> class" variable explicitly too.
>
> - For "current module" I've encountered a number of use cases, mostly
> having to do with wanting to define new names dynamically. Somehow I
> have found:
>
>  globals()[x] = y  # Note that x is a variable, not a literal
>
> cumbersome; I'd rather write:
>
>  setattr(__this_module__, x, y)
>
> There are IIRC also some use cases where an API expects a module
> object (or at least something whose attributes it can set and/or get)
> and passing the current module is clumsy:
>
>  foo(sys.modules[__name__])
>
> On the whole these use cases are all fairly weak though and I would
> give it a +0 at best. But rather than a prolonged discussion of the
> merits and use cases, I strongly recommend that somebody tries to come
> up with a working implementation and we'll strengthen the PEP from
> there.
>
> --
> --Guido van Rossum (python.org/~guido)
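
On the "current class" point above: in Python 3 the compiler already creates
an implicit __class__ cell when a method body mentions super or __class__,
which looks a lot like the trigger-on-appearance mechanism suggested for
__function__.  A small sketch of that existing behaviour (the class and
method names are invented for the example):

```python
class Base:
    def describe(self):
        return "Base"

class Child(Base):
    def describe(self):
        # __class__ is the implicit cell that zero-argument super() relies on.
        # It names the class this method was defined in, not type(self).
        return __class__.__name__ + " < " + super().describe()

class GrandChild(Child):
    pass

print(GrandChild().describe())  # Child < Base
```

Note that the result names Child even though the instance is a GrandChild,
which is the same "where was I defined" question __function__ tries to answer
for functions.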