PEP thought experiment: Unix style exec for function/method calls

Mon Jun 26 15:47:29 EDT 2006

This reminds me of a silly little optimization I used to use all the 
time when coding in assembler on PIC MCUs.
A call followed by a return can be turned into a jump. That saves one 
instruction and one level on the call stack.
I think most optimizing compilers already do something of this sort, at 
least in the embedded world :)
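
For what it's worth, the same "call followed by return becomes a jump" trick can be imitated by hand in Python with a small driver loop (often called a trampoline). This is only a sketch in Python 3: TailCall, trampoline and countdown are names made up for illustration, not anything from the standard library.

```python
class TailCall:
    """Hypothetical marker meaning 'run this next instead of returning to me'."""
    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs


def trampoline(func, *args, **kwargs):
    """Call func; while it hands back a TailCall, jump to that call instead."""
    result = func(*args, **kwargs)
    while isinstance(result, TailCall):
        result = result.func(*result.args, **result.kwargs)
    return result


def countdown(n, steps=0):
    if n == 0:
        return steps
    # Rather than `return countdown(n - 1, steps + 1)` (a call, then a return),
    # hand the next call back to the driver so the stack never grows.
    return TailCall(countdown, n - 1, steps + 1)


print(trampoline(countdown, 100000))
```

Each step returns to the driver before the next one starts, so the Python stack stays shallow however large n is, where the plain recursive version would hit the recursion limit.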

Jeethu Rao

Michael wrote:
> Hi,
>
>
> [ I'm calling this PEP thought experiment because I'm discussing language
>   ideas for python which if implemented would probably be quite powerful
>   and useful, but the increased risk of obfuscation when the ideas are
>   used outside my expected/desired problem domain probably massively
>   outweigh the benefits. (if you're wondering why, it's akin to adding
>   a structured goto with context)
>
>   However I think as a thought experiment it's quite useful, since any
>   language feature can be implemented in different ways, and I'm wondering
>   if anyone's tried this, or if it's come up before (I can't find either
>   if they have...). ]
>
> I'm having difficulty finding any previous discussion on this --  I
> keep finding people either having problems calling os.exec(lepev), or
> with using python's exec statement. Neither of which I mean here.
>
> Just for a moment, let's just take one definition for one of the
> os.exec* commands:
>
>     execv(...)
>         execv(path, args)
>
>         Execute an executable path with arguments, replacing current
>         process.
>             path: path of executable file
>             args: tuple or list of strings
>
> Also: Note that execv inherits the system environment.
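
To see the "replacing current process" behaviour in action, here is a small sketch, assuming a POSIX system and Python 3.7 or later. The exec happens in a child interpreter so that the script running the demo is not itself replaced; CHILD and run_child are hypothetical names used only for this illustration.

```python
import subprocess
import sys

# Source for a child interpreter that replaces itself with another program.
CHILD = r'''
import os, sys
sys.stdout.write("before exec\n")
sys.stdout.flush()  # flush first: unflushed buffers vanish with the old image
os.execv(sys.executable, [sys.executable, "-c", "print('replaced')"])
print("We don't reach here")
'''


def run_child():
    """Run CHILD in a separate process and return its output lines."""
    done = subprocess.run([sys.executable, "-c", CHILD],
                          capture_output=True, text=True)
    return done.stdout.splitlines()


print(run_child())
```

On a POSIX system the line after os.execv never runs, so only the first message and the replacement program's output come back.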
>
> Suppose we could do the same for a python function - suppose we could
> call the python function but either /without/ creating a new stack
> frame or /replacing/ the current stack frame with the new one.
>
> Anyway, I've been thinking recently that the same capability in python
> would be useful. However, almost any possible language feature:
>    * Has probably already been discussed to death in the past
>    * There's often a nice idiom working around the lack of said feature.
>
> So I'm more on an exploratory forage than asking for a language change
> here ;)
>
> Since os.exec* exists and "exec" already exists in python, I need to
> differentiate what I mean by a unix style exec from python. So for
> convenience I'll call it "cexe".
>
> Now, suppose I have:
>     ----------
>     def set_name():
>         name = raw_input("Enter your name! > ")
>         cexe greet()
>
>     def greet():
>         print "hello", name
>
>     cexe set_name()
>     print "We don't reach here"
>     ----------
>
> This would execute, ask for the user's name, say hello to them and then
> exit - not reaching the final print "We don't reach here" statement.
>
> Let's ignore for the moment that this example sucks (and is a good example
> of the danger of this as a language feature), what I want to do here is
> use this to explain the meaning of "cexe".
>
> There are two cases to consider.
>
> Case 1:
>   cexe some_func_noargs()
>
>     This transfers execution to the function that would normally be called
>     if I simply wrote some_func_noargs() without "cexe". However, unlike
>     a function call, we're /replacing/ the current thread of execution
>     with the thread of execution in some_func_noargs(), rather than
>     stacking the current location in order to come back to it later.
>
>     ie, in the above this could also be viewed as "call without creating a
>     new return point" or "call without bothering to create a new stack
>     frame".
>
>     This last point is why, in the above example, "name" leaks between the
>     two functions - greet() is entered via cexe rather than a normal call.
>
> Case 2:
>   given...
>      def some_func_withargs(colour, tone, *listopts, **dictopts):
>
>   consider...
>      cexe some_func_withargs(foo,bar, *argv, **argd)
>
>      This would be much the same as the previous case, except in the new
>      execution point, the names colour & tone map to the values foo & bar had
>      in the original context, whilst listopts and dictopts map to the values
>      that argv & argd had in the original context.
>
> One consequence here though is that in actual practice the final print
> statement of the code above never actually gets executed. (Much like if
> that was inside a function, writing something after "return foo", wouldn't
> be executed)
>
> The reason I'm curious here about previous discussion is because
> conceptually there's obviously other semantics you can apply - such as
> the current stack frame is /replaced/ by the new stack frame. This is
> perhaps a more accurate mapping to the Unix exec call. 
>
> If that was the case, it would mean that locals would not "leak" between
> functions (which is desirable), and our example above could be rewritten
> as follows:
>
>     ----------
>     def get_and_use_value_from_user(tag, callforward):
>         somevalue = raw_input(tag)
>         cexe callforward(somevalue)
>
>     def greet(name):
>         print "hello", name
>
>     cexe get_and_use_value_from_user("Enter your name! > ", greet)
>     print "We don't reach here"
>     ----------
>
> OK, so this probably seems pretty pointless to many people, but I'm
> curious about improving the tools to deal with state machines. Often
> people use switch statements in other languages to deal with them, and
> for certain classes of state machines you can replace them with
> generators. But that's not appropriate for everything...
>
> My particular thought that started all this off actually stems from this:
>
> Essentially by doing a cexe we're actually creating a composite function
> out of disparate functions (with or without a shared local context).
> ie ...
>     ----------
>     def count():
>         print "Counting to 3!"
>         cexe one()
>
>     def one():
>         print "one!"
>         cexe two()
>
>     def two():
>         print "two!"
>         cexe three()
>
>     def three():
>         print "three!"
>
>     count() # Note I'm not doing cexe count() here
>     ----------
> ... essentially dynamically constructs an execution context similar to a
> single function, ie the above collapses to something like:
>
>     ----------
>     def count():
>         print "Counting to 3!"
>         print "one!"
>         print "two!"
>         print "three!"
>
>     count() # Note I'm not doing cexe count() here
>     ----------
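
One way to approximate the chained example above in current Python, without any new syntax, is to have each function return the next function instead of calling it, and let a small loop do the jumping. A minimal Python 3 sketch, in which `return one` stands in for `cexe one()`; the out list and run_chain are hypothetical helpers added so the result can be inspected.

```python
def count(out):
    out.append("Counting to 3!")
    return one            # stands in for `cexe one()`


def one(out):
    out.append("one!")
    return two


def two(out):
    out.append("two!")
    return three


def three(out):
    out.append("three!")
    return None           # nothing to jump to, so the chain ends


def run_chain(start):
    """Run start, then whatever each step hands back, one frame at a time."""
    out = []
    step = start
    while step is not None:
        step = step(out)
    return out


print("\n".join(run_chain(count)))
```

Only one of these functions is ever on the stack at a time, which is roughly the "collapsed into one function" effect described above, minus the shared locals.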
> It's this recognition that made me wonder this:
>
> This works well for state machines, and generators are a nice model for
> dealing with resumable things (and a state machine can be viewed as a
> resumable "thing").
>
> Now suppose we take all that one stage further and provide said
> composite generator, with some additional context in the way we do
> with Kamaelia - cf http://kamaelia.sf.net/MiniAxon/ , we could
> potentially do this:
>
> (choosing something relatively substantial to show I'm not just
>  being whimsical, and to provide something perhaps more "real")
>
> class TCP_StateMachine(Axon.Component.component):
>     def CLOSED(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "appl passive open" == event.type: cexe self.LISTEN()
>        if "active open" == event.type:
>            self.send(SYN(event.payload), "network")
>            cexe self.SYN_SENT()
>
>     def LISTEN(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "recv syn" == event.type:
>            self.send(   , "network")
>            cexe self.SYN_RCVD()
>        if "appl send data" == event.type:
>            self.send(   , "network")
>            cexe self.SYN_SENT()
>
>     def SYN_RCVD(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "recv rst" == event.type:  cexe self.LISTEN()
>        if "recv ack" == event.type:  cexe self.ESTABLISHED()
>        if "appl close" == event.type:
>            self.send(FIN(event.payload), "network")
>            cexe self.FIN_WAIT_1()
>
>     def SYN_SENT(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "appl close" == event.type: cexe self.CLOSED()
>        if "timeout" == event.type: cexe self.CLOSED()
>        if "recv syn-ack" == event.type:
>            self.send(ACK(event.payload), "network")
>            cexe self.ESTABLISHED()
>
>     def ESTABLISHED(self):
>        # more complex than others, so skipped, has its own data transfer
>        # state etc, so would make more sense to model as a subcomponent.
>
>     def FIN_WAIT_1(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "recv ack" == event.type: cexe self.FIN_WAIT_2()
>
>        if "recv fin" == event.type:
>            self.send(ACK(event.payload), "network")
>            cexe self.CLOSING()
>
>        if "recv fin, ack" == event.type:
>            self.send(ACK(event.payload), "network")
>            cexe self.TIME_WAIT()
>
>     def FIN_WAIT_2(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "recv fin" == event.type:
>            self.send(ACK(event.payload), "network")
>            cexe self.TIME_WAIT()
>
>     def CLOSING(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "recv ack" == event.type: cexe self.TIME_WAIT()
>
>     def TIME_WAIT(self):
>        if not self.anyReady(): yield self.pause()
>        event = self.recv("inbox")
>        if "timeout 2MSL" == event.type: cexe self.CLOSED()
>
> Now obviously that's not particularly pretty, but the clear definition
> of states as methods, and clear transitions between states via the cexe
> calls, is relatively easy to follow through. ie it's fairly clear it's
> implementing the standard TCP state machine.
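
Something close to the states-as-methods layout above can be sketched in Python 3 without "cexe", assuming generator delegation (`yield from`) is available: each state is a generator that yields to pause and returns the next state, and a driver generator delegates to whichever state is current. This is a cut-down illustration only. TinyTCP, wait_event and main are hypothetical names, events are plain strings rather than objects with a type, and neither Axon nor a real TCP stack is involved.

```python
from collections import deque


class TinyTCP:
    """Hypothetical fragment of the state machine, for illustration only."""

    def __init__(self):
        self.inbox = deque()
        self.trace = []      # records which states were entered

    def wait_event(self):
        # Pause (yield) until an event is available, then hand it back.
        while not self.inbox:
            yield "paused"
        return self.inbox.popleft()

    def CLOSED(self):
        self.trace.append("CLOSED")
        event = yield from self.wait_event()
        if event == "appl passive open":
            return self.LISTEN          # stands in for `cexe self.LISTEN()`
        return None

    def LISTEN(self):
        self.trace.append("LISTEN")
        event = yield from self.wait_event()
        if event == "recv syn":
            return self.SYN_RCVD
        return None

    def SYN_RCVD(self):
        self.trace.append("SYN_RCVD")
        event = yield from self.wait_event()
        if event == "recv rst":
            return self.LISTEN
        if event == "recv ack":
            return self.ESTABLISHED
        return None

    def ESTABLISHED(self):
        self.trace.append("ESTABLISHED")
        yield "established"             # hand control back once, then stop
        return None

    def main(self):
        # One generator from the caller's point of view; the state
        # methods are spliced in one after another.
        state = self.CLOSED
        while state is not None:
            state = yield from state()


machine = TinyTCP()
runner = machine.main()
for event in ["appl passive open", "recv syn", "recv rst",
              "recv syn", "recv ack"]:
    machine.inbox.append(event)
    next(runner)                        # resume until the next pause
print(machine.trace)
```

From the scheduler's side main() looks like a single resumable generator, while each transition discards the old state's frame before the next state starts.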
>
> (Incidentally if you're wondering what relevance this has outside of
> just TCP, this sort of thing could be useful in games for modelling
> complex behaviours)
>
> What is less clear about this is that I'm working on the assumption
> that, as well as making "cexe" work, the language change also allows
> the above set of methods to be treated as if they were one large
> generator that's split over multiple function definitions. This is
> conceptually very similar to the idea that cexe would effectively
> "join" functions together, as alluded to above.
>
> This has a number of downsides for the main part of the language, so
> I wouldn't suggest that these changes actually happen - consider it a
> thought experiment if you like. (I think the single function/no wrapping
> of yield IS actually a good thing)
>
> However, I feel the above example is quite a compelling example of how
> a unix style exec for python method calls could be useful, especially
> when combined with generators. (note this is a thought experiment ;)
>
> It also struck me that any sufficiently interesting idea is likely to
> have already been implemented, though perhaps not looking quite like the
> above, so I thought I'd ask the questions:
>
>   * Has anyone tried this sort of thing?
>
>   * Has anyone tried simply not creating a new stack frame when doing
>     a function call in python? (or perhaps replacing the current one with
>     a new one)
>
>   * Has anyone else tried modelling the unix system exec function in
>     python? If so what did you find?
>
>   * Since I can't find anything in the archives, I'm presuming my
>     searching abilities are bust today - can anyone suggest any better
>     search terms or threads to look at?
>
>   * Am I mad? :)
>
> BTW, I'm aware that this has similarities to call with continuation,
> and that you can use statesaver.c & generators to achieve something
> vaguely similar to continuations, but I'm more after this specific
> approach, rather than that general approach. (After all, even ruby
> notes that their most common use for call/cc is to obfuscate code -
> often accidentally - and I'm not particularly interested in that :)
>
> Whereas the unix style exec is well understood by many people, and
> when it's appropriate can be extremely useful. My suspicion is that
> my ideas above actually map to a common idiom, but I'm curious to
> find that common idiom.
>
> I'm fairly certain something like this could be implemented using
> greenlets, and also fairly certain that Stackless has been down this
> route in the past, but I'm not able to find something like this exec
> style call there. (Which is after all more constrained than your usual
> call with continuation approach)
>
> So, sorry for the length of this, but if anyone has any thoughts, I'd be
> very interested. If they don't, I hope it was interesting :)
>
> Regards,
>
>
> Michael.
>
>