[Python-ideas] Tulip / PEP 3156 - subprocess events

Sun Jan 20 07:13:39 CET 2013

On Sun, Jan 20, 2013 at 2:35 PM, Guido van Rossum <guido at python.org> wrote:
> TBH I don't see the protocol implementation getting any simpler
> because of this. There is some protocol initialization code that
> doesn't depend on the transport, and some that does. Using your
> approach, these all go in __init__(). Using the PEP's current
> proposal, the latter go in a separate method, connection_made().

When the two are separated without a clear definition of what else can
happen in between, *every other method on the protocol* needs to cope
with the possibility of being called after __init__ but before
connection_made - you simply can't write a protocol without dealing
with that problem.
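
To make that concern concrete, here is a minimal sketch of the defensive
code it implies. The class name GuardedProtocol and its "received" list
are hypothetical, invented purely for illustration; only __init__,
connection_made and data_received correspond to names used in this
discussion.

```python
class GuardedProtocol:
    """Hypothetical protocol whose setup is split across two calls."""

    def __init__(self):
        # Transport-independent setup happens here.
        self.transport = None
        self.received = []

    def connection_made(self, transport):
        # Transport-dependent setup happens here, possibly much later.
        self.transport = transport

    def data_received(self, data):
        # Without a guarantee about call ordering, this method has to
        # check for being invoked before connection_made.
        if self.transport is None:
            raise RuntimeError("data_received called before connection_made")
        self.received.append(data)
```

Every additional method would need the same kind of check unless the
ordering is guaranteed by the event loop.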

As you correctly figured out, my specific proposal was to move from:

    protocol = protocol_factory()
    protocol.connection_made(transport)

To a single event:

    protocol = protocol_factory(transport)

The *reason* I wanted to do this is that I *don't understand* what may
happen to my protocol implementation between construction and the call
to connection_made.

Your description of the current implementation actually worries me, as
it suggests to me that when I get a (transport, protocol) pair back
from a call to "create_connection", "connection_made" may *not* have
been called yet - the protocol may be in exactly the state I am
worried about, because the event loop is sending the notification in a
fire-and-forget fashion, instead of waiting until the call is
complete:

    protocol = protocol_factory()
    loop.call_soon(protocol.connection_made, transport)
    # The protocol isn't actually fully initialized here...
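
The window described above can be reproduced with a small sketch. It
uses the asyncio module as it exists in current Python rather than the
Tulip prototype under discussion, and RecordingProtocol,
fire_and_forget_connect and the bare object() standing in for a
transport are all hypothetical names made up for the example.

```python
import asyncio


class RecordingProtocol:
    """Hypothetical protocol that just remembers its transport."""

    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport


async def fire_and_forget_connect(protocol_factory, transport):
    # Mirrors the snippet above: schedule the notification, don't wait.
    loop = asyncio.get_running_loop()
    protocol = protocol_factory()
    loop.call_soon(protocol.connection_made, transport)
    return transport, protocol


async def main():
    fake_transport = object()  # stand-in, not a real transport
    _, protocol = await fire_and_forget_connect(
        RecordingProtocol, fake_transport)
    # connection_made is scheduled but has not run yet.
    before = protocol.transport is not None
    # Yield to the event loop once so the scheduled callback can run.
    await asyncio.sleep(0)
    after = protocol.transport is not None
    return before, after


before, after = asyncio.run(main())
```

Here "before" is False and "after" is True: the caller gets the protocol
back while it is still only half initialized.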

However, that description also made me realise why two distinct
operations are needed, so I'd like to change my suggestion to the
following:

    protocol = protocol_factory()
    yield from protocol.connection_made(transport) # Or callback equivalent

The protocol factory would still be used to create the protocol
object. However, the PEP would be updated to make it clear that
immediately after creation the *only* permitted method invocation on
the result is "connection_made", which will complete the protocol
initialization process.

The connection_made event handler would be redefined to return a
*Future* (or equivalent object) rather than completing synchronously.
create_connection would then call connection_made and *wait for it to
finish*, rather than using call_soon in a fire-and-forget fashion.

The advantage of this is that the rationale for each of the possible
states becomes clear:

- the protocol factory is invoked synchronously, and is thus not
allowed to perform any blocking actions (but may trigger
"fire-and-forget" operations)
- connection_made is invoked asynchronously, and is thus able to wait
for various operations
- a protocol returned from create_connection is certain to have had
connection_made already called, thus a protocol implementation may
safely assume in other methods that both __init__ and connection_made
will have been called during the initialization process.
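
A rough sketch of the three guarantees listed above, written with the
later async/await syntax in place of "yield from". This illustrates the
suggestion only; it is not how asyncio behaves, where connection_made is
an ordinary synchronous callback. HandshakeProtocol,
create_connection_sketch and the object() used as a transport are
hypothetical names.

```python
import asyncio


class HandshakeProtocol:
    """Hypothetical protocol with an asynchronous connection_made."""

    def __init__(self):
        # Synchronous, non-blocking setup only.
        self.transport = None
        self.ready = False

    async def connection_made(self, transport):
        self.transport = transport
        # Stand-in for an asynchronous setup step (e.g. a handshake).
        await asyncio.sleep(0)
        self.ready = True


async def create_connection_sketch(protocol_factory, transport):
    # The factory is invoked synchronously...
    protocol = protocol_factory()
    # ...and connection_made is awaited instead of fired and forgotten.
    await protocol.connection_made(transport)
    return transport, protocol


async def main():
    _, protocol = await create_connection_sketch(
        HandshakeProtocol, object())
    # By the time the pair is returned, initialization is complete.
    return protocol.ready


ready_on_return = asyncio.run(main())
```

With this shape, code that receives the (transport, protocol) pair never
observes the protocol between __init__ and connection_made.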

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia