[Python-ideas] The async API of the future: PEP 3153 (async-pep)

Guido van Rossum guido at python.org
Sat Oct 13 01:22:39 CEST 2012


[Hopefully this is the last spin-off thread from "asyncore: included
batteries don't fit"]

[LvH]
>> > If there's one take away idea from async-pep, it's reusable protocols.

[Guido]
>> Is there a newer version than what's on
>> http://www.python.org/dev/peps/pep-3153/ ? It seems to be missing any
>> specific proposals, after spending a lot of time giving a rationale
>> and defining some terms. The version on
>> https://github.com/lvh/async-pep doesn't seem to be any more complete.

[LvH]
> Correct.

So it's totally unfinished?

> If I had to change it today, I'd throw out consumers and producers and just
> stick to a protocol API.
>
> Do you feel that there should be less talk about rationale?

No, but I feel that there should be some actual specification. I am
also looking forward to an actual meaty bit of example code -- ISTR
you mentioned you had something, but that it was incomplete, and I
can't find the link.

>> > The PEP should probably be a number of PEPs. At first sight, it seems
>> > that this number is at least four:
>> >
>> > 1. Protocol and transport abstractions, making no mention of
>> > asynchronous IO
>> > (this is what I want 3153 to be, because it's small, manageable, and
>> > virtually everyone appears to agree it's a fantastic idea)
>>
>> But the devil is in the details. *What* specifically are you
>> proposing? How would you write a protocol handler/parser without any
>> reference to I/O? Most protocols are two-way streets -- you read some
>> stuff, and you write some stuff, then you read some more. (HTTP may be
>> the exception here, if you don't keep the connection open.)
>
> It's not that there's *no* reference to IO: it's just that that reference is
> abstracted away in data_received and the protocol's transport object, just
> like Twisted's IProtocol.

The words "data_received" don't even occur in the PEP.
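
For the record, here's roughly what I understand a transport-agnostic
protocol to look like (a minimal sketch; the method names are modeled
on Twisted's IProtocol and on your description, not on anything
actually written down in the PEP):

class EchoLineProtocol:
    """Knows nothing about sockets or file descriptors."""

    def connection_made(self, transport):
        # The transport is the protocol's only handle on I/O.
        self.transport = transport
        self._buffer = b""

    def data_received(self, data):
        # Called with whatever bytes happen to arrive, in whatever chunks.
        self._buffer += data
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            self.transport.write(line + b"\n")

    def connection_lost(self, exc):
        pass

The point being that the same class works unchanged whether the bytes
come from a reactor, a blocking loop, or a test harness.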

>> > 2. A base reactor interface
>>
>> I agree that this should be a separate PEP. But I do think that in
>> practice there will be dependencies between the different PEPs you are
>> proposing.
>
> Absolutely.
>
>> > 3. A way of structuring callbacks: probably deferreds with a built-in
>> > inlineCallbacks for people who want to write synchronous-looking code
>> > with
>> > explicit yields for asynchronous procedures
>>
>> Your previous two ideas sound like you're not tied to backward
>> compatibility with Tornado and/or Twisted (not even via an adaptation
>> layer). Given that we're talking Python 3.4 here that's fine with me
>> (though I think we should be careful to offer a path forward for those
>> packages and their users, even if it means making changes to the
>> libraries).
>
> I'm assuming that by previous ideas you mean points 1, 2: protocol interface
> + reactor interface.

Yes.

> I don't see why twisted's IProtocol couldn't grow an adapter for stdlib
> Protocols. Ditto for Tornado. Similarly, the reactor interface could be
> *provided* (through a fairly simple translation layer) by different
> implementations, including twisted.

Right.
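
For illustration, the Twisted side of such an adapter could be mostly
mechanical (a sketch; the connection_made/data_received/connection_lost
names are the assumed stdlib-style interface from above, not an
agreed-upon API):

from twisted.internet.protocol import Protocol

class StdlibProtocolAdapter(Protocol):
    """Presents a stdlib-style protocol object to Twisted."""

    def __init__(self, wrapped):
        # 'wrapped' is assumed to implement connection_made(transport),
        # data_received(data) and connection_lost(exc).
        self.wrapped = wrapped

    def connectionMade(self):
        # Twisted has already set self.transport by the time this runs.
        self.wrapped.connection_made(self.transport)

    def dataReceived(self, data):
        self.wrapped.data_received(data)

    def connectionLost(self, reason):
        self.wrapped.connection_lost(reason)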

>> But Twisted Deferred is pretty arcane, and I would much
>> rather not use it as the basis of a forward-looking design. I'd much
>> rather see what we can mooch off PEP 3148 (Futures).
>
> I think this needs to be addressed in a separate mail, since more stuff has
> been said about deferreds in this thread.

Yes, that's in the thread with subject "The async API of the future:
Twisted and Deferreds".

>> > 4+ adapting the stdlib tools to using these new things
>>
>> We at least need to have an idea for how this could be done. We're
>> talking serious rewrites of many of our most fundamental existing
>> synchronous protocol libraries (e.g. httplib, email, possibly even
>> io.TextIOWrapper), most of which have had only scant updates even
>> through the Python 3 transition apart from complications to deal with
>> the bytes/str dichotomy.
>
> I certainly agree that this is a very large amount of work. However, it has
> obvious huge advantages in terms of code reuse. I'm not sure if I understand
> the technical barrier though. It should be quite easy to create a blocking
> API with a protocol implementation that doesn't care; just call
> data_received with all your data at once, and presto! (Since transports in
> general don't provide guarantees as to how bytes will arrive, existing
> Twisted IProtocols have to do this already anyway, and that seems to work
> fine.)

Hmm... I guess that depends on how your legacy code works. As Barry
mentioned somewhere, the email package's FeedParser is an attempt at
implementing this -- but he sounded like he has doubts that it works
as-is in an async environment.
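
To make sure we mean the same thing: the trick you describe would
presumably look something like this (a sketch only; _BlockingTransport
and run_blocking are names I'm making up here, and the protocol methods
are the assumed ones from above):

import socket

class _BlockingTransport:
    """Minimal write-only transport over a plain blocking socket."""

    def __init__(self, sock):
        self._sock = sock

    def write(self, data):
        self._sock.sendall(data)

def run_blocking(host, port, protocol, request=b""):
    # 'protocol' is assumed to have connection_made(), data_received()
    # and connection_lost() as sketched earlier in this thread.
    sock = socket.create_connection((host, port))
    try:
        protocol.connection_made(_BlockingTransport(sock))
        if request:
            sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            # The protocol can't tell whether these bytes came from a
            # reactor callback or from this blocking loop.
            protocol.data_received(chunk)
        protocol.connection_lost(None)
    finally:
        sock.close()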

However I am more worried about pull-based APIs. Take (as an extreme
example) the standard stream API for reading, especially
TextIOWrapper. I could see how we could turn the *writing* APIs async
easily enough, but I don't see how to do it for the reading end -- you
can't seriously propose to read the entire file into the buffer and
then satisfy all reads from memory.

>> > Re: forward path for existing asyncore code. I don't remember this being
>> > raised as an issue. If anything, it was mentioned in passing, and I think
>> > the answer to it was something to the tune of "asyncore's API is broken,
>> > fixing it is more important than backwards compat". Essentially I agree with
>> > Guido that the important part is an upgrade path to a good third-party
>> > library, which is the part about asyncore that REALLY sucks right now.
>>
>> I have the feeling that the main reason asyncore sucks is that it
>> requires you to subclass its Dispatcher class, which has a rather
>> treacherous interface.
>
> There's at least a few others, but sure, that's an obvious one. Many of the
> objections I can raise however don't matter if there's already an *existing
> working solution*. I mean, sure, it can't do SSL, but if you have code that
> does what you want right now, then obviously SSL isn't actually needed.

I think you mean this as an indication that providing the forward path
for existing asyncore apps shouldn't be rocket science, right? Sure, I
don't want to worry about that, I just want to make sure that we don't
*completely* paint ourselves into the wrong corner when it comes to
that.

>> > Regardless, an API upgrade is probably a good idea. I'm not sure if it
>> > should go in the first PEP: given the separation I've outlined above (which
>> > may be too spread out...), there's no obvious place to put it besides it
>> > being a new PEP.
>>
>> Aren't all your proposals API upgrades?
>
> Sorry, that was incredibly poor wording. I meant something more of an
> adapter: an upgrade path for existing asyncore code to new and shiny 3153
> code.

Yes, now it makes sense.
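
Such an upgrade shim doesn't look like rocket science either --
something along these lines (a rough sketch; asyncore.dispatcher is
real, the protocol methods are the assumed new-style ones from above,
and real code would have to buffer writes properly):

import asyncore

class DispatcherTransport(asyncore.dispatcher):
    """Drives a new-style protocol from an existing asyncore loop."""

    def __init__(self, sock, protocol):
        asyncore.dispatcher.__init__(self, sock)
        self.protocol = protocol
        self.protocol.connection_made(self)

    # What the protocol sees (the transport side).
    def write(self, data):
        self.send(data)  # sketch only: ignores partial sends

    # What asyncore sees (the dispatcher side).
    def handle_read(self):
        data = self.recv(4096)
        if data:
            self.protocol.data_received(data)

    def handle_close(self):
        self.close()
        self.protocol.connection_lost(None)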

>> > Re base reactor interface: drawing maximally from the lessons learned in
>> > twisted, I think IReactorCore (start, stop, etc), IReactorTime (call later,
>> > etc), asynchronous-looking name lookup, fd handling are the important
>> > parts.
>>
>> That actually sounds more concrete than I'd like a reactor interface
>> to be. In the App Engine world, there is a definite need for a
>> reactor, but it cannot talk about file descriptors at all -- all I/O
>> is defined in terms of RPC operations which have their own (several
>> layers of) async management but still need to be plugged in to user
>> code that might want to benefit from other reactor functionality such
>> as scheduling and placing a call at a certain moment in the future.
>
> I have a hard time understanding how that would work well outside of
> something like GAE. IIUC, that level of abstraction was chosen because it
> made sense for GAE (and I don't disagree), but I'm not sure it makes sense
> here.

I think I answered this in the reactors thread -- I propose an I/O
object abstraction that is not directly tied to a file descriptor, but
for which a concrete implementation can be made to support file
descriptors, and another to support App Engine RPC.
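
Roughly (a sketch with invented names, just to show the shape of the
abstraction):

class EventLoop:
    """Abstract interface; note that nothing here says 'file descriptor'."""

    def wait_for(self, io_object, callback):
        # 'io_object' is whatever the concrete loop understands: a file
        # descriptor for a select/epoll-based loop, a pending RPC for an
        # App Engine-based loop.
        raise NotImplementedError

    def call_later(self, delay, callback, *args):
        raise NotImplementedError

    def run_once(self, timeout=None):
        """Poll ready io_objects and timers, dispatch their callbacks."""
        raise NotImplementedError

A Unix implementation would interpret io_object as a file descriptor
and implement run_once() on top of select()/epoll(); an App Engine
implementation would interpret it as a pending RPC and use the RPC
layer's wait primitives instead.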

> In this example, where would eg the select/epoll/whatever calls happen? Is
> it something that calls the reactor that then in turn calls whatever?

App Engine doesn't have select/epoll/whatever, so it would have a
reactor implementation that doesn't use them. But the standard Unix
reactor would support file descriptors using select/etc.

Please respond in the reactors thread.

>> > call_every can be implemented in terms of call_later on a separate object,
>> > so I think it should be (eg twisted.internet.task.LoopingCall). One thing
>> > that is apparently forgotten about is event loop integration. The prime way
>> > of having two event loops cooperate is *NOT* "run both in parallel", it's
>> > "have one call the other". Even though not all loops support this, I think
>> > it's important to get this as part of the interface (raise an exception for
>> > all I care if it doesn't work).
>>
>> This is definitely one of the things we ought to get right. My own
>> thoughts are slightly (perhaps only cosmetically) different again:
>> ideally each event loop would have a primitive operation to tell it to
>> run for a little while, and then some other code could tie several
>> event loops together.
>
> As an API, that's pretty close to Twisted's IReactorCore.iterate, I think.
> It'd work well enough. The issue is only with event loops that don't
> cooperate so well.

Again, a topic for the reactor thread.
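
To make the "run for a little while" idea slightly more concrete (a
sketch; run_once() and is_running() are invented names, not an existing
API):

def run_together(loops):
    """Tie several event loops together by alternating single iterations.

    Each loop is assumed to expose run_once(timeout); a loop that can't
    return promptly should raise (as you suggest) rather than silently
    starving the others.
    """
    while any(loop.is_running() for loop in loops):
        for loop in loops:
            loop.run_once(timeout=0)  # poll, dispatch, return immediately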

But I'm really hoping you'll make good on your promise of redoing
async-pep, giving some actual specifications and example code, so I
can play with it.

-- 
--Guido van Rossum (python.org/~guido)


