[Async-sig] Asyncio, implementing a new fair scheduling reactor

Dima Tisnek dimaqq at gmail.com
Fri Jun 16 07:47:03 EDT 2017


Just a couple of thoughts:

1. A good place to start is Linux BFS (simple, correct, good
performance). A typical test case is "make -j4"; perhaps there's a way
to simulate something similar? (A rough sketch is below.)

2. There's room to research async/await scheduling (in an academic
sense); older research may have focused on .NET, and current research
is probably on browsers and Node, in any case JavaScript.
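
Re (1), something like this could simulate a similar workload on
asyncio, just to make the idea concrete (all names and numbers below
are made up for illustration):

import asyncio
import random

async def job(sem):
    async with sem:
        sum(range(50000))                             # pretend compile (CPU)
        await asyncio.sleep(random.uniform(0, 0.05))  # pretend disk IO

async def build(jobs=100, parallelism=4):
    sem = asyncio.Semaphore(parallelism)              # the "-j4" part
    await asyncio.gather(*(job(sem) for _ in range(jobs)))

asyncio.get_event_loop().run_until_complete(build())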

d.

On 15 June 2017 at 23:40, Pau Freixes <pfreixes at gmail.com> wrote:
> Hi guys, recently I've been trying to implement a POC of the default
> event loop implemented by asyncio, but using a fair scheduling reactor.
> At the moment it is just
> a POC [1], something to test the rationale, and it still has to evolve
> into something more mature, but before that I would prefer to share my
> thoughts and get your comments.
>
> The current implementation is based on a FIFO queue that is filled
> with all of the callbacks that have to be executed. These callbacks
> can stand for:
>
> 1) Run tasks, either to be started or resumed.
> 2) Run future callbacks.
> 3) Run scheduled callbacks.
> 4) Run file descriptor callbacks.
>
> Worth mentioning that most of them are internally chained in some way;
> for example, a future callback can wake up a suspended task. Also, keep
> in mind that the API published by asyncio to schedule callbacks can be
> used by asyncio itself or by the user, call_soon for example.
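>
> Just to make those entry points concrete, all of the following end up,
> sooner or later, as handles in that same queue (plain asyncio calls,
> nothing new here):
>
>>>> import asyncio, socket
>>>>
>>>> async def coro():
>>>>     await asyncio.sleep(0)                   # the task suspends and is resumed
>>>>
>>>> loop = asyncio.get_event_loop()
>>>> r, w = socket.socketpair()
>>>> loop.create_task(coro())                     # 1) a task to be started/resumed
>>>> loop.call_soon(print, 'plain callback')      # 2) how future callbacks are queued
>>>> loop.call_later(0.1, loop.stop)              # 3) a scheduled (delayed) callback
>>>> loop.add_reader(r.fileno(), lambda: print(r.recv(1)))   # 4) a fd callback
>>>> w.send(b'x')
>>>> loop.run_forever()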
>
> The usual flow of the reactor is the following:
>
> 1) Check the file descriptors with events, and stack their handlers into
> the reactor.
> 2) Pop all expired scheduled callbacks and push them into the reactor.
> 3) Iterate over the first N elements of the queue, where N is the number of
> handles stacked at that moment. Handles stacked during that
> iteration won't be handled; they must wait until the next whole iteration.
> 4) Go back to point 1.
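>
> In pseudo-code, one iteration looks roughly like this (a simplified
> sketch, not the real asyncio source; the names are just illustrative):
>
>>>> import heapq
>>>>
>>>> def run_once(selector, ready, scheduled, clock):
>>>>     # 1) poll the fds and stack their handlers into the ready queue
>>>>     for key, mask in selector.select(timeout=0):
>>>>         ready.append(key.data)
>>>>     # 2) pop the scheduled callbacks that are already due
>>>>     now = clock()
>>>>     while scheduled and scheduled[0].when <= now:
>>>>         ready.append(heapq.heappop(scheduled))
>>>>     # 3) run only the N handles queued at this point; handles queued
>>>>     #    while running must wait for the next whole iteration
>>>>     ntodo = len(ready)
>>>>     for _ in range(ntodo):
>>>>         ready.popleft()._run()
>>>>     # 4) the caller loops back to 1)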
>
> As you can observe here, the IO polling is only done once per loop
> iteration, and it has to wait until all handles queued at that specific
> moment are executed.
>
> This implements, in a way, a natural backpressure: reads, and also the
> accepting of new connections, will rely on the buffers managed by the
> operating system.
>
> That implementation can be seen as simple, but it stands on a solid
> strategy and follows a KISS design that helps keep bugs away.
>
> Why fair scheduling?
>
> Not all code that shares the same loop, even within the same module, has
> the same requirements. Some parts might need N and other parts M. When
> this code can't be decoupled, meaning that the cost of placing the parts
> into separate pieces of your architecture is too high, the developer has
> no way to express this difference and make the underlying implementation
> aware of it.
>
> For example, an API with a regular endpoint accessed by users and
> another one for the health check of the system, which have completely
> different requirements in terms of IO. Not only due to the nature of
> the resources accessed, but also because of the frequency of use.
> While the health check is accessed at a known frequency of every X
> seconds, the other endpoint has a variable frequency of use.
>
> Do you believe that asyncio will be able to preserve the health-check
> frequency at all times? Absolutely not.
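>
> You can see the effect with a small experiment like the following one
> (just an illustration, the exact numbers will vary): flood the loop with
> ready handles and watch how far a periodic timer drifts from its period.
>
>>>> import asyncio, time
>>>>
>>>> async def busy_endpoint():
>>>>     while True:
>>>>         sum(range(10000))              # a bit of CPU work per handle
>>>>         await asyncio.sleep(0)         # reschedules itself immediately
>>>>
>>>> async def healthcheck(period=0.1):
>>>>     last = time.monotonic()
>>>>     for _ in range(10):
>>>>         await asyncio.sleep(period)
>>>>         now = time.monotonic()
>>>>         print('healthcheck drift: %+.4fs' % (now - last - period))
>>>>         last = now
>>>>
>>>> async def main():
>>>>     busy = [asyncio.ensure_future(busy_endpoint()) for _ in range(200)]
>>>>     await healthcheck()
>>>>     for t in busy:
>>>>         t.cancel()
>>>>     await asyncio.gather(*busy, return_exceptions=True)
>>>>
>>>> asyncio.get_event_loop().run_until_complete(main())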
>
> Therefore, the idea of implementing a fair scheduling reactor is based
> on the need to address these kinds of situations, giving the
> developer an interface to isolate different resources.
>
> Basic principles
>
> The basic principles of the implementation are:
>
> - The cost of the scheduling has to be the same as in the current
> implementation, no overhead.
> - The design has to follow the current one, keeping the implicit
> backpressure commented on above.
>
> I will focus on the second principle, taking into account that the
> first one is a matter of implementation.
>
> To achieve the same behavior, the new implementation only splits the
> resources - handles, scheduled callbacks, file descriptors - into
> isolated partitions, and then applies to each partition the same
> algorithm as the current one. The developer can create a new partition
> using a new function called `spawn`. This function takes a coroutine as
> its argument; the task that wraps that coroutine and all of the
> resources created inside the coroutine will belong to that partition.
> For example:
>
>>>> async def background_task():
>>>>     tasks = [fetch() for i in range(1000)]
>>>>     return (await asyncio.gather(*tasks))
>>>>
>>>> async def foo():
>>>>     return (await asyncio.spawn(background_task()))
>
>
> All resources created inside the scope of `background_task` are
> isolated in one partition. The 1000 sockets will schedule callbacks
> that will be stacked in that same queue.
>
> The partition is by default identified by the hash of the task that
> wraps the `background_task` coroutine, but the user can pass an
> alternative value:
>
>>>> async def foo():
>>>>     return (await asyncio.spawn(healthcheck(), partition='healthcheck'))
>
> Internally the implementation has a default ROOT partition that is
> used for all resources that are not created inside the scope of a
> spawn call. As you can guess, if you don't use the spawn method the
> reactor will run exactly like the current implementation, having all
> the resources in the same queue.
>
>
> Round robin between partitions.
>
> The different partitions that exist at a given moment share the CPU
> using a round-robin strategy. This gives all partitions the same chance
> to run the same amount of handles, but with one particularity: each
> time a partition runs out of handles, the polling of file descriptors
> and the expiration of delayed calls are restarted, but only for that
> specific partition.
>
> The side effect is clear: we keep the same backpressure mechanism, but
> per partition.
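>
> In a very simplified form, and only to illustrate the idea (the
> Partition class below is made up for this example, it is not the POC
> code), a round robin pass would look like:
>
>>>> from collections import deque
>>>>
>>>> class Partition:
>>>>     def __init__(self, name):
>>>>         self.name = name
>>>>         self.ready = deque()         # handles ready to run (plain callables here)
>>>>
>>>>     def refill(self):
>>>>         # poll this partition's fds and pop its due delayed calls,
>>>>         # exactly like the global loop does today (stubbed out here)
>>>>         pass
>>>>
>>>> def round_robin_pass(partitions):
>>>>     for partition in partitions:
>>>>         if not partition.ready:
>>>>             partition.refill()         # restart IO/timers for this partition only
>>>>         ntodo = len(partition.ready)   # run only what was queued at this point
>>>>         for _ in range(ntodo):
>>>>             partition.ready.popleft()()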
>
>
> The black hole of the current implementation.
>
> There is always a but: at least I've found a situation where this
> strategy performs in the same way as the current one, without
> applying any fair scheduling, even though the code uses the spawn method.
>
> Have a look at the following snippet:
>
>>>> async def healthcheck(request):
>>>>     await request.resp()
>>>>
>>>> async def view(request):
>>>>     return (await asyncio.spawn(healthcheck(request)))
>
>
> The task that wraps the healthcheck coroutine, which is the one isolated
> in a partition, won't be scheduled until the data from the file
> descriptor has been read, and that read callback is in fact executed
> inside the ROOT partition. Therefore, in the worst case scenario, the
> fair scheduling will degrade to simple FIFO scheduling.
>
> IMHO there is no easy way to solve that issue, at least not without
> changing the current picture. And trying to solve it might end up
> producing a messy implementation and buggy code.
>
> Even so, I believe that this is still worth it, bearing in mind the
> benefits it will bring for all of those cases where the user
> needs to isolate resources.
>
> Thoughts, comments, and others will be welcomed.
>
> [1] https://github.com/pfreixes/qloop
>
> --
> --pau
> _______________________________________________
> Async-sig mailing list
> Async-sig at python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/

