[Async-sig] Feedback, loop.load() function

Pau Freixes pfreixes at gmail.com
Tue Aug 22 07:55:01 EDT 2017


Hi,


> I personally don't think we need this in asyncio.  While the function has a
> relatively low overhead, it's still an overhead, it's a couple more syscalls
> on each loop iteration, and it's a bit of a technical debt.
>
> The bar for adding new functionality to asyncio is very high, and I don't
> see a good rationale for adding this function.  Is it for debugging
> purposes?  Or for profiling live applications?  If it's the latter, then
> there are so many other things we want to see, and some of them are
> protocol-specific.

Let me try to elaborate on the rationale and add some links.

The load of an asyncio loop can be easily inferred by comparing the
time the loop spends sleeping with the overall elapsed time. This
gives us a metric of how saturated the loop is: how many CPU
resources are being used and, more importantly, how many are left.
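As a rough sketch of that idea (my own illustrative code, not the
proposed API; `LoadSelector` and the `load()` helper are hypothetical
names), one can wrap the selector's select() call to accumulate the
time the loop spends sleeping:

```python
import asyncio
import selectors
import time

class LoadSelector(selectors.DefaultSelector):
    """Accumulate the time the loop spends blocked in select(),
    i.e. sleeping while waiting for I/O or timers."""

    def __init__(self):
        super().__init__()
        self.sleeping = 0.0

    def select(self, timeout=None):
        start = time.monotonic()
        events = super().select(timeout)
        self.sleeping += time.monotonic() - start
        return events

def load(selector, since):
    """Fraction of wall time the loop was busy (not sleeping) since
    `since`: 0.0 means fully idle, 1.0 means fully saturated."""
    total = time.monotonic() - since
    return 1.0 - selector.sleeping / total if total > 0 else 0.0
```

An almost idle loop (for instance one that only awaits
asyncio.sleep()) should then report a load close to 0.0.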

How helpful can this method be?

In our organization, we use back-pressure at the application layer of
our REST microservices architecture. It allows us to prevent
overloading the services. Once the back-pressure kicks in, we can
scale our services horizontally to cope with the current load. This
is already implemented for other languages, and we are currently
working out how to implement it with the aiohttp (asyncio) stack. For
more information about this technique, see these articles [1] [2].
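To illustrate where such a metric would plug into application-layer
back-pressure, here is a hypothetical sketch (`handle_request` and the
`loop_load` callable are my own names; `loop_load` stands in for the
proposed load function, not an existing asyncio API):

```python
import asyncio

LOAD_THRESHOLD = 0.9  # assumed cutoff; would be tuned per service

async def handle_request(process, loop_load):
    """Apply back-pressure: reject work up front when the loop is near
    saturation instead of queueing it, so callers can retry elsewhere
    or an autoscaler can add capacity."""
    if loop_load() > LOAD_THRESHOLD:
        # Shedding the request here is cheap compared to doing the work.
        return {"status": 503, "retry_after": 1}
    return {"status": 200, "body": await process()}
```

In an aiohttp service this check would typically live in a middleware
that answers 503 before the handler runs.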

We are not the first ones running microservices at scale, and this
pattern has been implemented by other organizations. I would like to
mention the Google case [3]. From that link I would like to highlight
the following paragraph:

"""
A better solution is to measure capacity directly in available
resources. For example, you may have a total of 500 CPU cores and 1 TB
of memory reserved for a given service in a given datacenter.
Naturally, it works much better to use those numbers directly to model
a datacenter's capacity. We often speak about the cost of a request to
refer to a normalized measure of how much CPU time it has consumed
(over different CPU architectures, with consideration of performance
differences).

In a majority of cases (although certainly not in all), we've found
that simply using CPU consumption as the signal for provisioning works
well
"""

From my understanding, that comment is well aligned with the
implementation proposal for the asyncio loop: it would give us a way
to measure whether there are enough resources to cope with the
ongoing load.


[1] https://dzone.com/articles/applying-back-pressure-when
[2] http://engineering.voxer.com/2013/09/16/backpressure-in-nodejs/
[3] https://landing.google.com/sre/book/chapters/handling-overload.html


> If we want to add some tracing/profiling functions there should be a way to
> disable them, otherwise the performance of event loops like uvloop will
> degrade, and I'm not sure that all of its users want to pay a price for
> something they won't ever be using.  All of this just adds to the
> complexity.

The goal would be an implementation with no performance impact for
real applications. I'm still not sure whether this is achievable with
uvloop. I would like to start working on this as soon as possible;
having proper numbers, and exploring the possibility of implementing
this in libuv, will help us get a proper answer.

If, in the end, there is no way to make the overhead negligible, then
I would agree that a way to switch it on or off is needed.

