[Async-sig] Asyncio loop instrumentation

Mon Jan 1 12:02:34 EST 2018

Hi Antonie,

Regarding your questions:

>
> What does it mean exactly? Is it the ratio of CPU time over wall clock
> time?

You can think of it as a metric that tells you how much of the CPU's
resources your loop is consuming. In the best case, where your process
is the only one running, it will match the CPU usage; note that this
means the usage of the specific CPU where your process is executing.
When many processes are competing for the same CPU, the two numbers
will differ significantly, because that CPU's resources are being
divided among many consumers.

Therefore I would like to point out that this load is relative to your
loop rather than an objective value taken from the CPU metrics.

To get the same with `psutil` you would have to gather the CPU usage
of the specific CPU where your loop is currently running. That is not
an impossible problem, but it turns something trivial into something
more complicated.

In the case of `time.thread_time`, I can't see how I could do that.
You would gather information about the thread where your loop is
currently running, but there's nothing straightforward that would help
you take into account the other threads that are competing for that
specific CPU.
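
To illustrate the limitation, here is a minimal sketch of the ratio you
asked about (thread CPU time over wall-clock time). The names
`cpu_ratio`, `busy` and `idle` are made up for this example and are not
part of the proposal; it assumes Python 3.7+, where `time.thread_time`
is available.

```python
import time


def cpu_ratio(func):
    """Run func once; return thread CPU time divided by wall-clock time."""
    wall0 = time.perf_counter()
    cpu0 = time.thread_time()
    func()
    cpu = time.thread_time() - cpu0
    wall = time.perf_counter() - wall0
    return cpu / wall if wall > 0 else 0.0


def busy():
    # Burn CPU in the calling thread.
    sum(i * i for i in range(200_000))


def idle():
    # Block without consuming CPU.
    time.sleep(0.05)
```

A low ratio from `cpu_ratio` only says the thread did not run for part
of the interval. It cannot tell whether the thread was idle or was
waiting for a CPU that other threads were using.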

The solution presented is not perfect, and there are still some corner
cases where the load factor might not be accurate enough. The way the
`load` method guesses whether the loop is competing for CPU resources
with other processes is basically by attributing at most the timeout
as sleeping time, for example:

t0 = time()
select(fds, timeout=1)
t1 = time()
sleeping_time = min(t1 - t0, 1)

Therefore, if the call to select took more than 1 second because the
scheduler decided to give the CPU to another process, the extra time
beyond 1 second will be counted as resource usage time. As you can
imagine, the problem is what happens when the select was ready before
the 1 second elapsed but the scheduler did not give the CPU back
because another, higher-priority process was running: in that case,
the waiting time will be attributed as sleeping time.
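
A runnable version of that idea could look roughly like the sketch
below. It is not the code from the proposal: the helper names
`timed_select` and `load_factor` are hypothetical, and computing the
load as `1 - sleeping_time / total_time` is my assumption for the
example.

```python
import selectors
import socket
import time


def timed_select(selector, timeout):
    """Wait on the selector; return (events, sleeping_time, elapsed)."""
    t0 = time.monotonic()
    events = selector.select(timeout)
    elapsed = time.monotonic() - t0
    # Never attribute more than the timeout to sleeping; any excess is
    # treated as time the loop wanted the CPU but did not get it.
    sleeping_time = min(elapsed, timeout)
    return events, sleeping_time, elapsed


def load_factor(sleeping_time, total_time):
    """Fraction of the interval not spent sleeping, clamped to [0, 1]."""
    if total_time <= 0:
        return 0.0
    return min(1.0, max(0.0, 1.0 - sleeping_time / total_time))
```

For instance, a select with a 1 second timeout that returns after 1.5
seconds yields `load_factor(1.0, 1.5)`, roughly 0.33, even though the
loop itself did no work during that interval.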

>> For this proposal [4], POC, I've preferred make a reduced list of events:
>>
>> * `loop_start` : Executed when the loop starts for the first time.
>> * `tick_start` : Executed when a new loop tick is started.
>> * `io_start` : Executed when a new IO process starts.
>> * `io_end` : Executed when the IO process ends.
>> * `tick_end` : Executed when the loop tick ends.
>> * `loop_stop` : Executed when the loop stops.
>
> What do you call a "IO process" in this context?

Basically the call to the `select/poll/whatever` syscall that waits
for a set of file descriptors to become ready for reading or writing.
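
In other words, `io_start` and `io_end` would bracket that single call.
A rough sketch, assuming a hypothetical instrument object: only the two
hook names come from the proposal, while the `IOTimer` class, the
`poll_once` helper and the hook signatures are invented here.

```python
import selectors
import socket
import time


class IOTimer:
    """Hypothetical instrument that accumulates time spent in select()."""

    def __init__(self):
        self.io_time = 0.0
        self._t0 = None

    def io_start(self):
        self._t0 = time.monotonic()

    def io_end(self):
        self.io_time += time.monotonic() - self._t0


def poll_once(selector, timeout, instrument):
    """One 'IO process': a single select() call wrapped by the two hooks."""
    instrument.io_start()
    try:
        return selector.select(timeout)
    finally:
        # Runs even if select() raises, so every start has a matching end.
        instrument.io_end()
```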

Thanks,

-- 
--pau