Asyncio Queue implementation suggestion

Alberto Sentieri 22t at tripolho.com
Wed Sep 16 13:39:51 EDT 2020


I have a suggestion about the implementation of asyncio queues that 
could improve performance. I may be missing something, however, as I am 
fairly new to Python. Below is a short description of the problem I am 
facing.

I wrote a daemon in Python 3 (running on Linux) that tests many devices 
at the same time, for use in a factory environment. The daemon also 
involves multiple communication events with a back end running in 
another country. I use one class for each device I test, and each class 
uses asyncio internally. Because of the application itself and the 
number of devices tested simultaneously, I soon ran out of file 
descriptors. I increased the file descriptor limit for the application, 
and then I started running into errors such as “ValueError: 
filedescriptor out of range in select()”. I guess this problem is 
related to a package called serial_asyncio, and of course that could be 
corrected. However, I became curious about the number of open file 
descriptors: why so many?
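
To see where the descriptors actually go, a small diagnostic can help. 
The sketch below is only an illustration, not part of the daemon: it 
assumes Linux (it reads /proc/self/fd), and the helper name 
open_fd_kinds is made up for this example. It groups the open 
descriptors of the current process by kind and prints the current 
limits. As far as I understand, the select() error appears once a 
descriptor number reaches FD_SETSIZE (usually 1024 on Linux), no matter 
how high the process limit has been raised.

```python
import os
import resource
from collections import Counter


def open_fd_kinds():
    """Count this process's open descriptors by kind (Linux only, via /proc)."""
    kinds = Counter()
    for name in os.listdir("/proc/self/fd"):
        try:
            target = os.readlink(f"/proc/self/fd/{name}")
        except OSError:
            # The descriptor used for the directory listing is already closed.
            continue
        # Targets look like "pipe:[1234]", "socket:[5678]",
        # "anon_inode:[eventpoll]", or a plain path such as "/dev/ttyUSB0".
        if target.startswith("pipe:"):
            kind = "pipe"
        elif target.startswith("socket:"):
            kind = "socket"
        elif target.startswith("anon_inode:"):
            kind = target
        else:
            kind = "file"
        kinds[kind] += 1
    return kinds


soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
kinds = open_fd_kinds()
print(f"soft limit {soft_limit}, hard limit {hard_limit}")
for kind, count in kinds.most_common():
    print(f"{kind:28} {count}")
```

Calling open_fd_kinds() from inside the daemon while many devices are 
under test should show whether pipes, sockets, epoll instances, or 
device files dominate.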

Apparently asyncio Queues use a Linux pipe, and each queue requires 2 
file descriptors. Am I correct? So I asked myself: if an asyncio queue 
is just a mechanism for passing information between two asyncio tasks, 
which should never run at the same time, why do I need the operating 
system in the middle of that? Isn’t the whole idea of asyncio that the 
operating system should be avoided whenever possible? Nothing can be 
put into a queue while asyncio is blocked in epoll, because some Python 
code has to be running to push things into the queue. If there is 
nothing in a particular queue, nothing will show up in it while asyncio 
is waiting for a file descriptor event.

So, if I am correct, it would be more efficient to put a queue on a 
ready-queue list whenever something is pushed into it. Then, just 
before asyncio calls epoll (or select), it would check that ready-queue 
list and process it before the epoll call. I mean that epoll would not 
be called until all the ready queues have been processed. Queues would 
be implemented in a much simpler way, using local memory: a simple 
array may be enough to do the job. With that, the OS would be avoided, 
and far fewer file descriptors would be necessary.
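
One way to test the assumption about pipes before changing anything is 
to measure it. The sketch below is a rough experiment, assuming Linux 
and CPython; the helper names count_open_fds and make_queues are 
invented for this example. It compares the number of open descriptors 
before and after creating one event loop, and again after creating 100 
asyncio.Queue objects. To my knowledge asyncio.Queue keeps its items in 
an in-memory collections.deque, while each event loop opens a selector 
plus a small wakeup socket pair, so the result may point at the number 
of event loops (or sockets and serial ports) rather than at the queues.

```python
import asyncio
import os


def count_open_fds():
    """Number of open descriptors in this process (Linux only, via /proc)."""
    return len(os.listdir("/proc/self/fd"))


async def make_queues(n):
    # Create the queues inside a running loop, as application code would.
    return [asyncio.Queue() for _ in range(n)]


before_loop = count_open_fds()
loop = asyncio.new_event_loop()
after_loop = count_open_fds()

queues = loop.run_until_complete(make_queues(100))
after_queues = count_open_fds()
loop.close()

loop_delta = after_loop - before_loop
queue_delta = after_queues - after_loop
print(f"descriptors added by one event loop: {loop_delta}")
print(f"descriptors added by 100 queues:     {queue_delta}")
```

If the queue figure is zero while the loop figure is not, then a design 
that creates one event loop per device class would be a more likely 
source of the descriptors than the queues themselves.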


