[New-bugs-announce] [issue32776] asyncio SIGCHLD scalability problems

holger report at bugs.python.org
Mon Feb 5 16:22:42 EST 2018


New submission from holger <holger+lp at freyther.de>:

I intended to use the asyncio framework to build an end-to-end test for our software. The test would spawn between 5k and 10k processes and manage the same number of sockets.

When I built a prototype I ran into scaling issues. Instead of launching our real software I tested it with calls to sleep 30. Once the started processes began to finish, a SIGCHLD would be delivered to Python and it would then fail with:

 Exception ignored when trying to write to the signal wakeup fd:
 BlockingIOError: [Errno 11] Resource temporarily unavailable
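
For illustration, a prototype along these lines might look like the sketch below. The function name spawn_and_wait and its parameters are hypothetical and not taken from the report. Whether the error reproduces depends on the Python version, the platform, and the child watcher in use, so treat this as a description of the setup rather than a guaranteed reproducer.

```python
import asyncio


async def spawn_and_wait(count, seconds="30"):
    """Start `count` copies of `sleep <seconds>` and wait for all of them."""
    procs = []
    for _ in range(count):
        # Assumes a `sleep` binary is available on PATH.
        procs.append(await asyncio.create_subprocess_exec("sleep", seconds))
    # Each wait() resolves once asyncio has reaped the matching child.
    return await asyncio.gather(*(proc.wait() for proc in procs))


if __name__ == "__main__":
    # The report mentions 5k to 10k processes; raise the count gradually.
    exit_codes = asyncio.run(spawn_and_wait(100, "1"))
    print(len(exit_codes), "children finished")
```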

Using strace I saw something like:

send(5, "\0", 1, 0)                     = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12218, 0xbf8592d8, WNOHANG)     = 0
waitpid(12219, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12219
send(5, "\0", 1, 0)                     = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12220, 0xbf8592d8, WNOHANG)     = 0
waitpid(12221, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12221
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12293, si_uid=1001, si_status=0, si_utime=0, si_stime=0} ---
getpid()                                = 11832
write(5, "\21", 1)                      = -1 EAGAIN (Resource temporarily unavailable)
sigreturn({mask=[]})                    = 12221
write(2, "Exception ignored when trying to"..., 64) = 64
write(2, "BlockingIOError: [Errno 11] Reso"..., 61) = 61
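
The EAGAIN results above are consistent with fd 5 being the write end of a non-blocking socket pair whose buffer is full. That is an assumption, since the trace does not show how the descriptor was created. The byte "\21" is octal for 17, which is SIGCHLD on x86 Linux. The sketch below is independent of asyncio and only shows how single-byte sends nobody reads eventually fail this way; fill_until_eagain is a hypothetical name.

```python
import socket


def fill_until_eagain():
    """Send single bytes on a non-blocking socket pair until the buffer is full."""
    reader, writer = socket.socketpair()
    writer.setblocking(False)
    sent = 0
    try:
        while True:
            # Nothing drains `reader`, so the kernel buffer must fill up.
            sent += writer.send(b"\0")
    except BlockingIOError as exc:
        # Same failure as in the trace: EAGAIN, surfaced as BlockingIOError.
        return sent, exc.errno
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    count, err = fill_until_eagain()
    print("buffer full after", count, "one-byte sends, errno", err)
```

The number of sends that fit is system dependent, so no specific value is assumed here.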


Looking at the code, I see that the si_pid of the signal is ignored; instead, waitpid(2) is called for every watched process, as the trace above shows. This does not scale well enough for my intended use case.
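
The scanning pattern visible in the trace can be sketched as follows. This is a simplified standalone illustration, not the asyncio source; spawn_child and scan_watched are hypothetical names. Each pass costs one waitpid() call per watched PID, so N children exiting one at a time lead to on the order of N*N calls.

```python
import os
import time


def spawn_child(exit_code):
    """Fork a child that exits immediately (hypothetical demo helper)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)
    return pid


def scan_watched(watched):
    """Poll every watched PID once; return (finished, number of waitpid calls)."""
    finished = {}
    calls = 0
    for pid in sorted(watched):  # sorted() copies, so discarding below is safe
        calls += 1
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == 0:
            continue  # this child is still running
        finished[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        watched.discard(pid)
    return finished, calls


if __name__ == "__main__":
    watched = {spawn_child(0) for _ in range(5)}
    while watched:
        done, calls = scan_watched(watched)
        print("waitpid calls:", calls, "reaped:", len(done))
        time.sleep(0.01)
```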

I think what could be done is one of the following:

* Switch to signalfd for the event notification?
* Take si_pid and, instead of just notifying that work is pending, report the PID that exited?
* Use waitpid(-1, ... to collect any dead child, if there can be only one SIGCHLD handler?
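
As a rough sketch of the last option, the loop below reaps whichever children have exited with waitpid(-1, WNOHANG), so the work per signal scales with the number of dead children rather than with the number watched. The names spawn_child and reap_any are hypothetical. Note that this also reaps children started by other code in the same process, which is why it only works if a single handler owns all child collection.

```python
import os
import time


def spawn_child(exit_code):
    """Fork a child that exits immediately (hypothetical demo helper)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)
    return pid


def reap_any():
    """Collect every child that has already exited; return {pid: raw status}."""
    finished = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none has exited yet
        finished[pid] = status
    return finished


if __name__ == "__main__":
    pending = {spawn_child(0) for _ in range(5)}
    while pending:
        pending -= set(reap_any())
        time.sleep(0.01)
    print("all children reaped")
```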

----------
components: asyncio
messages: 311692
nosy: asvetlov, holger+lp, yselivanov
priority: normal
severity: normal
status: open
title: asyncio SIGCHLD scalability problems

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32776>
_______________________________________