Eventfd with epoll BlockingIOError

jenkris at tutanota.com
Thu Nov 25 17:29:07 EST 2021


Thanks very much for your reply.  

I am now getting a single event returned in Python, but it's not the right event, as I'll explain below. 

I rearranged the Python code based on your comments:

#!/usr/bin/python3
import sys
import os
import select

print("Inside Python")

event_fd = int(sys.argv[3])

print("Eventfd received by Python")
print(event_fd)

event_write_value = 100

ep = select.epoll(-1)
ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT )

os.set_blocking(event_fd, False)

#__________

print("Starting poll loop")

for fd_event in ep.poll():
    print("Python fd_event")
    print(fd_event)
    fd_received = fd_event[0]
    event_received = fd_event[1]

You advised leaving select.EPOLLOUT out of the line ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT ), which makes sense because I'm not waiting for that event. Without it, however, both processes freeze in the for loop (just after print("Starting poll loop")), so we never receive an EPOLLIN event. So I kept it, and here is the screen output from gdb:

Inside Python
Eventfd received by Python
5
Everything OK in Python
Starting poll loop
Python fd_event
(5, 4)
Writing to Python
5 Received from Python
8 Writing to Python
Failed epoll_wait Bad file descriptor
5 Received from Python
8 Writing to Python
Failed epoll_wait Bad file descriptor
5 Received from Python
-1time taken 0.000629
Failed to close epoll file descriptor
Unlink_shm status: Bad file descriptor
fn() took 0.000717 seconds to execute
[Inferior 1 (process 26718) exited normally]
(gdb) q

The Python fd_event tuple is (5, 4): 5 is the correct file descriptor, and 4 is an EPOLLOUT event, which is not what I want.
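
For reference, the event half of a tuple such as (5, 4) can be decoded by testing the mask against the flag constants in the select module. This is only a small sketch: the helper name describe_events and the choice of four flags are illustrative assumptions, not part of the program above, and the epoll constants are available on Linux only.

```python
import select

# Names of a few epoll flags, keyed by bit value (Linux only).
# The selection of flags here is illustrative, not exhaustive.
EPOLL_FLAGS = {
    select.EPOLLIN: "EPOLLIN",
    select.EPOLLOUT: "EPOLLOUT",
    select.EPOLLERR: "EPOLLERR",
    select.EPOLLHUP: "EPOLLHUP",
}


def describe_events(mask):
    """Return the names of the known flags that are set in an epoll mask."""
    return [name for bit, name in EPOLL_FLAGS.items() if mask & bit]


# Decode the mask from the (5, 4) tuple printed in the output above.
print(describe_events(4))
```

A mask of 5 would decode to both EPOLLIN and EPOLLOUT, which is why the read should be guarded by a test such as (event & select.EPOLLIN) rather than by comparing the mask for equality.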

The eventfd is created in C as nonblocking:

int eventfd_initialize() {
  int efd = eventfd(0, EFD_NONBLOCK);
  return efd; }

When C writes it calls epoll_wait:

ssize_t epoll_write(int event_fd, int epoll_fd, struct epoll_event * event_struc, int action_code)
{
   int64_t ewbuf[1];
   ewbuf[0] = (int64_t)action_code;
   int maxevents = 1;
   int timeout = -1;

   fprintf(stdout, " Writing to Python \n%d", event_fd);

    write(event_fd, &ewbuf, 8);

    if (epoll_wait(epoll_fd, event_struc, maxevents, timeout) == -1)
    {
        fprintf(stderr, "Failed epoll_wait %s\n", strerror(errno));
    }

    ssize_t rdval = read(event_fd, &ewbuf, 8);   

    fprintf(stdout, " Received from Python \n%ld", rdval);

    return 0;
}

The C side initializes its epoll this way:

int epoll_initialize(int efd, int64_t * output_array)
{
  struct epoll_event ev = {};
  int epoll_fd = epoll_create1(0);

  struct epoll_event * ptr_ev = &ev;
  
  if(epoll_fd == -1)
  {
    fprintf(stderr, "Failed to create epoll file descriptor\n");
    return 1;
  }

  ev.events = EPOLLIN | EPOLLOUT;
  ev.data.fd = efd; //was 0

  if(epoll_ctl(epoll_fd, EPOLL_CTL_ADD, efd, &ev) == -1)
  {
      fprintf(stderr, "Failed to add file descriptor to epoll\n");
      close(epoll_fd);
      return 1;
  }

  output_array[0] = epoll_fd;
  output_array[1] = (int64_t)ptr_ev; //&ev;

  return 0;
}

Technically C is not waiting for an EPOLLIN event, but again without it both processes freeze unless either C or Python includes both events.  So that appears to be where the problem is. 

The Linux epoll man page (https://man7.org/linux/man-pages/man7/epoll.7.html) says, "epoll_wait waits for I/O events, blocking the calling thread if no events are currently available." That may be the clue to why both processes freeze when I poll on only one event in each one.

Thanks for any ideas based on this update, and thanks again for your earlier reply. 

Jen


-- 
 Sent with Tutanota, the secure & ad-free mailbox. 



Nov 25, 2021, 06:34 by barry at barrys-emacs.org:

>
>
>
>> On 24 Nov 2021, at 22:42, Jen via Python-list <python-list at python.org> wrote:
>>
>> I have a C program that uses fork-execv to run Python 3.10 in a child process, and I am using eventfd with epoll for IPC between them.  The eventfd file descriptor is created in C and passed to Python through execv.  Once the Python child process starts I print the file descriptor to verify that it is correct (it is).  
>>
>> In this scenario C will write to the eventfd at intervals and Python will read the eventfd and take action based on the value in the eventfd.  But in the Python while True loop I get "BlockingIOError: [Errno 11] Resource temporarily unavailable" then with each new read it prints "Failed epoll_wait Bad file descriptor." 
>>
>> This is the Python code:
>>
>> #!/usr/bin/python3
>> import sys
>> import os
>> import select
>>
>> print("Inside Python")
>>
>> event_fd = int(sys.argv[3])
>>
>>
>> print("Eventfd received by Python")
>> print(event_fd)
>>
>> ep = select.epoll(-1)
>> ep.register(event_fd, select.EPOLLIN | select.EPOLLOUT)
>>
>
> This says: tell me when I can read from or write to event_fd.
> A write will be allowed until the kernel buffers are full.
>
> Usually you only add EPOLLOUT if you have data to write.
> In this case do not set EPOLLOUT.
>
> And if you know that you will never fill the kernel buffers then you
> do not need to bother polling for write.
>
>
>>
>> event_write_value = 100
>>
>> while True:
>>
>>     print("Waiting in Python for event")
>>     ep.poll(timeout=None, maxevents=-1)
>>
>
> You have to get the result of the poll() and process the list of entries that are returned.
>
> You must check that POLLIN is set before attempting the read.
>
>
>
>>     v = os.eventfd_read(event_fd)
>>
>
> Will raise EWOULDBLOCK because there is no data available to read.
>
> Here are the docs from Python:
>
> poll.poll([timeout])
>
> Polls the set of registered file descriptors, and returns a possibly-empty list containing (fd, event) 2-tuples for the descriptors that have events or errors to report. fd is the file descriptor, and event is a bitmask with bits set for the reported events for that descriptor: POLLIN for waiting input, POLLOUT to indicate that the descriptor can be written to, and so forth. An empty list indicates that the call timed out and no file descriptors had any events to report. If timeout is given, it specifies the length of time in milliseconds which the system will wait for events before returning. If timeout is omitted, negative, or None, the call will block until there is an event for this poll object.
>
> You end up with code like this:
>
> for fd_event in ep.poll():
>     fd, event = fd_event
>     if (event & select.POLLIN) != 0 and fd == event_fd:
>         v = os.eventfd_read(event_fd)
>
>
>>
>>     if v != 99:
>>         print("found")
>>         print(v)
>>
>>         os.eventfd_write(event_fd, event_write_value)
>>
>>     if v == 99:
>>         os.close(event_fd)
>>
>> This is the C code that writes to Python, then waits for Python to write back:
>>
>> ssize_t epoll_write(int event_fd, int epoll_fd, struct epoll_event * event_struc, int action_code)
>> {
>>    int64_t ewbuf[1];
>>    ewbuf[0] = (int64_t)action_code;
>>    int maxevents = 1;
>>    int timeout = -1;
>>
>>    fprintf(stdout, " Writing to Python \n%d", event_fd);
>>
>>    write(event_fd, &ewbuf, 8);
>>
>>     if (epoll_wait(epoll_fd, event_struc, maxevents, timeout) == -1)
>>     {
>>         fprintf(stderr, "Failed epoll_wait %s\n", strerror(errno));
>>     }
>>
>>     ssize_t rdval = read(event_fd, &ewbuf, 8);   
>>
>>     fprintf(stdout, " Received from Python \n%ld", rdval);
>>
>>     return 0;
>> }
>>
>> This is the screen output when I run with gdb:
>>
>>           Inside Python
>> Eventfd received by Python
>> 5
>> Waiting in Python for event
>> Traceback (most recent call last):
>>   File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
>>     return _run_code(code, main_globals, None,
>>   File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
>>     exec(code, run_globals)
>>   File "/opt/P01_SH/NPC_CPython.py", line 36, in <module>
>>     v = os.eventfd_read(event_fd)
>> BlockingIOError: [Errno 11] Resource temporarily unavailable
>>
>
> Expected, as you have not checked that there is data to read.
> Check for POLLIN being set.
>
>
>> Writing to Python
>> 5 Received from Python
>> 8 Writing to Python
>> Failed epoll_wait Bad file descriptor
>> 5 Received from Python
>> 8 Writing to Python
>> Failed epoll_wait Bad file descriptor
>> 5 Received from Python
>> -1time taken 0.000548
>> Failed to close epoll file descriptor
>> Unlink_shm status: Bad file descriptor
>> fn() took 0.000648 seconds to execute
>> [Inferior 1 (process 12618) exited normally]
>> (gdb)
>>
>> So my question is why do I get "BlockingIOError: [Errno 11] Resource temporarily unavailable" and "Failed epoll_wait Bad file descriptor" from Python? 
>>
>
> If your protocol is not trivial, you should implement a state machine to know what to do at each event.
>
> Barry
>
>
>>
>>


