fork/exec & close file descriptors

Marko Rauhamaa marko at pacujo.net
Wed Jun 3 09:33:49 EDT 2015


Marko Rauhamaa <marko at pacujo.net>:

> random832 at fastmail.us:
>
>> Why does the child process need to report the error at all? The parent
>> process will find out naturally when *it* tries to close the same file
>> descriptor.
>
> That's not how it goes.
>
> File descriptors are reference counted in the Linux kernel. Closes are
> no-ops except for the last one that brings the reference count to zero.
>
> If the parent should close the file before the child, no error is
> returned to the parent.

First of all, it's not the descriptors that are refcounted -- it's the
file objects (struct file) referred to by the descriptors.
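
The distinction is easy to see from Python. The following is only a
minimal sketch (the function name dup_survives_close is made up for
illustration): two descriptors obtained with os.dup() share one open
file, and closing one of them leaves the file usable through the other.

```python
import os

def dup_survives_close():
    """Hypothetical helper: two descriptors, one underlying open file."""
    r, w = os.pipe()
    w2 = os.dup(w)          # second descriptor for the same write end
    os.close(w)             # drops one reference; the file stays open
    os.write(w2, b"x")      # still writable through the other descriptor
    os.close(w2)            # last write-side reference is gone now
    data = os.read(r, 10)   # the byte written above
    eof = os.read(r, 10)    # b"" because no write end remains open
    os.close(r)
    return data, eof
```

The reader sees end-of-file only after the second close, which is when
the last reference to the write end goes away.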

However, I was also wrong about close() being a no-op. Here's the
relevant kernel source code snippet:

========================================================================
int __close_fd(struct files_struct *files, unsigned fd)
{
	struct file *file;
	struct fdtable *fdt;

	spin_lock(&files->file_lock);
	fdt = files_fdtable(files);
	if (fd >= fdt->max_fds)
		goto out_unlock;
	file = fdt->fd[fd];
	if (!file)
		goto out_unlock;
	rcu_assign_pointer(fdt->fd[fd], NULL);
	__clear_close_on_exec(fd, fdt);
	__put_unused_fd(files, fd);
	spin_unlock(&files->file_lock);
	return filp_close(file, files);

out_unlock:
	spin_unlock(&files->file_lock);
	return -EBADF;
}

int filp_close(struct file *filp, fl_owner_t id)
{
	int retval = 0;

	if (!file_count(filp)) {
		printk(KERN_ERR "VFS: Close: file count is 0\n");
		return 0;
	}

	if (filp->f_op->flush)
		retval = filp->f_op->flush(filp, id);

	if (likely(!(filp->f_mode & FMODE_PATH))) {
		dnotify_flush(filp, id);
		locks_remove_posix(filp, id);
	}
	fput(filp);
	return retval;
}
========================================================================

The code shows that:

 1. The file descriptor is released regardless of the return value of
    close(2):

    __put_unused_fd(files, fd);

 2. The file object's refcount is decremented (by fput()) regardless of
    the return value of close(2).

 3. The return value reflects the success of the optional flush() method
    of the file object.

 4. The flush() method is called with each call to close(2), not only
    the last one.

IOW, os.close() closes the descriptor even when it reports a failure. I
couldn't have guessed that behavior from the man page.
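
I don't know of a portable way to force flush() to fail from a test, so
the sketch below only exercises the other half of the snippet: once a
descriptor has been released, a second close() on the same number takes
the out_unlock path and fails with EBADF. The function name close_twice
is made up for illustration.

```python
import errno
import os

def close_twice():
    """Hypothetical helper: return the errno raised by a second close."""
    r, w = os.pipe()
    os.close(w)              # releases the descriptor number
    try:
        os.close(w)          # number is no longer in the fd table
    except OSError as e:
        code = e.errno       # expected to be errno.EBADF
    else:
        code = None
    os.close(r)
    return code
```

In a multithreaded program the number could have been handed out again
in between, in which case the second close would hit somebody else's
file. That is one more reason not to retry a failed close.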

So the strategy you proposed is the right one: have the child process
ignore any possible errors from os.close(). The parent will have an
opportunity to deal with them.
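
Roughly, and only as a sketch under the assumption of a simple pipe
between parent and child (close_quietly and run_child_with_pipe are
made-up names), that strategy could look like this:

```python
import os

def close_quietly(fd):
    """Hypothetical helper: close fd and ignore any error it reports."""
    try:
        os.close(fd)
    except OSError:
        pass                 # the descriptor is released either way

def run_child_with_pipe():
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: close what it doesn't need, ignoring errors.
        try:
            close_quietly(r)
            os.write(w, b"hi")
            close_quietly(w)
        finally:
            os._exit(0)      # never return into the parent's code
    # Parent: its own close() calls are where errors get handled.
    os.close(w)
    data = os.read(r, 10)
    os.close(r)
    _, status = os.waitpid(pid, 0)
    return data, status
```

The parent keeps its os.close() calls unguarded, so any error reported
there surfaces as an OSError in the process that can act on it.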

And now Linux is back in my good graces; only the man page is
misleading.


Marko


