[Python-Dev] PEP 433: Choose the default value of the new cloexec parameter

Charles-François Natali cf.natali at gmail.com
Fri Jan 25 09:56:52 CET 2013


Hello,

> I tried to list in the PEP 433 advantages and drawbacks of each option.
>
> If I recorded correctly opinions, the different options have the
> following supporters:
>
>  a) cloexec=False by default
>  b) cloexec=True by default: Charles-François Natali
>  c) configurable default value: Antoine Pitrou, Nick Coghlan, Guido van Rossum

You can actually count me in the cloexec=False camp, and against the
idea of a configurable default value. Here's why:

Why cloexec shouldn't be set by default:
- While it's really tempting to fix one of Unix's worst historical
decisions, I don't think we can set file descriptors cloexec by
default: this would break some applications (I don't think there would
be too many of them, but still), but most notably, this would break
POSIX semantics. If Python didn't expose POSIX syscalls and file
descriptors, but only high-level file streams/sockets/etc, then we
could probably go ahead, but now it's too late. Someone said earlier
on python-dev that many people use Python for prototyping, and indeed,
when using POSIX API, you expect POSIX semantics.

Why the default value shouldn't be tunable:
- I think it's useless: if the default cloexec behavior can be altered
(either by a command-line flag, an environment variable or a sys
module function), then libraries cannot rely on it and have to make
file descriptors cloexec on an individual basis, since the default
flag can be disabled. So it would basically be useless for the Python
standard library, and any third-party library. So the only use case is
for application writers that use raw exec() (since subprocess already
closes file descriptors > 2 by default, and AFAICT we don't expose a way to
create processes "manually" on Windows), but there I think they fall
into two categories: those who are aware of the problem of file
descriptor inheritance, and who therefore set their FDs cloexec
manually, and those who are not familiar with this issue, and who'll
never look up a sys.setdefaultcloexec() tunable (and if they do, they
might think: "Hey, if that's so nice, why isn't it on by default?
Wait, it might break applications? I'll just leave the default
then.").
- But most importantly, I think such a tunable flag is a really wrong
idea because it's a global tunable that alters the underlying
operating system semantics. Consider this code:
"""
import os

r, w = os.pipe()
if os.fork() == 0:
    # Child: are r and w still open in 'myprog'?
    os.execvp('myprog', ['myprog'])
"""

With a tunable flag, just by looking at this code, you have no way to
know whether the file descriptor will be inherited by the child
process. That would be introducing a hidden global variable silently
changing the semantics of the underlying operating system, and that's
just so wrong.
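
For illustration, compare that with setting the flag per descriptor,
where the call site itself says what a child will see after exec().
This is only a minimal sketch, assuming a POSIX system with the fcntl
module; set_cloexec() and is_cloexec() are hypothetical helper names,
not existing API:

```python
import fcntl
import os


def set_cloexec(fd, cloexec=True):
    """Set or clear FD_CLOEXEC on fd (hypothetical helper)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def is_cloexec(fd):
    """Return True if fd will be closed on exec() (hypothetical helper)."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


r, w = os.pipe()
set_cloexec(r)          # explicitly private: closed in the child on exec()
set_cloexec(w, False)   # explicitly inherited: stays open across exec()
```

Whatever the interpreter's default is, a reader of these two lines
knows which end of the pipe survives exec(). Note that this is not
atomic: another thread could fork() and exec() between os.pipe() and
set_cloexec().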

Sure, we do have global tunables:
"""
sys.setcheckinterval()
sys.setrecursionlimit()
sys.setswitchinterval()

hash_randomization
"""

But those alter "extralinguistic" behavior, i.e. they don't affect the
semantics of the language or underlying operating system in a way that
would break or change the behavior of a "conforming" program.

Although it's not as bad, just to belabor the point, imagine we
introduced a new function:
"""
sys.enable_integer_division(boolean)
Depending on the value of this flag, the division of two integers will
either yield a floating point or truncated integer value.
"""

Global variables are bad, hidden global variables are worse, and
hidden global variables altering language/operating system semantics
are evil :-)

What I'd like to see:
- Adding a "cloexec" parameter to file descriptor creating
functions/classes is fine, it will make it easier for a
library/application writer to create file descriptors cloexec,
especially in an atomic way.
- We should go over the standard library, and create FDs cloexec if
they're not handed over to the caller, either because they're
opened/closed before returning, or because the underlying file
descriptor is kept private (no fileno() method, although it's
relatively rare). That's the approach chosen by glibc, and it makes
sense: if another thread calls fork() while a thread is in the middle of
getpwnam(), you don't want to leak an open file descriptor to
/etc/passwd (or /etc/shadow).
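
The second point could look roughly like the sketch below: a library
function whose file descriptor never escapes to the caller, opened
close-on-exec in one step where the platform allows it. This assumes a
POSIX system; read_private() is a hypothetical helper, and the fallback
branch for platforms without os.O_CLOEXEC is an assumption about how one
might degrade (it is not atomic).

```python
import fcntl
import os
import tempfile

# os.O_CLOEXEC is not available on every platform; 0 means "not supported".
O_CLOEXEC = getattr(os, "O_CLOEXEC", 0)


def read_private(path, size=4096):
    """Read up to size bytes from path; the fd never leaves this function.

    Returns (data, cloexec). The cloexec flag is returned only so the
    sketch can be checked; a real helper would return just the data.
    """
    fd = os.open(path, os.O_RDONLY | O_CLOEXEC)
    try:
        if not O_CLOEXEC:
            # Non-atomic fallback: a fork()+exec() in another thread
            # could still slip in between open() and this call.
            flags = fcntl.fcntl(fd, fcntl.F_GETFD)
            fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
        cloexec = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
        return os.read(fd, size), cloexec
    finally:
        os.close(fd)


# Usage: read back a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
try:
    data, cloexec = read_private(tmp.name)
finally:
    os.unlink(tmp.name)
```

Because the flag is passed to open() itself, there is no window in
which the descriptor exists without close-on-exec set, which is the
"atomic way" mentioned in the first point.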

cf

