Why exception from os.path.exists()?

Fri Jun 1 11:58:42 EDT 2018

On 6/1/18 9:58 AM, Chris Angelico wrote:
> On Fri, Jun 1, 2018 at 11:41 PM, Richard Damon <Richard at damon-family.org> wrote:
>> The confusion is that in python, a string with an embedded null is
>> something pretty much like a string without an embedded null, so the
>> programmer might not think of it as being the wrong type. Thus we have
>> several options.
>>
>> 1) we can treat os.path.exists('foo\0bar') the same as
>> os.path.exists(1.5) and raise the exception.
> 1.5 raises TypeError, which is correct. But the type of "foo\0bar" is
> str, which is a perfectly valid type. ValueError is more correct here.
> And that's what currently happens.
>
> Possibly more confusing, though, is this:
>
> >>> os.path.exists(1)
> True
> >>> os.path.exists(2)
> True
> >>> os.path.exists(3)
> False
>
> I think it's testing that the file descriptors exist, because
> os.path.exists is defined in terms of os.stat, which can stat a path
> or an FD. So os.path.exists(fd) is True if that fd is open, and False
> if it isn't. But os.path.exists is not documented as accepting FDs.
> Accident of implementation or undocumented feature? Or maybe
> accidental feature?
>
>> 2) we can treat os.path.exists('foo\0bar') as specifying a file that can
>> never exist, bypass the system call, and return false.
> That's what's being proposed.
>
>> 3) we can process os.path.exists('foo\0bar') by just passing the string
>> to the system call, making it the same as os.path.exists('foo')
>>
>> The last is probably the one that we can say is likely wrong, but
>> arguments could be made for either of the first two.
> More than "likely wrong"; it's definitely wrong, and deceptively so. I
> don't think anyone would support this case.
>
> ChrisA

I would say that one way to look at it is that os.path.exists
fundamentally (at the OS level) expects a parameter whose 'type' is
either a nul-terminated string or a file descriptor (i.e. a fixed-width
integer). One issue we have is that these 'types' don't map directly to
Python types.

We can basically call os.path.exists with four different kinds of
parameter:

1) The parameter has a totally wrong type, one that just doesn't map to
any of the expected types. This gives a TypeError exception.

2) The parameter has a Python type that maps to the right OS 'type' but
has a value that prevents us from properly converting it to a
corresponding value of that type. This could be an integral value out of
range for the fixed-width type used, or a string which contains an
embedded nul. Currently these generate an OverflowError for an
out-of-range integer and a ValueError for a bad string.

3) The parameter can be mapped to the proper type but the value is
somehow illegal (the number fits the type, but isn't legal for a file
descriptor, or a string has a value that can't represent a real file).
In this case, os.path.exists doesn't try to validate the parameter but
just passes it along and returns a value based on the answer it gets.

4) The parameter represents a legal value of the right type, so as above
we pass the value along and get back the answer.
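
As a rough sketch of the four cases above (not taken from the Python
source), the hypothetical helper classify() below sorts an argument by
the exception that os.stat raises for it. It calls os.stat directly, on
the assumption that it is the call underneath os.path.exists, since
os.path.exists itself has handled some of these errors differently
across Python versions. Cases 3 and 4 cannot be told apart from the
outside, so they share a label.

```python
import os

def classify(arg):
    """Label which of the four cases os.stat(arg) falls into (illustrative only)."""
    try:
        os.stat(arg)
    except TypeError:
        # Case 1: the Python type maps to neither a path nor a file descriptor.
        return "case 1: wrong Python type"
    except (ValueError, OverflowError):
        # Case 2: str or int, but the value cannot be converted for the OS
        # (embedded nul, or an integer too large for the fd type).
        return "case 2: right Python type, value not convertible"
    except OSError:
        # Cases 3 and 4: the value was handed to the OS, which rejected it.
        return "case 3 or 4: reached the OS, which said no"
    # Cases 3 and 4: the value was handed to the OS, which accepted it.
    return "case 3 or 4: reached the OS, which said yes"

if __name__ == "__main__":
    for arg in (1.5, "foo\0bar", 2**100, os.getcwd()):
        print(repr(arg), "->", classify(arg))
```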

The fundamental question is about case 2. Should os.path.exists, having
been given a value of the right Python type that cannot be converted to
the type of the operating system parameter, identify this as an error
(as it currently does)? Or should it be changed to reason that if it
could somehow get that parameter to the OS, the OS would say that the
file doesn't exist, and so return False? I would say that if you accept
that, why shouldn't we also return False instead of raising a TypeError
when we pass a totally wrong type? After all, if we pass it a
dictionary, there certainly is no file like that in existence.

The real question is which behavior is more useful, which is most apt
to be the one we want, and which is the better building block for a
program.

One advantage of the current behavior is that it makes it trivial to
write a mypathexists that accepts strings with embedded nuls and
returns False: just call os.path.exists inside a try, catch the
ValueError, and return False. You could also extend it to catch
OverflowError and/or TypeError. On the other hand, if os.path.exists
swallows these errors and just returns False, then it is a lot more work
to make a wrapper that raises them. You would basically need to precheck
for bad values and raise. Then, if you move to a system that happens to
allow nuls in file names (and the Python code knows that), your wrapper
is wrong, because you had to build implementation knowledge into the
user code.
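
A minimal sketch of the two wrappers described above. The names
my_path_exists and strict_path_exists are made up for illustration. The
first is the easy direction (turning errors into False). The second is
the harder reverse direction, and it handles only the embedded-nul
case for str paths; its precheck is exactly the kind of platform
assumption that has to be baked into user code.

```python
import os

def my_path_exists(path):
    """Lenient wrapper: treat any unconvertible argument as 'does not exist'."""
    try:
        return os.path.exists(path)
    except (ValueError, OverflowError, TypeError):
        return False

def strict_path_exists(path):
    """Strict wrapper, needed only if os.path.exists swallows the error itself."""
    # Assumption hard-coded here: no supported platform allows nul in a
    # file name. If that ever stops being true, this check is wrong.
    if isinstance(path, str) and "\0" in path:
        raise ValueError("embedded nul character in path")
    return os.path.exists(path)
```

The lenient wrapper needs no knowledge of why the conversion failed,
while the strict one has to duplicate the check by hand.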

-- 
Richard Damon