Why exception from os.path.exists()?

Barry Scott barry at barrys-emacs.org
Mon Jun 4 06:16:21 EDT 2018



> On 1 Jun 2018, at 14:23, Paul Moore <p.f.moore at gmail.com> wrote:
> 
> On 1 June 2018 at 13:15, Barry Scott <barry at barrys-emacs.org> wrote:
>> I think the reason for the \0 check is that if the string is passed to the
>> operating system with the \0 you can get surprising results.
>> 
>> If \0 was not checked for you would be able to get True from:
>> 
>>        os.path.exists('/home\0ignore me')
>> 
>> This is because a posix system only sees '/home'.

It turns out that this is a limitation on Windows as well.
An embedded \0 is not allowed in file names on Windows, macOS or other POSIX systems.

> 
> So because the OS API can't handle filenames with \0 in (because that
> API uses null-terminated strings) Python has to special case its
> handling of the check. That's fine.
> 
>> Surely ValueError is reasonable?
> 
> Well, if the OS API can't handle filenames with embedded \0, we can be
> sure that such a file doesn't exist - so returning False is
> reasonable.

I think most of the file APIs check for \0 and raise ValueError on Python 3
and TypeError on Python 2.

os.path.exists() is not special, and I don't think it should be changed.
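
For code that wants the "no such file" answer rather than an exception, a
small wrapper is one option. This is only a sketch: path_exists is a
hypothetical helper name, and it assumes that os.stat() raises ValueError for
an embedded \0 on Python 3, as described above.

```python
import os


def path_exists(path):
    """Hypothetical helper: report False for paths the OS cannot represent."""
    try:
        os.stat(path)
    except ValueError:
        # Assumption: Python 3 rejects an embedded "\0" with ValueError
        # before the path ever reaches the operating system.
        return False
    except OSError:
        # Missing file, permission problem, and similar OS-level failures.
        return False
    return True


if __name__ == "__main__":
    print(path_exists("/home\0ignore me"))
    print(path_exists(os.getcwd()))
```

The caller gives up the ability to tell "malformed name" apart from "missing
file", which is the trade-off being debated in this thread.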

> 
>> Once you know that all of the string you provided is given to the operating
>> system it can then do whatever checks it sees fit to and return a suitable
>> result.
> 
> As the programmer, I don't care. The Python interpreter should take
> care of that for me, and if I say "does file 'a\0b' exist?" I want an
> answer. And I don't see how anything other than "no it doesn't" is
> correct. Python allows strings with embedded \0 characters, so it's
> possible to express that question in Python - os.path.exists('a\0b').
> What can be expressed in terms of the low-level (C-based) operating
> system API shouldn't be relevant.
> 
> Disclaimer - the Python "os" module *does* expose low-level
> OS-dependent functionality, so it's not necessarily reasonable to
> extend this argument to other functions in os. But it seems like a
> pretty solid argument in this particular case.
> 
>> As an aside Windows has lots of special filenames that you have to know about
>> if you are writing robust file handling. AUX, COM1, \this\is\also\COM1 etc.
> 
> I don't think that's relevant in this context.

I think it is. This started because the OP was surprised that they needed to check for \0.
There are related surprises waiting. I'm pointing out that there is more than \0 that a robust
piece of code will need to consider.
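
As a rough illustration of that kind of checking, here is a minimal sketch.
The helper name looks_risky and the RESERVED set are my own for this example,
not anything in the standard library; the set covers the classic DOS device
names (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9) and is not claimed to be a
complete list of Windows special cases.

```python
import ntpath

# Classic DOS device names; an illustrative list, not an exhaustive one.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def looks_risky(path):
    """Hypothetical check for names that tend to surprise file-handling code."""
    if "\0" in path:
        # An embedded NUL cannot be passed to the OS file APIs.
        return True
    # ntpath splits on both "\\" and "/" regardless of the host platform,
    # so the final component is checked wherever the name appears.
    name = ntpath.basename(path)
    # Assumption: a device name followed by an extension (e.g. "AUX.txt")
    # is treated as reserved too, so only the part before the first dot counts.
    stem = name.split(".")[0].rstrip(" ")
    return stem.upper() in RESERVED


if __name__ == "__main__":
    for candidate in ("AUX", r"\this\is\also\COM1", "/home\0ignore me", "notes.txt"):
        print(repr(candidate), looks_risky(candidate))
```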

Barry



