Why exception from os.path.exists()?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Jun 1 20:14:43 EDT 2018


On Sat, 02 Jun 2018 09:56:58 +1000, Chris Angelico wrote:

> On Sat, Jun 2, 2018 at 9:37 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Thu, 31 May 2018 17:43:28 +0000, Grant Edwards wrote:
>>
>>> Except on the platform in question filenames _don't_ contain an
>>> embedded \0.  What was passed was _not_ a path/filename.
>>
>> "/wibble/rubbish/nobodyexpectsthespanishinquistion" is not a pathname
>> on my system either, and os.path.exists() returns False for that. As it
>> is supposed to.
>>
>> I'd be willing to bet that:
>>
>> import secrets  # Python 3.6+
>> s = "/" + secrets.token_hex(1024) + "/spam"
>>
>> is not a pathname on any computer in the world. (If it is even legal.)
>> And yet os.path.exists(s) returns False.
> 
> With both of these, the path cannot exist because its first component
> does not exist.

Since /wibble doesn't exist, neither does /wibble/a\0b


py> os.path.exists("/wibble")
False
py> os.path.exists("/wibble/a\0b")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/storage/torrents/torrents/python/Python-3.6.4/Lib/genericpath.py", line 19, in exists
    os.stat(path)
ValueError: embedded null byte


Oops.
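
For anyone who wants the False-returning behaviour regardless, here is a
minimal sketch of a wrapper. The name exists_or_false is hypothetical
(it is not part of os.path), and it assumes ValueError is the only extra
exception worth swallowing. As far as I know, Python 3.8 and later make
os.path.exists() itself return False for such paths, so the wrapper only
matters on older versions; os.stat() still raises.

```python
import os


def exists_or_false(path):
    """Hypothetical helper: like os.path.exists(), but never raises ValueError."""
    try:
        # os.stat() raises ValueError for an embedded null byte, and
        # OSError when the path is missing or otherwise inaccessible.
        os.stat(path)
    except (OSError, ValueError):
        return False
    return True


print(exists_or_false("/wibble/a\0b"))
```

The design choice is simply to treat "cannot be a path" the same way as
"is not a path": both answer the question "does it exist?" with no.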


> Absent a /wibble on your system, the entire long path is
> unable to exist. That is a natural consequence of the hierarchical
> structure of file systems. I'm fairly sure 2KB of path is valid on all
> major OSes today, 

But probably not 2K in a single path component.

But that's not really my point: I was responding to Grant, who argued 
that a string containing \0 is not a pathname (or filename) and that 
ValueError is therefore the correct response. But there are lots of 
things which aren't pathnames, or which *cannot be* pathnames, and yet 
os.path.exists() returns False for them. What makes \0 so special?
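
A short sketch of the comparison, reusing the two strings from this
thread. It assumes a system where neither string can name an existing
file; the variable names are mine, purely for illustration.

```python
import os
import secrets

# A single path component of 2048 hex digits, as in the earlier example.
long_component = "/" + secrets.token_hex(1024) + "/spam"

# One million "/a" components, far longer than any path limit I know of.
deep_path = "/a" * 1000000

for candidate in (long_component, deep_path):
    # os.stat() fails with OSError for these, which os.path.exists()
    # turns into False rather than letting the exception propagate.
    print(len(candidate), os.path.exists(candidate))
```

Neither string is usable as a pathname, and neither raises.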


> which means that it's exactly the same as /wibble -
> the first component doesn't exist, ergo the path doesn't exist.
> 
>> The maximum number of file components under POSIX is (I believe) 256.
>> And yet:
>>
>> py> os.path.exists("/a"*1000000)
>> False
>>
>> "/a" by one million cannot possibly be a path under POSIX.
> 
> I can't actually find that listed anywhere. Citation needed. 

https://eklitzke.org/path-max-is-tricky



> But
> assuming you're right, POSIX is still a set of minimum requirements -
> not maximums, to my knowledge. 

It isn't even a set of minimum requirements. "<" is legal under POSIX, 
but not Windows.



> If some operating system permits longer
> paths with more components, it won't be non-compliant on that basis. So
> it's still plausible to ask "does this path exist", and it's perfectly
> correct to look at the first "/a/" and check if there's anything named
> "a" in your root directory, and return False upon finding none. The
> question is sane, unlike os.path.exists([]).

Correct.

It is just as sane to ask whether the path "a\0b" exists. Even if it is 
illegal on POSIX, just as "<" is illegal on Windows, the question is 
still sane, and the answer you get back should be False.




-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



