Why exception from os.path.exists()?

Barry Scott barry at barrys-emacs.org
Fri Jun 1 08:15:38 EDT 2018


On Thursday, 31 May 2018 14:03:01 BST Marko Rauhamaa wrote:
> Chris Angelico <rosuav at gmail.com>:
> > On Thu, May 31, 2018 at 10:03 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
> >> This surprising exception can even be a security issue:
> >>    >>> os.path.exists("\0")
> >>    Traceback (most recent call last):
> >>      File "<stdin>", line 1, in <module>
> >>      File "/usr/lib64/python3.6/genericpath.py", line 19, in exists
> >>        os.stat(path)
> >>    ValueError: embedded null byte
> > 
> > [...]
> > 
> > A Unix path name cannot contain a null byte, so what you have is a
> > fundamentally invalid name. ValueError is perfectly acceptable.
> 
> At the very least, that should be emphasized in the documentation. The
> pathname may come from an external source. It is routine to check for
> "/", "." and ".." but most developers (!?) would not think of checking
> for "\0". That means few test suites would catch this issue and few
> developers would think of catching ValueError here. The end result is
> unpredictable.

I think the reason for the \0 check is that passing a string containing \0 to 
the operating system can give surprising results.

If \0 were not checked for, you would be able to get True from:

	os.path.exists('/home\0ignore me')

This is because a POSIX system treats \0 as the end of the string, so it only 
sees '/home'.
Surely ValueError is reasonable?

Once you know that the whole string you provided reaches the operating 
system, the OS can apply whatever checks it sees fit and return a suitable 
result.
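
For code that takes path names from an external source, one option is a small
wrapper that treats a rejected name as "does not exist". This is only a sketch:
safe_exists is a made-up helper name, not part of the os module, and it assumes
you are happy to fold ValueError into a plain False.

```python
import os


def safe_exists(path):
    """Hypothetical helper: report False for names the OS layer rejects."""
    try:
        # os.stat() raises ValueError for a str with an embedded null byte,
        # before anything is handed to the operating system.
        os.stat(path)
    except (OSError, ValueError):
        return False
    return True


print(safe_exists('/home\0ignore me'))
print(safe_exists(os.getcwd()))
```

Whether hiding the ValueError is wise depends on the caller; logging the
rejected name may be more useful than silently returning False.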

As an aside, Windows has lots of special filenames that you have to know about 
if you are writing robust file handling: AUX, COM1, \this\is\also\COM1, etc.
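
A rough way to screen for such names is to look at the last path component.
This is a sketch under assumptions: the RESERVED set below is my recollection
of the legacy DOS device names and may not be complete, and looks_reserved is a
hypothetical helper, not a standard library function.

```python
import ntpath

# Assumed list of legacy device names; not taken from any authoritative source.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def looks_reserved(path):
    """Hypothetical check: is the final component a reserved device name?"""
    # ntpath applies Windows path rules on any platform.
    name = ntpath.basename(path)
    # Compare the part before the first dot, ignoring case and trailing spaces.
    stem = name.split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED


print(looks_reserved("AUX"))
print(looks_reserved(r"\this\is\also\COM1"))
print(looks_reserved(r"C:\temp\notes.txt"))
```

The directory part does not matter to this check, which is why
\this\is\also\COM1 is flagged just like a bare COM1.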

Barry

> 
> 
> Marko






