Remove directory tree without following symlinks

Nobody nobody at nowhere.invalid
Sat Apr 23 12:29:06 EDT 2016


On Sat, 23 Apr 2016 00:56:33 +1000, Steven D'Aprano wrote:

> I want to remove a directory, including all files and subdirectories under
> it, but without following symlinks. I want the symlinks to be deleted, not
> the files pointed to by those symlinks.

Note that this is non-trivial to do securely, i.e. when an adversary has
write permission on any of the directories involved. There is a race
between checking whether a name refers to a directory and recursing into
it: in that window the adversary can replace the directory with a symlink.
The process can thereby be tricked into deleting any directory tree for
which it has the appropriate permissions.
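
To make the race concrete, here is a minimal sketch of the kind of
check-then-recurse code that is vulnerable. The name naive_rmtree is
hypothetical and the function is shown as an example of what not to do,
not as anyone's actual implementation:

```python
import os


def naive_rmtree(path):
    """Race-prone removal, shown only to illustrate the check/use gap."""
    for name in os.listdir(path):
        full = os.path.join(path, name)
        # Check: is this entry a real directory rather than a symlink?
        if os.path.isdir(full) and not os.path.islink(full):
            # Window: an adversary with write access to `path` can replace
            # `full` with a symlink to another directory right here.
            # Use: the recursive call then lists and deletes *through*
            # that symlink, because every path lookup resolves it.
            naive_rmtree(full)
        else:
            os.unlink(full)  # removes a symlink itself, not its target
    os.rmdir(path)
```

Without an adversary this behaves as wanted (symlinks are unlinked, their
targets are left alone); the problem is only the gap between the test and
the recursive call.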

The solution requires two things:

1. Always chdir() into each directory and remove entries using their
plain filename, rather than trying to remove entries from a higher-level
directory using a relative path.

2. When chdir()ing into each subdirectory, verify that you ended up where
you expected, e.g.:

	import os

	st1 = os.stat(".")      # identity of the directory we are leaving
	os.chdir(subdir)
	st2 = os.stat("..")     # parent of wherever we actually landed
	if st1.st_dev != st2.st_dev or st1.st_ino != st2.st_ino:
	    raise SomeKindOfException()

If the test fails, it means that the directory you just chdir()d into
isn't actually a subdirectory of the one you just left, e.g. because the
directory entry was replaced between checking it and chdir()ing into it.
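
Putting points 1 and 2 together, a rough sketch might look like the
following. The names DirectoryChangedError, same_dir and remove_contents
are made up for illustration, and the second check after chdir("..") is
my own assumption rather than part of the recipe above. Treat it as a
starting point, not as audited code:

```python
import os
import stat


class DirectoryChangedError(Exception):
    """Hypothetical stand-in for SomeKindOfException."""


def same_dir(a, b):
    """True if two stat results refer to the same directory."""
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def remove_contents():
    """Empty the current working directory, using plain filenames only."""
    here = os.stat(".")
    for name in os.listdir("."):
        if stat.S_ISDIR(os.lstat(name).st_mode):
            os.chdir(name)
            # Point 2: the parent of where we landed must be where we were.
            if not same_dir(here, os.stat("..")):
                raise DirectoryChangedError(name)
            remove_contents()
            os.chdir("..")
            # Assumption: also confirm that ".." brought us back to the
            # same directory before removing anything else from it.
            if not same_dir(here, os.stat(".")):
                raise DirectoryChangedError(name)
            os.rmdir(name)
        else:
            # Point 1: plain name; a symlink is unlinked, not followed.
            os.unlink(name)
```

The caller would chdir() into the top-level directory first, call
remove_contents(), then step back out and rmdir() the top-level directory
itself. If the exception is raised, the working directory is left wherever
the process ended up, so the caller should not continue blindly.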

On Linux (and other Unix systems that provide it), an alternative is to
use fchdir() rather than chdir(). It changes to a directory specified by
an open file descriptor for that directory rather than by name. Provided
that the directory was open()ed without any race condition (e.g. using
O_NOFOLLOW), subsequent fstat() and fchdir() calls are guaranteed to use
the same directory regardless of any filesystem changes.
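
A sketch of the descriptor-based variant, assuming a platform where
os.fchdir(), os.O_NOFOLLOW and os.O_DIRECTORY are all available. The
function names are hypothetical. Note that O_NOFOLLOW only protects the
final component of the path being opened, which is why each level is
opened by plain name from inside its parent:

```python
import os
import stat

# Fail rather than follow if the name is a symlink or not a directory.
DIR_FLAGS = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW


def remove_contents_fd(dir_fd):
    """Empty the directory that dir_fd is open on."""
    os.fchdir(dir_fd)
    for name in os.listdir("."):
        if stat.S_ISDIR(os.lstat(name).st_mode):
            # If the entry was swapped for a symlink after the lstat(),
            # this open() raises OSError instead of following it.
            child_fd = os.open(name, DIR_FLAGS)
            try:
                remove_contents_fd(child_fd)
            finally:
                os.close(child_fd)
            # Return via the descriptor, not via "..".
            os.fchdir(dir_fd)
            os.rmdir(name)
        else:
            os.unlink(name)  # removes a symlink itself, not its target


def remove_tree_fd(path):
    """Remove the tree at path, then restore the working directory."""
    saved_fd = os.open(".", os.O_RDONLY)
    try:
        root_fd = os.open(path, DIR_FLAGS)
        try:
            remove_contents_fd(root_fd)
        finally:
            os.close(root_fd)
    finally:
        os.fchdir(saved_fd)
        os.close(saved_fd)
    os.rmdir(path)
```

Because the way back up is fchdir() on a descriptor that is already open,
no st_dev/st_ino comparison is needed here; renaming or moving directories
underneath the process does not change what the descriptor refers to.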



