Rename file without overwriting existing files

Jussi Piitulainen jussi.piitulainen at helsinki.fi
Thu Feb 9 07:54:41 EST 2017


Steve D'Aprano writes:

> On Mon, 30 Jan 2017 09:39 pm, Peter Otten wrote:
>
>> >>> def rename(source, dest):
>> ...     os.link(source, dest)
>> ...     os.unlink(source)
>> ...
>> >>> rename("foo", "baz")
>> >>> os.listdir()
>> ['bar', 'baz']
>> >>> rename("bar", "baz")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "<stdin>", line 2, in rename
>> FileExistsError: [Errno 17] File exists: 'bar' -> 'baz'
>
>
> Thanks Peter!
>
> That's not quite ideal, as it isn't atomic: it is possible that the link
> will succeed, but the unlink won't. But I prefer that over the alternative,
> which is over-writing a file and causing data loss.
>
> So to summarise, os.rename(source, destination):
>
> - is atomic on POSIX systems, if source and destination are both on the 
>   same file system;
>
> - may not be atomic on Windows?
>
> - may over-write an existing destination on POSIX systems, but not on
>   Windows;
>
> - and it doesn't work across file systems.
>
> os.replace(source, destination) is similar, except that it may over-write an
> existing destination on Windows as well as on POSIX systems.
>
>
> The link/unlink trick:
>
> - avoids over-writing existing files on POSIX systems at least;
>
> - but maybe not Windows?
>
> - isn't atomic, so in the worst case you end up with two links to
>   the one file;
>
> - but os.link may not be available on all platforms;
>
> - and it won't work across file systems.
>
>
> Putting that all together, here's my attempt at a version of file rename
> which doesn't over-write existing files:
>
>
> import os
> import shutil
>
> def rename(src, dest):
>     """Rename src to dest only if dest doesn't already exist (almost)."""
>     if hasattr(os, 'link'):
>         try:
>             os.link(src, dest)
>         except OSError:
>             pass
>         else:
>             os.unlink(src)
>             return
>     # Fallback to an implementation which is vulnerable to a 
>     # Time Of Check to Time Of Use bug.
>     # Try to reduce the window for this race condition by minimizing
>     # the number of lookups needed between one call and the next.
>     move = shutil.move
>     if not os.path.exists(dest):
>         move(src, dest)
>     else:
>         raise shutil.Error("Destination path '%s' already exists" % dest)
>
>
>
> Any comments? Any bugs? Any cross-platform way to slay this TOCTOU bug once
> and for all?

To claim the filename before crossing a filesystem boundary, how about:

1) create a temporary file in the target directory (tempfile.mkstemp)

2) link the temporary file to the target name (in the same directory)

3) unlink the temporary name

4) now it should be safe to move the source file to the target name

5) set permissions and whatever other attributes there are?
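
A rough sketch of steps 1 to 4, in case it helps. The function name
rename_claiming_name is made up for illustration, and I am assuming that
os.link is available and that shutil.move may replace the empty
placeholder at the target name. Step 5 is left to shutil.move, which
either renames the file or copies it together with its metadata.

```python
import os
import shutil
import tempfile


def rename_claiming_name(src, dest):
    """Move src to dest; raise FileExistsError if dest already exists.

    Hypothetical helper: claims the target name with a hard link first.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Step 1: create a temporary file in the target directory.
    fd, tmp = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    try:
        # Step 2: claim the target name. os.link refuses to replace an
        # existing name and raises FileExistsError instead.
        os.link(tmp, dest)
    finally:
        # Step 3: drop the temporary name whether or not the link worked.
        os.unlink(tmp)
    try:
        # Step 4: dest is now an empty placeholder that we created, so
        # moving over it loses nothing.
        shutil.move(src, dest)
    except Exception:
        # Release the claimed name again if the move fails.
        os.unlink(dest)
        raise
```

Between steps 2 and 4 the target name exists as an empty file, so
another process that looks at it in that window sees an empty file
rather than the final contents.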

Or maybe copy the source file to the temporary name, link the copy to
the target name, unlink the temporary name, and unlink the source file.
If the link step fails, unlink the temporary name but leave the source
file in place.
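
The copy-first variant might look something like this. Again only a
sketch: rename_via_copy is a made-up name, and I am assuming os.link is
available and that shutil.copy2 carries over enough of the metadata.

```python
import os
import shutil
import tempfile


def rename_via_copy(src, dest):
    """Copy src to a temporary name, then link the copy to dest.

    Hypothetical helper: raises FileExistsError, and keeps src, if dest
    already exists.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    try:
        # Copy contents and metadata to the temporary name.
        shutil.copy2(src, tmp)
        # Link the finished copy to the target name. os.link raises
        # FileExistsError if the target name is already taken.
        os.link(tmp, dest)
    finally:
        # The temporary name goes away in every case.
        os.unlink(tmp)
    # Reached only if both the copy and the link succeeded.
    os.unlink(src)
```

Here the target name only ever appears with the complete contents, at
the cost of always copying, even when source and target are on the same
file system.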


