[Tutor] file exists question

Tue Mar 10 01:44:55 CET 2015

On 10 March 2015 at 00:22, Steven D'Aprano <steve at pearwood.info> wrote:
> On Mon, Mar 09, 2015 at 04:50:11PM +0000, Alan Gauld wrote:
>>
>> Somebody posted a question asking how to find out if a file
>> exists. The message was in the queue and I thought I'd approved
>> it but it hasn't shown up yet. Sorry to the OP if I've messed up.
>>
>> The answer is that you use the os.path.exists() function.
>> It takes a path as an argument which can be relative to
>> the cwd or absolute.
>
> os.path.exists is a little bit of an anti-pattern though. It has two
> problems: there is a race condition here, waiting to bite you. Just
> because the file exists now, doesn't mean it will exist a millisecond
> later when you try to open it. Also, even if the file exists, there is
> no guarantee you can open it.
>
>
> Code like this is buggy:
>
>
> filename = raw_input("What file do you want to open? ")
> if os.path.exists(filename):
>     with open(filename) as f:
>         text = f.read()
>     process(text)

The potential bug is often a non-issue in simple use cases. Also it
does read a little nicer to use an if statement than catching an
exception. You may disagree but I find that in the earlier stages of
teaching programming it's inappropriate to try and ensure that every
suggestion is suitable for production code.
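
For comparison, the exception-catching form might look something like
the sketch below. The name read_text_or_none is made up for
illustration, and returning None on failure is just one possible
choice:

```python
def read_text_or_none(filename):
    # Hypothetical helper: skip the existence check, open the file
    # directly, and let open() report any problem.
    try:
        with open(filename) as f:
            return f.read()
    except (IOError, OSError):
        # Covers a missing file as well as one that exists but
        # cannot be opened (permissions, it is a directory, etc.).
        return None
```

The caller then tests the result for None instead of calling
os.path.exists first.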

<snip>
>
> The only use I have found for os.path.exists is to try to *avoid* an
> existing file. (Even here, it is still subject to one of the above
> problems: just because a file *doesn't* exist now, a millisecond later
> it might.) For example, automatically numbering files rather than
> overwriting them:
>
>
> filename = raw_input("Save file as...? ")
> name, ext = os.path.splitext(filename)
> n = 1
> while os.path.exists(filename):
>     # automatically pick a new name
>     filename = "%s~%d%s" % (name, n, ext)
>     n += 1
> try:
>     with open(filename, 'w') as f:
>         f.write(text)
> except (IOError, OSError):
>     pass
>
>
> I'm not 100% happy with that solution, because there is a risk that some
> other process will create a file with the same name in the gap between
> calling os.path.exists and calling open, but I don't know how to solve
> that race condition. What I really want is an option to open() that only
> opens a new file, and fails if the file already exists.

If you go to the file descriptor level then you can do

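def open_new(filename):
    # O_EXCL makes the call fail if the file already exists
    fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    # the descriptor is write-only, so wrap it in a write-mode file object
    return os.fdopen(fd, 'w')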

which I think is what you want.
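
A rough sketch of how that could replace the os.path.exists loop in
the numbering example. The name save_with_unique_name is made up for
illustration, and I'm assuming that "file already exists" is the only
error worth retrying on:

```python
import errno
import os

def open_new(filename):
    # O_EXCL makes the call fail if the file already exists
    fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    return os.fdopen(fd, 'w')

def save_with_unique_name(filename, text):
    # Hypothetical helper: try each candidate name in turn and let
    # the OS decide atomically whether it is free.
    name, ext = os.path.splitext(filename)
    candidate = filename
    n = 1
    while True:
        try:
            f = open_new(candidate)
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # some other failure, e.g. permissions
            # name is taken: pick the next numbered name
            candidate = "%s~%d%s" % (name, n, ext)
            n += 1
        else:
            with f:
                f.write(text)
            return candidate
```

Since the existence check and the creation happen in a single os.open
call, there is no gap for another process to slip into. On Python 3.3
and later, open(filename, 'x') gives the same exclusive-creation
behaviour without dropping to file descriptors.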

Oscar