When will os.remove fail?

eryk sun eryksun at gmail.com
Tue Mar 14 13:12:38 EDT 2017


On Tue, Mar 14, 2017 at 11:32 AM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> On Mon, 13 Mar 2017 08:47 pm, eryk sun wrote:
>
>> One hurdle to getting delete access is the sharing mode. If there are
>> existing File objects that reference the file, they all have to share
>> delete access. Otherwise the open fails with a sharing violation. This
>> is often the show stopper because the C runtime (and thus CPython,
>> usually) opens files with read and write sharing but not delete
>> sharing.
>
> This is the famous "can't delete a file which is open under Windows"
> problem, am I right?

If you aren't allowed shared delete access, the error you'll get is a
sharing violation (ERROR_SHARING_VIOLATION, 32). If the file security
doesn't allow delete access, then the error is access denied
(ERROR_ACCESS_DENIED, 5). In Python both of these raise a
PermissionError, so telling them apart requires checking the
exception's winerror attribute.
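
As a rough sketch of that check, something like the following could
label a failed os.remove. The function name and the returned labels
are made up for illustration; only the two error codes come from the
discussion above. getattr is used because the winerror attribute is
only present on Windows.

```python
# Windows system error codes mentioned above.
ERROR_ACCESS_DENIED = 5
ERROR_SHARING_VIOLATION = 32


def describe_remove_failure(exc):
    """Return a short label for an OSError raised by os.remove.

    Hypothetical helper: the labels are arbitrary strings.
    """
    # winerror only exists on Windows, so fall back to None elsewhere.
    winerror = getattr(exc, "winerror", None)
    if winerror == ERROR_SHARING_VIOLATION:
        return "sharing violation"
    if winerror == ERROR_ACCESS_DENIED:
        return "access denied"
    if isinstance(exc, PermissionError):
        return "permission error (no winerror)"
    return "other"
```

It would be called from an "except OSError as exc:" block around
os.remove.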

Two more cases are (1) a read-only file and (2) a file that's
memory-mapped as code or data. In these two cases you can open the
file with delete access, which allows you to rename it, but setting
the delete disposition fails with access denied.

If you have the right to set the file's attributes, then you can at
least work around the read-only problem by calling
os.chmod(filename, stat.S_IWRITE).

For the mapped file case the best you can do is rename the file to
another directory. At least that gets it out of the way if you're
doing an upgrade to a running program. This workaround is rarely used
-- I think mostly due to people not knowing that it's possible.
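
A minimal sketch of the rename workaround might look like this. The
helper name, the holding directory and the ".old" naming scheme are
assumptions for the example. It also assumes the holding directory is
on the same volume as the file, since a rename can't move a file
across volumes.

```python
import os
import uuid


def move_aside(path, holding_dir):
    """Rename path into holding_dir under a unique name.

    Hypothetical helper; returns the new location of the file.
    """
    os.makedirs(holding_dir, exist_ok=True)
    # A unique suffix avoids clashing with files moved aside earlier.
    new_name = "{}.{}.old".format(os.path.basename(path), uuid.uuid4().hex)
    target = os.path.join(holding_dir, new_name)
    # This is a rename, not a copy, so both paths must be on one volume.
    os.replace(path, target)
    return target
```

The moved file still has to be cleaned up later, e.g. at the next
start of the upgraded program.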

> I take it that you *can* delete open files, but only if the process that
> opens them takes special care to use "delete sharing". Is that correct?

Unix programmers can't simply use delete sharing as a way to get
familiar semantics on Windows. A file pending delete is in limbo on
Windows. It can't be opened again, and as long as there are File
objects that reference the common file control block (FCB), it can't
be unlinked. One of those File objects may even be used to restore the
file (i.e. unset the delete disposition) if it has delete access.
Notably, this limbo file prevents the containing directory from being
deleted.

> I don't have a machine to test this on, but I'd like to deal with this
> situation in my (cross-platform) code. If I have one Python script do this:
>
> with open("My Documents/foo") as f:
>     time.sleep(100000)
>
> and while it is sleeping another script does this:
>
> os.remove("My Documents/foo")
>
> what exception will I get? Is that unique to this situation, or is a generic
> exception that could mean anything?

This will fail with a sharing violation, i.e. winerror == 32.
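
For anyone who wants to check this on their own machine, a
single-process version of the experiment is enough, since the file is
opened without delete sharing either way. This is just a sketch with a
made-up function name. It returns the winerror value if the removal is
refused, or None if the removal succeeds, which is what happens on a
POSIX system.

```python
import os
import tempfile


def remove_while_open():
    """Try to remove a file that is still open.

    Returns the winerror of the PermissionError, or None if the
    removal succeeded.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path):
            try:
                os.remove(path)
            except PermissionError as exc:
                # On Windows this is where the sharing violation shows up.
                return getattr(exc, "winerror", None)
        return None  # the removal went through
    finally:
        # Clean up if the file survived the attempt.
        if os.path.exists(path):
            os.remove(path)
```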

> My aim is to do:
>
> try:
>     os.remove(thefile)
> except SomeError:
>     # Could be a virus checker or other transient process.
>     time.sleep(0.2)
>     os.remove(thefile)  # try again
>
> Does that seem reasonable to you, as a Windows user?

If you get an access-denied error, then you can try to remove the
read-only attribute if it's set. Otherwise you can try to rename the
file to get it out of the way. But if you're locked out by a sharing
violation, there isn't anything you can do short of terminating the
offending program.

Virus scanners and other filter drivers generally aren't an immediate
problem with deleting. They share all access. But they might keep a
file from being immediately unlinked, which could cause problems in
functions like shutil.rmtree. Removing the parent directory can fail
if the file hasn't been unlinked yet.
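
One common way to cope with a delayed unlink is to retry the directory
removal a few times with a short pause. The following is only a
sketch: the function name, the number of attempts and the delay are
arbitrary choices, and it retries on any OSError rather than trying to
identify the exact cause.

```python
import os
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.1):
    """Call shutil.rmtree, retrying briefly if it fails with OSError.

    Hypothetical helper; attempts and delay are arbitrary defaults.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError:
            if not os.path.exists(path):
                return  # the tree is gone, nothing left to do
            if attempt == attempts - 1:
                raise  # out of retries, let the caller see the error
            time.sleep(delay)
```

This doesn't help with a genuine sharing violation from a long-running
program. It only gives a pending unlink time to complete.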

>> In user mode, a kernel object such as a File instance is referenced as
>> a handle.
>
> Out of curiosity, I only know the term "handle" from classic Macintosh
> (pre-OS X) where a handle was a managed pointer to a pointer to a chunk of
> memory. Being managed, the OS could move the memory around without the
> handles ending up pointing to garbage. Is that the same meaning in Windows
> land?

That sounds similar to an HGLOBAL handle used with
GlobalAlloc/GlobalLock. This was a feature from 16-bit Windows, and
it's kept around for compatibility with older APIs that still use it.

I was talking about handles for kernel objects. Every process has a
table of handle entries that's used to refer to kernel objects. These
objects are allocated in the shared kernel space (the upper range of
virtual memory), so user-mode code can't refer to them directly.
Instead a program passes a handle to a system call, and kernel code
and drivers use the object manager to look up the pointer reference.

A kernel object has an associated type object (e.g. the "File" type),
and there's a "Type" metatype like in Python. The list of supported
methods is pretty basic:

    DumpProcedure, OpenProcedure, ParseProcedure,
    SecurityProcedure, QueryNameProcedure, OkayToCloseProcedure,
    CloseProcedure, DeleteProcedure

The close method is called when a process closes a handle to the
object. It gets passed a reference to the Process that's closing the
handle and also the object's handle count, both in the process and
across all processes, so that it can do whatever cleanup is required
when the last handle is closed in the process or in the system. The
delete method is called when the object is no longer referenced. That
count is of pointer references, but every handle also holds a pointer
reference, so it implies that no handles remain either.

For the File type, these methods reference the associated Device and
call into the device stack with an I/O request packet (IRP). For
closing a handle the major function is IRP_MJ_CLEANUP. For deleting
the object the major function is IRP_MJ_CLOSE. The Driver object [1]
has a MajorFunction table of function pointers for dispatching IRPs by
major function code.

If a File object [2] refers to a file-system file/directory (as
opposed to a device), then the file-system context is the shared file
control block (FCB) and usually a private context control block (CCB),
which are respectively the object members FsContext and FsContext2. If
the FCB is flagged as delete-on-close, then when the last reference is
cleaned up, the file system does the work of unlinking and whatever
else it has to do to really delete a file. If the CCB is flagged as
delete-on-close (e.g. CreateFile was called with
FILE_FLAG_DELETE_ON_CLOSE), then when the File object is cleaned up it
transfers this flag to the shared FCB.

If you want to see this in practice, look at the published cleanup
code for the fastfat driver [3].
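
For a much simpler picture than the driver source, the bookkeeping
described above can be approximated with a toy model in Python. This
is only an illustration. ToyVolume and its methods are invented for
the example, the CCB is left out, the "handle" is just the file name,
and raising PermissionError for a file that is pending delete is a
choice made for the model.

```python
class ToyVolume:
    """Toy model of delete-pending semantics; not real file-system code."""

    def __init__(self):
        self.directory = set()       # linked names (what survives a crash)
        self.open_count = {}         # name -> open File objects (in memory)
        self.delete_pending = set()  # in-memory delete disposition flags

    def create(self, name):
        self.directory.add(name)

    def open(self, name):
        if name not in self.directory:
            raise FileNotFoundError(name)
        if name in self.delete_pending:
            # A file pending delete can't be opened again.
            raise PermissionError(name)
        self.open_count[name] = self.open_count.get(name, 0) + 1
        return name  # the "handle" is just the name in this model

    def set_delete_disposition(self, handle, delete=True):
        if handle not in self.open_count:
            raise ValueError("not an open handle")
        if delete:
            self.delete_pending.add(handle)
        else:
            self.delete_pending.discard(handle)

    def close(self, handle):
        self.open_count[handle] -= 1
        if self.open_count[handle] == 0:
            del self.open_count[handle]
            if handle in self.delete_pending:
                # Last reference cleaned up: now the name is unlinked.
                self.delete_pending.discard(handle)
                self.directory.discard(handle)

    def crash(self):
        # Pulling the plug loses in-memory state; linked names remain.
        self.open_count.clear()
        self.delete_pending.clear()
```

In this model a file with two open references stays listed in the
directory after one of them sets the delete disposition and closes,
and it only disappears when the second one is closed.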

[1]: https://msdn.microsoft.com/en-us/library/ff544174
[2]: https://msdn.microsoft.com/en-us/library/ff545834
[3]: https://github.com/Microsoft/Windows-driver-samples/blob/master/filesys/fastfat/cleanup.c

> In principle, I could say:
>
> delete file X
>
> which then returns immediately, and if I try to open(X) it will fail. But I
> can still see it if I do a dir() on the parent directory?

Yes.

> Eventually the last reference to X will go away, and then it is unlinked.
> What happens if I pull the plug in the meantime? Will the file magically
> come back on rebooting?

Setting the delete disposition is just a flag in the in-memory FCB
structure, so if you pull the plug the file hasn't actually been
unlinked. It's still there.

Bear in mind that sharing delete access is rare on Windows. Normally
when a file is deleted there's only a single File object referencing
it, such as the one created when DeleteFile calls NtOpenFile.
DeleteFile then immediately calls NtSetInformationFile to set the
delete disposition. When it closes the handle, this triggers the
sequence CloseProcedure => IRP_MJ_CLEANUP, DeleteProcedure =>
IRP_MJ_CLOSE, which actually unlinks the file.

>> Finally, I'm sure most people are familiar with the read-only file
>> attribute. If this attribute is set you can still open a file with
>> delete access to rename it, but setting the delete disposition will
>> fail with access denied.
>
> That was actually the situation I was thinking about when I started on this
> question. From Python code, I was considering writing something like this:
>
> def delete_readonly(thefile):
>     try:
>         os.remove(thefile)
>     except ReadOnlyFileError:  # what is this really?

It's a PermissionError with winerror == 5, i.e. access denied. As
mentioned above, you can use os.chmod to remove the read-only file
attribute.
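
Putting that together, the function could be sketched roughly as
follows. This is one possible shape, not a definitive implementation.
It assumes that on Windows only an access-denied error is worth
retrying after os.chmod, and it re-raises any other winerror such as a
sharing violation.

```python
import os
import stat

ERROR_ACCESS_DENIED = 5  # Windows system error code


def delete_readonly(thefile):
    """Remove thefile, clearing the read-only attribute if that blocks it."""
    try:
        os.remove(thefile)
    except PermissionError as exc:
        # winerror only exists on Windows. A sharing violation (32)
        # would not be helped by chmod, so let it propagate.
        winerror = getattr(exc, "winerror", None)
        if winerror is not None and winerror != ERROR_ACCESS_DENIED:
            raise
        os.chmod(thefile, stat.S_IWRITE)
        os.remove(thefile)  # may still fail, e.g. for a mapped file
```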


