[Python-Dev] Ext4 data loss

"Martin v. Löwis" martin at v.loewis.de
Tue Mar 10 23:03:08 CET 2009


> If I understand the post properly, it's up to the app to call fsync(),

Correct.

> and it's only necessary when you're doing one of the rename dances, or
> updating a file in place. 

No. In general it is necessary whenever you want to be sure that the
data is on disk, even if the power is lost. So even if you write a file
(say, a .pyc) only once - if the power goes out and comes back, your
.pyc may be corrupted, because the file system may have chosen to flush
the metadata to disk, but not the actual data (or only parts of it).
This can happen even if the close(2) operation was successful.
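
For instance (untested sketch; the file name and payload are just
placeholders), getting even a single write safely onto disk in Python
takes an explicit flush plus an fsync:

import os

data = b"compiled bytecode..."          # placeholder payload

with open("module.pyc", "wb") as f:
    f.write(data)
    f.flush()             # move Python's userspace buffer to the kernel
    os.fsync(f.fileno())  # ask the kernel to push it to the disk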

In the specific case of config files, that's unfortunate because you
then can't revert to the old state, either - that may be gone as well.
Ideally, you want transactional updates - after a crash you get either
the old config or the new config. You can get that with an explicit
fdatasync, or with a transactional database (which may choose to sync
only infrequently, but will then be able to roll back to the old state
if the new one wasn't written completely).
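
Something like this untested sketch of the rename dance would do (the
atomic_write helper is made up for illustration and assumes POSIX
semantics, where rename() atomically replaces the target):

import os

def atomic_write(path, data):
    # Write the new contents to a temporary file, force the data to
    # disk with fdatasync, and only then rename it over the old file.
    # After a crash you see either the complete old file or the
    # complete new one, never a truncated mixture.
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fdatasync(fd)   # new data is on disk before the rename
    finally:
        os.close(fd)
    os.rename(tmp, path)   # atomically replace the old file

atomic_write("app.conf", b"option = value\n")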

But yes, I agree, it's the application's responsibility to sync
properly. If I had to place sync calls into the standard library, they
would go into dumbdbm.

I somewhat disagree that this is entirely the application's fault, and
not the operating system's/file system's fault. Ideally, there would be
a way to specify transaction brackets for file operations, so that the
system knows it must not flush the unlink of the old file before it has
flushed the data of the new file. This would still allow the system to
schedule IO fairly freely, but would also guarantee that not everything
is lost in a crash. I thought that the data=ordered ext3 mount option
was going in that direction - I'm not sure what happened to it in ext4.

Regards,
Martin

