[Python-Dev] Ext4 data loss

Christian Heimes lists at cheimes.de
Wed Mar 11 03:45:03 CET 2009


Antoine Pitrou wrote:
> Christian Heimes <lists <at> cheimes.de> writes:
>> I agree with you, fsync() shouldn't be called by default. I didn't plan
>> on adding fsync() calls all over our code. However I like to suggest a
>> file.sync() method and a synced flag for files to make the job of
>> application developers easier.
> 
> We already have os.fsync() and os.fdatasync(). Should the sync() (and
> datasync()?) method be added as an object-oriented convenience?

It's more than an object-oriented convenience. fsync() takes a file
descriptor as its argument, so I assume it only syncs data that has
already been written to that file descriptor. [*] Data still sitting in
a user-space buffer would not be covered. In Python 2.x we are using a
FILE* based stream. In Python 3.x we have our own buffered writer class.

To get all data onto disk, the FILE* stream must be flushed before
fsync() is called:

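    PyFileObject *f;  /* an open Python 2.x file object */

    /* Push data from the stdio buffer to the kernel. */
    if (fflush(f->f_fp) != 0) {
        /* report error */
    }
    /* Ask the kernel to commit the file's data to disk. */
    if (fsync(fileno(f->f_fp)) != 0) {
        /* report error */
    }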
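
At the Python level, the same two steps (flush the buffered writer, then
fsync the descriptor) might look like the sketch below. The helper names
sync_file() and write_synced() are hypothetical and only illustrate the
idea. They are not an existing or proposed API. The sketch uses only
file.flush(), file.fileno() and os.fsync().

```python
import os


def sync_file(f):
    """Hypothetical helper: flush Python's own buffer, then fsync."""
    f.flush()             # move buffered data from Python to the OS
    os.fsync(f.fileno())  # ask the OS to commit the file's data to disk


def write_synced(path, data):
    """Hypothetical helper: write bytes to path and sync before closing."""
    with open(path, "wb") as f:
        f.write(data)
        sync_file(f)
```

Calling os.fsync() without the preceding flush() would skip whatever is
still held in the buffered writer, which is the reason a method on the
file object is more than a convenience.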


Christian

[*] Is my assumption correct, anybody?

