Flushing buffer on file copy on linux

Cameron Simpson cs at zip.com.au
Tue Aug 14 23:09:59 EDT 2012


On 14Aug2012 22:55, J <dreadpiratejeff at gmail.com> wrote:
| Now, the problem I have is that linux tends to buffer data writes to a
| device, and I want to work around that.

To what _specific_ purpose? Benchmarking? Ensuring the device can be
pulled? Ensuring another program can see the data? (The last should be
taken care of regardless of the Linux OS level buffering.)

| When run in normal non-stress
| mode, the program is slow enough that the linux buffers flush and put
| the file on disk before the hash occurs.  However, when run in stress
| mode, what I'm finding is that it appears that the files are possibly
| being hashed while still in the buffer, before being flushed to disk.

You're probably right, but how are you inferring this?

| Generate the parent data file
| hash parent
| instead of copy, open parent and write to a new file object on disk
| with a 0 size buffer
| or flush() before close()
| hash the copy.
| 
| Does that seem reasonable? or is there a cleaner way to copy a file
| from one place to another and ensure the buffers are properly flushed
| (maybe something in os or sys that forces file buffers to be flushed?)

For OS buffers you either want to call sync() (flushes _all_ OS buffers
to disc before returning) or fsync(open-file-handle), which flushes at
least the blocks for that file (in practice, typically everything
outstanding for the same filesystem, alas).

So look at os.fsync in Python; it takes the file descriptor number, not
the file object.

You will need to do this after a Python-level data flush but before file
close:

  import os

  with open("output-file", "w") as fp:
    fp.write("lots of stuff")
    fp.flush()              # Python's buffer -> OS
    os.fsync(fp.fileno())   # OS buffers -> device; note fileno() is a call
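
For the copy-then-hash case you describe, the same pattern might look
roughly like the sketch below. The names copy_and_sync and sha256_of, the
1 MiB chunk size and the choice of SHA-256 are all my own assumptions for
illustration, not anything from your program:

```python
import hashlib
import os
import shutil


def copy_and_sync(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path and ask the OS to write the copy out.

    Hypothetical helper: the name and chunk_size are illustrative only.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
        dst.flush()             # Python's buffer -> OS
        os.fsync(dst.fileno())  # OS buffers -> device, before close


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, read in chunks (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Usage would be something like copy_and_sync(parent, copy) followed by
comparing sha256_of(parent) with sha256_of(copy). Note that fsync only
covers the write side: the read done for the hash may still be served
from the OS cache rather than from the device.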

But be clear about your purpose: why do you care that the disc writes
themselves are complete? There are legitimate reasons for this, but
unless it is benchmarking or utterly mad (eg database) data integrity,
they generally aren't :-)

Cheers,
-- 
Cameron Simpson <cs at zip.com.au>

English is a living language, but simple illiteracy is no basis for
linguistic evolution.   - Dwight MacDonald


