shutil.copyfile is incomplete (truncated)

Nobody nobody at nowhere.com
Fri Apr 12 22:33:29 EDT 2013


On Fri, 12 Apr 2013 00:06:21 +0000, Steven D'Aprano wrote:

>> The close method is defined and flushing and closing a file, so it
>> should not return until that's done.
> 
> But note that "done" in this case means "the file system thinks it is 
> done", not *actually* done. Hard drives, especially the cheaper ones, 
> lie. They can say the file is written when in fact the data is still in 
> the hard drive's internal cache and not written to the disk platter. 
> Also, in my experience, hardware RAID controllers will eat your data, and 
> then your brains when you try to diagnose the problem.

None of which is likely to be relevant here, as any subsequent access to
the file will reference the in-memory copy; the disk will only get
involved if the data has already been flushed from the OS' cache and has
to be read back in from disk.

write(), close(), etc return once the data has been written to the
OS' disk cache. At that point, the OS usually won't have even started
sending the data to the drive, let alone waited for the drive to report
(or claim) that the data has been written to the physical disk.

If you want to wait for the data written to be written to the physical
disk (in order to obtain specific behaviour with respect to an unclean
shutdown), use f.flush() followed by os.fsync(f.fileno()).

But most of the time, there's no point. If you actually care about what
happens in the event of an unclean shutdown, you typically also need to
sync the directory, otherwise the file's contents will get sync'd but the
file's very existence might not be.




More information about the Python-list mailing list