[Python-Dev] wait time [was: Ext4 data loss]

Fri Mar 13 01:18:32 CET 2009

On Fri, 13 Mar 2009 08:01:27 am Jim Jewett wrote:
> On 3/12/09, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >> It is starting to look as though flush (and close?) should take an
> >> optional wait parameter, to indicate how much re-assurance you're
> >> willing to wait for.
> >
> > Unfortunately, such a thing would be unimplementable on most of
> > today's operating systems.
>
> What am I missing?
>
> import os
> _file = file
> class file(_file):  # ...
>     def flush(self, wait=0):
>         super().flush()
>         if wait < 0.25:
>             return
>         if wait < 0.5 and hasattr(os, "fdatasync"):
>             os.fdatasync(self.fileno())
>             return

[snip rest of function]

Why are you giving the user the illusion of fine control by making the 
wait parameter a continuous variable and then using it as if it were a 
discrete variable? Your example gives only four distinct behaviours 
for an (effectively) infinite range of wait values. This is bad interface 
design: it misleads people into thinking that wait=0.4 is 33% safer 
than wait=0.3 when in fact they are exactly the same.

So, replace the wait parameter with a discrete variable -- named or 
numeric constants. That's a little better, but I still don't think this 
is the right solution. I believe we should leave the foundations 
as they are now, or at least not rush into changing them.
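
For what it's worth, a minimal sketch of the discrete version might look
like the following. The names Durability, FLUSH_ONLY, DATA_SYNC and
FULL_SYNC and the stand-alone flush() helper are invented here for
illustration, and mapping the strongest level to os.fsync is an
assumption, not something taken from the snipped code:

```python
import enum
import os
import tempfile


class Durability(enum.Enum):
    """Hypothetical discrete levels; the names are illustrative only."""
    FLUSH_ONLY = "flush"  # hand Python's buffer to the OS, nothing more
    DATA_SYNC = "data"    # also ask the OS to write the file data out
    FULL_SYNC = "full"    # also sync metadata (assumed meaning)


def flush(f, level=Durability.FLUSH_ONLY):
    """Flush f, then sync as far as the requested level asks for."""
    if not isinstance(level, Durability):
        raise TypeError("level must be a Durability member")
    f.flush()
    if level is Durability.FLUSH_ONLY:
        return
    if level is Durability.DATA_SYNC and hasattr(os, "fdatasync"):
        os.fdatasync(f.fileno())
        return
    # FULL_SYNC, or DATA_SYNC on a platform without fdatasync.
    os.fsync(f.fileno())


# Usage: anything that is not a Durability member is rejected outright,
# so there is no wait=0.3 versus wait=0.4 to wonder about.
with tempfile.TemporaryFile("w+") as f:
    f.write("some data")
    flush(f, Durability.DATA_SYNC)
```

Each level names exactly one behaviour, so the interface no longer
promises more granularity than it delivers.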

A better approach in my opinion is to leave file as-is (although I 
wouldn't object much to it growing a sync method, for convenience) and 
then provide subclasses with the desired behaviour. That scales much 
better: today we can think of three or four levels of "save 
reliability" (corresponding to your 0.25, 0.5, 0.7 and 1 values for 
wait) but next year we might think of six, or ten. Instead of 
overloading the file type with all these different sorts of behaviour, 
requiring who knows how many arguments and a complicated API, we leave 
file nice and simple and allow the application developer to choose the 
subclass she wants:

from filetools import SyncOnWrite as open
f = open('mydata.txt', 'w')
f.write(data)

The choice of which subclass gets used is up to the application, but 
naturally that might be specified by a user-configurable setting.
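
To make that a little more concrete, here is a rough sketch of what such
a subclass could look like on top of io.FileIO. There is no filetools
module; SyncOnWrite, FILE_CLASSES and open_for_writing are made-up
names, and the sketch handles raw bytes only to keep it short:

```python
import io
import os
import tempfile


class SyncOnWrite(io.FileIO):
    """Unbuffered binary file that calls os.fsync() after every write."""

    def write(self, data):
        written = super().write(data)  # number of bytes actually written
        os.fsync(self.fileno())        # ask the OS to push them to disk
        return written


# Hypothetical mapping from a user-configurable setting to a file class.
FILE_CLASSES = {"plain": io.FileIO, "sync": SyncOnWrite}


def open_for_writing(path, policy="plain"):
    """Open path for writing with the class selected by policy."""
    return FILE_CLASSES[policy](path, "w")


# Usage: the application picks the policy, the writing code stays the same.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mydata.bin")
    with open_for_writing(path, policy="sync") as f:
        f.write(b"important data")
```

Adding a fifth or sixth policy later means adding one more class and
one more dictionary entry; the base file type is left alone.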

-- 
Steven D'Aprano