[Python-Dev] pre-PEP: The Safe Buffer Interface

Mon, 29 Jul 2002 23:10:26 -0700 (PDT)

--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> Guido:
> 
> > I don't like where this is going.  Let's not add locking to the buffer
> > protocol.
> 
> Do you still object to it even in the form I proposed in
> my last message? (I.e. no separate "lock" call, locking
> is implicit in the getxxxbuffer calls.)
> 
> It does make the protocol slightly more complicated to
> use (must remember to make a release call when you're
> finished with the pointer) but it seems like a good
> tradeoff to me for the flexibility gained.
> 

I realize this wasn't addressed to me, and that I said I would butt out
when you were in favor of canning the proposal altogether, but I won't let
that get in the way.  :-)

We haven't yet seen a reasonably thorough use case where the locking
behavior is beneficial.  While I appreciate and agree with the intent of
trying to get a more flexible object, I think there is at least one problem
buried a little deeper than you and Neil are looking.

I'm concerned that this is very much like the segment count features of the
current PyBufferProcs.  It was apparently designed for more generality, and
while no one uses it, everyone has to check that the segment count is one
or raise an exception.  If there is no realizable benefit to the
acquire/release semantics of the new interface, then this is just extra
burden too.  Let's find a realizable benefit before we muck up Thomas's good
simple proposal with this stuff.

In the current Python core, I can think of the following objects that would
need a retrofit to this new interface (there may be more):

    string
    unicode
    mmap
    array

The string, unicode, and mmap objects do not resize or reallocate by
design.  So for them the extra acquire/release requirements are burden with
no benefit.

The array object does resize (via the extend method among others).  So let's
say that an array object gets passed to an extension that locks the buffer
and grabs the pointer.  The extension releases the GIL so that another
thread can work on the array object.  Another thread comes in and wants to
do a resize (via the extend method).  (We don't need to introduce threads
for this since the asynchronous I/O case is just the same.)

If extend() is called while thread 1 has the array locked, it can:

   A) raise an exception or return an error
   B) block until the lock count returns to zero
   C) ???
   .)
   .)

Case A is troublesome because depending on thread scheduling/disk
performance, you will or won't get the exception.  So you've got a weird
race condition where an operation might have been valid if it had only
executed a split second later, but due to misfortune it raised an
exception.  I think this non-determinism is ugly at the very least. 
However since it's recoverable, you could try again (polling), or ignore
the request completely (odd behavior).  I think this is what both you and
Neil are proposing, and I don't see how this is terribly useful.
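
To make case A concrete, here is a minimal sketch in Python.  The
PinnedArray class and its acquire/release/extend methods are made up for
illustration; they are an assumption about what "locking implicit in the
get-buffer call" might look like, not anyone's actual proposal.

```python
class PinnedArray:
    """Hypothetical resizable buffer that counts outstanding acquires."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._pins = 0  # number of acquire() calls not yet released

    def acquire(self):
        # Implicit lock: handing out the buffer pins it in place.
        self._pins += 1
        return self._data

    def release(self):
        if self._pins == 0:
            raise RuntimeError("release() without matching acquire()")
        self._pins -= 1

    def extend(self, more):
        # Case A: refuse to resize while anyone still holds the buffer.
        if self._pins:
            raise RuntimeError("buffer is locked; cannot resize")
        self._data.extend(more)

    def __len__(self):
        return len(self._data)


def try_extend(arr, more):
    """Return True if the resize went through, False if it was refused."""
    try:
        arr.extend(more)
        return True
    except RuntimeError:
        return False


arr = PinnedArray(b"abc")
arr.acquire()                                     # extension grabs the buffer
refused_while_pinned = try_extend(arr, b"def")    # resize arrives mid-I/O
arr.release()                                     # I/O finishes
accepted_after_release = try_extend(arr, b"def")  # same call, a moment later
```

The same extend() call fails or succeeds depending only on whether it lands
before or after the release, which is the non-determinism I'm complaining
about.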

I don't think B is the strategy anyone is proposing, but if it were, you
would have two blocking objects in effect (the GIL and whatever the array
uses to implement blocking).  If we're not extremely careful, we can get
deadlock here.
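
A rough sketch of how case B could deadlock, again in Python.  Both locks
here are ordinary threading.Lock objects standing in for the GIL and for a
hypothetical per-array lock; the sequence is simulated in one thread with
short timeouts so that the example terminates instead of hanging.

```python
import threading

gil = threading.Lock()          # stand-in for the interpreter lock
buffer_lock = threading.Lock()  # stand-in for the array's own blocking lock


def blocked(lock, timeout=0.05):
    """Return True if the lock cannot be taken within the timeout."""
    got = lock.acquire(timeout=timeout)
    if got:
        lock.release()
    return not got


# Extension: pins the buffer, then drops the GIL to do its I/O.
buffer_lock.acquire()

# Resizer: takes the GIL, then blocks inside extend() on the buffer lock
# without giving the GIL back.
gil.acquire()
resizer_stuck = blocked(buffer_lock)

# Extension: I/O is done, but it needs the GIL back before it can release
# the buffer, and the resizer is still holding it.
extension_stuck = blocked(gil)

# Clean up so the sketch leaves both locks free.
gil.release()
buffer_lock.release()
```

Each side is waiting on a lock the other holds, so neither can make
progress unless the blocking resize is written to give up the GIL first.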

I'm still looking for any good examples that fall into cases C and beyond. 
Neil offered a third example that might fit.  He says that he could buffer
the user event that led to the resize operation.  If that is his strategy,
I'd like to see it explained further.  It sounds like taking the event and
not processing it until the asynchronous I/O operation has completed.  At
which point I wonder what using asynchronous I/O achieved since the resize
operation had to wait synchronously for the I/O to complete.  This also
sounds suspiciously like blocking the resize thread, but I won't argue that
point.
