Automatic access serialization

Graham Dumpleton grahamd at dscpl.com.au
Sat Jan 31 04:50:18 EST 2004


gagenellina at softlab.com.ar (Gabriel Genellina) wrote in message news:<946d2ebc.0401301753.5bbe87c5 at posting.google.com>...
> Hello!
> 
> I have a number of objects that are not thread-safe - no locking
> mechanism was originally provided.
> But I want to use them in a thread-safe way. I don't want to analyze
> each class details, looking for where and when I must protect the
> code. So I wrote this sort-of-brute-force locking mechanism using
> threading.RLock() around __getattr__ / __setattr__ / __delattr__.
> The idea is that *any* attribute and method access is protected by the
> lock. The calling thread may acquire the lock multiple times, but
> other threads must wait until it is released.
> This may be too restrictive, and a bottleneck in performance,
> since *all* accesses are effectively serialized - but it's the best I
> could imagine without deeply analyzing the actual code.
> I *think* this works well, but since I'm not an expert in these
> things, any comments are welcome. Especially what could go wrong, if
> someone could anticipate it.
> Note: As written, this only works for old-style classes. I think that
> just by adding a similar __getattribute__ would suffice for new-style
> classes too - is that true?

A few problems that I can see. First is that this only protects access to
the reference to the data in question. Imagine that the data value
is a dictionary. If I understand it correctly, two different threads
could still get a reference to the same dictionary and then independently
start modifying the dictionary at the same time. This is because you are
then dealing with the dictionary directly and your locks aren't involved
at all.
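A minimal sketch of that first problem (the class and method names here are
made up for illustration): even though the attribute lookup itself is
serialised, the caller walks away with a raw reference to the dictionary and
every later mutation happens with no lock held at all.

```python
import threading

class Guarded:
    # Locking inside the accessor only serialises *handing out* the
    # reference, not anything the caller later does with it.
    def __init__(self):
        self._lock = threading.RLock()
        self._data = {'items': []}

    def get_data(self):
        self._lock.acquire()
        try:
            return self._data   # caller now holds an unprotected reference
        finally:
            self._lock.release()

obj = Guarded()

def worker():
    # Both threads mutate the very same list; the lock is not involved
    # in any of these appends.
    for i in range(1000):
        obj.get_data()['items'].append(i)

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(obj.get_data()['items']))
```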

Second is that it isn't always sufficient to lock at the level of individual
attributes; instead it is sometimes necessary to lock on a group of attributes
or even the whole object. This may be because a thread has to update two
attributes at the same time and know that another thread will not change
one of them while it is modifying the other.
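For instance (again with hypothetical names), a transfer that debits one
attribute and credits another must hold a single lock across both updates,
otherwise a reader can observe the intermediate state where the money has
left one attribute but not yet arrived in the other:

```python
import threading

class Account:
    def __init__(self):
        self._lock = threading.RLock()
        self.checking = 100
        self.savings = 0

    def transfer(self, amount):
        # Both attribute updates happen under one lock, so no other
        # thread can see checking debited but savings not yet credited.
        self._lock.acquire()
        try:
            self.checking -= amount
            self.savings += amount
        finally:
            self._lock.release()

    def total(self):
        # The invariant checking + savings == 100 only holds for a
        # reader that takes the same lock around the combined read.
        self._lock.acquire()
        try:
            return self.checking + self.savings
        finally:
            self._lock.release()

acct = Account()
acct.transfer(30)
print(acct.total())  # invariant preserved: 100
```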

One hack I have used is to always ensure that access to an object is
through its functional interface and then to lock at the level of
function calls. That is, provide additional member functions called
lockObject and unlockObject which acquire and release a thread mutex.
When using the class you then have:

  object.lockObject()
  object.someFunc1()
  object.someFunc2()
  object.etc()
  object.unlockObject()
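One way to implement those two member functions (a sketch; the original
doesn't show their bodies, and the Lockable/Store names are invented here) is
to wrap a threading.RLock, so the same thread may nest acquisitions: the lock
held across the group of calls doesn't deadlock against the lock each call
takes internally.

```python
import threading

class Lockable:
    # Mix-in providing lockObject/unlockObject around a reentrant lock.
    def __init__(self):
        self._mutex = threading.RLock()

    def lockObject(self):
        self._mutex.acquire()

    def unlockObject(self):
        self._mutex.release()

class Store(Lockable):
    def __init__(self):
        Lockable.__init__(self)
        self._items = []

    def add(self, item):
        # Each call locks internally; because the mutex is an RLock,
        # this nests safely inside an outer lockObject().
        self.lockObject()
        try:
            self._items.append(item)
        finally:
            self.unlockObject()

    def size(self):
        self.lockObject()
        try:
            return len(self._items)
        finally:
            self.unlockObject()

store = Store()
store.lockObject()   # hold the lock across several calls
store.add('a')
store.add('b')
store.unlockObject()
print(store.size())
```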

This can be partly automated by having:

class _SynchronisedMethod:

  def __init__(self, object, name):
    self._object = object
    self._name = name

  def __getattr__(self, name):
    # Accumulate dotted names so that nested access such as
    # sync_object.member.someFunc() is also wrapped.
    return _SynchronisedMethod(self._object, "%s.%s" % (self._name, name))

  def __call__(self, *args, **kwargs):
    self._object.lockObject()
    try:
      # Resolve the (possibly dotted) name one component at a
      # time; getattr() itself doesn't understand "a.b".
      method = self._object
      for part in self._name.split('.'):
        method = getattr(method, part)
      return method(*args, **kwargs)
    finally:
      # try/finally guarantees the unlock on both the normal
      # and the exception path.
      self._object.unlockObject()

class _SynchronisedObject:

  def __init__(self, object):
    self._object = object

  def __getattr__(self, name):
    return _SynchronisedMethod(self._object, name)

You can then say:

  sync_object = _SynchronisedObject(object)

  sync_object.someFunc1()
  sync_object.someFunc2()
  sync_object.etc()

Although this avoids the explicit calls to lock the object, it still isn't
the same thing. This is because in the first example the lock was held
across multiple function calls, whereas in the second case it is only held
for a single call. Thus something like the following could fail:

  if sync_object.dataExists():
    sync_object.getData()

This is because something could have taken the data between the time
the check for data was made and when it was obtained.

Thus automating it like this isn't foolproof either. The closest you will
probably get to a quick solution is to provide the lock/unlock functions
on the object and change any code to use them around any section of
code during which you don't want another thread operating on the object
at the same time. You may even have to hold locks on multiple objects
at the same time to ensure that things don't get stuffed up.
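If you do hold locks on multiple objects, acquire them in a consistent
global order, or two threads locking the same pair in opposite orders can
deadlock. A sketch of one common convention, ordering by id() (the helper
names and Resource class are invented for illustration):

```python
import threading

class Resource:
    def __init__(self, name):
        self.name = name
        self._mutex = threading.RLock()

    def lockObject(self):
        self._mutex.acquire()

    def unlockObject(self):
        self._mutex.release()

def lock_all(objects):
    # Acquire every lock in a fixed global order (here, by id) so two
    # threads locking the same set can never grab them in opposite
    # orders and deadlock.
    ordered = sorted(objects, key=id)
    for obj in ordered:
        obj.lockObject()
    return ordered

def unlock_all(ordered):
    # Release in reverse order of acquisition.
    for obj in reversed(ordered):
        obj.unlockObject()

a, b = Resource('a'), Resource('b')
held = lock_all([a, b])
# ... update both objects atomically with respect to other threads ...
unlock_all(held)
```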

Thus it isn't necessarily a simple task to add threading support after
the fact. You will probably have no choice but to analyse how everything
is used and work out how to correctly implement locking, or even change
the functional interface a bit so that functions are atomic in themselves.
I.e., rather than having to check whether data exists before getting it,
you just try to get it and either rely on an exception or have the
thread wait until the data is there.
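The standard library's Queue class already offers both of those shapes:
get() blocks until data arrives, and get_nowait() raises Empty, so the
check-then-get pair collapses into one atomic call. (The module is spelled
`queue` in current Pythons and `Queue` in older ones; the helper names below
are made up.)

```python
from queue import Queue, Empty  # spelled "Queue" in older Pythons

q = Queue()

def consume_or_wait():
    # get() blocks until an item is available: the existence check and
    # the fetch are a single atomic operation, so no other thread can
    # take the data in between.
    return q.get()

def consume_or_fail():
    # Non-blocking variant: rely on an exception instead of a separate
    # dataExists()-style check.
    try:
        return q.get_nowait()
    except Empty:
        return None

q.put('payload')
print(consume_or_fail())  # the item that was queued
print(consume_or_fail())  # None - queue already drained
```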
