Minimalistic Software Transactional Memory

Michael Sparks ms at cerenity.org
Sat Dec 8 19:52:44 EST 2007


Fuzzyman wrote:

> Wouldn't it be more useful having a store (possibly with a mapping
> API?) that stored several values. Then a single commit can be done at
> the end of the transaction.
> 
> If the transaction fails, then the whole process can be repeated with
> none of the changes having been committed.

Quite probably yes :-)

However initially I thought I'd start off with trying to get single values
right :-) (A single value can be a dict of course, but your point holds for
arbitrary unrelated values)

I suspect the way to deal with that might be to do something like:

S = Store()
D = S.using("account_one", "account_two", "myaccount")
D["myaccount"].set(D["account_one"].value+D["account_two"].value)
D["account_one"].set(0)
D["account_two"].set(0)
D.commit()

Which seems pretty nice/simple. It'd be useful for the commit failure to
indicate which value was broken by a concurrent update, but not necessary.

Context: 

At present Kamaelia has something which tracks keys and values - primarily
services ("where's the person who manages the pygame display?"). At present
those are only really accessible by generator based components which means
that key/value context is actually accessed in a single threaded way making
services safe for non-threaded components.

However, making accessing that store safe from threads requires changing to
something more advanced, either locking, or using STM. STM seems more in
keeping with Kamaelia being generally lock-free.

The way this store is currently used is exclusively in a single key/value
lookup, which is why I was thinking of that first - though I can see to
be more generally useful allowing multi-value updates would be useful :-)

> The tricky part is that other threads asking for the value should get
> the old value until the commit has been done - so you need to know
> which thread 'value' is being asked for from.

I think this approach deals with this naturally. cf:
    def get(self, key): # in Store
        return self.store[key].clone()
...
    def clone(self): # in Value
        return Value(self.version, self.value,self.store,self.key)
...
    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1, value.value, self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

This means that the value in 2 threads for the same checked out value names
can be different, but they get warning of this when they try to commit.

eg suppose the following interleaving of commands:

S = Store()
D = S.using("account_one", "account_two", "myaccount")
D["myaccount"].set(D["account_one"].value+D["account_two"].value)
D["account_one"].set(0)
E = S.using("account_one", "myaccount")
E["myaccount"].set(E["myaccount"].value-100)
E["account_one"].set(100)
E.commit()
D["account_two"].set(0)
D.commit()

I did think that this would work with the current code but changing the
value to something more complex - like this:

S = Store()
D = S.using("accounts")
D.set({"account_one":50, "account_two":100, "myaccount":0})
D.commit()
print "here"
S.dump()
X = D.value
X["myaccount"] = X["account_one"] + X["account_two"]
X["account_one"] = 0
X["account_two"] = 0
D.set(X)
D.commit()
S.dump()
print "Committed", D.value["myaccount"]

But in practice, because I'm not doing a _copy_ here of self.value:
    def clone(self): # in Value
        return Value(self.version, self.value,self.store,self.key)

I end up with accidental concurrent access, which is easy to fix, for
moderately complex datatypes:
import copy
....
    def clone(self): # in Value
        return Value(self.version, copy.deepcopy(self.value),
                                              self.store, self.key)

As well as here:
    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1,
                                    copy.deepcopy(value.value), self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

If I do that, then the following code fails as would be expected:
S = Store()
D = S.using("accounts")
D.set({"account_one":50, "account_two":100, "myaccount":0})
D.commit() # First
S.dump()
X = D.value
X["myaccount"] = X["account_one"] + X["account_two"]
X["account_one"] = 0

E = S.using("accounts")
Y = E.value
Y["myaccount"] = Y["myaccount"]-100
Y["account_one"]= 100
E.set(Y)
E.commit() # Second
S.dump()

X["account_two"] = 0
D.set(X)
D.commit()  # Third
S.dump()
print "Committed", D.value["myaccount"]

Fails as follows:

accounts : Value(1, {'account_two': 100, 'myaccount': 0, 'account_one': 50})
accounts : Value(2, {'account_two': 100, 'myaccount': -100, 'account_one':
100})
Traceback (most recent call last):
  File "./NewSTM.py", line 70, in <module>
    D.commit()  # Third
  File "./NewSTM.py", line 20, in commit
    self.store.set(self.key, self)
  File "./NewSTM.py", line 37, in set
    raise ConcurrentUpdate
__main__.ConcurrentUpdate

(which is of course the error wanted - since we want to be able to detect
failure)

It's probably useful to know for the more general approach you suggest.

Thanks!


Michael.
--
Michael Sparks, Kamaelia Project Lead
http://kamaelia.sourceforge.net/Developers/
http://yeoldeclue.com/blog




More information about the Python-list mailing list