Databases: Which one's right for me?

Mon Jan 12 13:19:09 EST 2004

[Aaron Watters]
> re: jeremy's reply...
>
> I find the Gray and Reuter definition confusing.  To my mind
> isolation means serializability and if you don't have serializability
> (at least as an option) you don't have isolation.

Gray and Reuter's "It appears that the system runs one transaction at a
time" can be read as implying serializability.  Or not.  Informal English
isn't good for this.

> Transactions don't have to be complex to cause serious problems in
> read-committed mode:  the classic example is a debit of $1000 for
> checking account 123 running at the same time as a credit for $4000
> for checking account 123.  If both transactions run at the same time
> and read the same previous (committed) balance and they both
> complete but the first one completes last then the customer is
> screwed to the tune of $4000.

This one isn't a problem in ZODB.  The second transaction that tries to
commit will fail, with a ConflictError exception.  It's up to the
application then to either abort the failed transaction, or retry it.
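
A minimal sketch of that abort-or-retry pattern, written as a toy simulation
rather than against ZODB itself.  ToyStore, apply_delta, and the starting
balance of 5000 are all hypothetical, and the local ConflictError class only
stands in for ZODB's exception of the same name:

```python
class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError; defined locally for this sketch."""


class ToyStore:
    """Hypothetical one-value store with optimistic concurrency control."""

    def __init__(self, balance):
        self.balance = balance
        self.serial = 0  # bumped on every successful commit

    def begin(self):
        # A transaction remembers what it read and which revision it saw.
        return self.balance, self.serial

    def commit(self, new_balance, serial_seen):
        # Reject the write if someone else committed since we read.
        if serial_seen != self.serial:
            raise ConflictError("balance changed since it was read")
        self.balance = new_balance
        self.serial += 1


def apply_delta(store, delta, max_tries=3):
    """Redo the whole transaction from a fresh read when it loses a race."""
    for _ in range(max_tries):
        balance, serial = store.begin()
        try:
            store.commit(balance + delta, serial)
            return
        except ConflictError:
            continue  # re-read the committed state and try again
    raise ConflictError("gave up after %d tries" % max_tries)


store = ToyStore(5000)  # assumed starting balance

# Both transactions read the same committed balance.
debit_read = store.begin()
credit_read = store.begin()

store.commit(credit_read[0] + 4000, credit_read[1])  # credit commits first

debit_failed = False
try:
    store.commit(debit_read[0] - 1000, debit_read[1])  # debit is now stale
except ConflictError:
    debit_failed = True
    apply_delta(store, -1000)  # retry against the fresh balance
```

The stale debit is refused instead of silently overwriting the credit, and
the retry computes its result from the balance that is actually committed.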

If this is a commonly expected kind of conflict, it's also possible for the
object implementing the notion of "checking account" to provide a conflict
resolution method.  Then instead of raising ConflictError, ZODB will call
that method with 3 things:  the object state as of the time the transaction
began, the object state the transaction is trying to commit, and the object
state currently committed.  If the method believes it can compute a correct
new state from those three, it can return that, and that new state will be
committed instead.  Or it can give up, letting the ConflictError occur.

In this case, a suitable conflict resolution method could compute the delta
between the balance as of the time the transaction began, and the balance
currently committed, then add that delta to the balance it was trying to
commit, and return the result as the balance it "really wants" to commit.
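
As a rough sketch of that arithmetic: ZODB's hook for this is a method named
_p_resolveConflict, which receives the three states in the order old, saved
(currently committed), new (being committed).  The CheckingAccount class and
its "balance" attribute are made up for illustration, the states are assumed
to be plain dicts of instance attributes, and the class below is an ordinary
Python class so the example runs without ZODB installed (a real one would
derive from persistent.Persistent):

```python
class CheckingAccount:
    """Hypothetical account object; not a real Persistent subclass here."""

    def __init__(self, balance=0):
        self.balance = balance

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # What the competing, already-committed transaction did to the balance.
        delta = saved_state["balance"] - old_state["balance"]
        # Apply that change on top of the balance we were trying to commit.
        resolved = dict(new_state)
        resolved["balance"] = new_state["balance"] + delta
        return resolved


# The debit/credit example, with an assumed starting balance of 5000.
old = {"balance": 5000}    # state when the debit transaction began
saved = {"balance": 9000}  # committed meanwhile by the $4000 credit
new = {"balance": 4000}    # what the $1000 debit is trying to commit

resolved = CheckingAccount()._p_resolveConflict(old, saved, new)
```

Here the resolved balance reflects both the credit and the debit, which is
what running the two transactions one after the other would have produced.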

> This is only the simplest problem -- for transactions involving
> complex data structures the problems can be much more subtle than
> that (and more difficult to correct).

ZODB's BTrees are probably a good example of that.  They resolve some
conflicts on their own (for example, two transactions add distinct keys to
the same bucket -- then their conflict resolution method returns a bucket
with both keys), but punt on others (for example, the same case as before,
except that the bucket splits -- then changes to the bucket's parent node are
also involved, and bucket conflict resolution gives up, letting the
ConflictError propagate).
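
To make the "resolve some, punt on others" idea concrete, here is a toy
three-way merge over plain dicts.  It is not the BTrees implementation:
merge_bucket, the max_size parameter (a crude stand-in for "this insertion
would split the bucket"), and the local ConflictError class are all
assumptions made for illustration:

```python
class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError; defined locally for this sketch."""


def merge_bucket(old, saved, new, max_size=4):
    """Merge two sets of insertions into a bucket, or give up.

    old   -- bucket contents when the transaction began
    saved -- contents currently committed by the other transaction
    new   -- contents this transaction is trying to commit
    """
    # Only pure insertions are handled: every original key must survive
    # unchanged on both sides.
    for key, value in old.items():
        if saved.get(key, object()) != value or new.get(key, object()) != value:
            raise ConflictError("existing key %r was changed or removed" % key)

    added_saved = set(saved) - set(old)
    added_new = set(new) - set(old)
    if added_saved & added_new:
        raise ConflictError("both transactions added the same key")

    merged = dict(saved)
    for key in added_new:
        merged[key] = new[key]

    # Toy version of "the bucket splits": the parent node would have to
    # change too, so bucket-level resolution punts.
    if len(merged) > max_size:
        raise ConflictError("merged bucket would have to split")
    return merged


# Two transactions add distinct keys to the same bucket: resolvable.
merged = merge_bucket({"a": 1}, {"a": 1, "b": 2}, {"a": 1, "c": 3})
```

Calling the same function with max_size=2 raises ConflictError instead,
mirroring the case where resolution gives up and the error propagates.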

I'm not claiming there's no case in which ZODB can yield an "incorrect"
database state.  For example, it *may* be that when two transactions add
distinct new keys to a BTree, one of the transactions did so only because it
didn't see the other key in the BTree, and then a final BTree state
containing both keys would be incorrect.  What the BTree conflict resolution
code does is correct for all uses of BTrees made by Zope, though.

BTW, there's nothing in ZODB that *requires* you to run concurrent
transactions.  If you need them to act always and in all conceivable
respects as if one were run at a time, then write your app to do only one at
a time.  You'll never get a ConflictError then, either.  Most people seem
much happier getting the benefits of true concurrency and dealing with
ConflictErrors.