[Persistence-sig] A simple Observation API

Shane Hathaway shane@zope.com
Tue, 30 Jul 2002 14:40:33 -0400 (EDT)


On Tue, 30 Jul 2002, Phillip J. Eby wrote:

> At 08:40 AM 7/30/02 -0400, Jeremy Hylton wrote:
> >>>>>> "PJE" == Phillip J Eby <pje@telecommunity.com> writes:
> >
> >  PJE> Well, the example implementation I wrote took care of all of
> >  PJE> that, quite elegantly I thought.  But for my purposes, it's
> >  PJE> sufficient as long as _p_changed is set after the last
> >  PJE> modification that occurs.  It's okay if it's also set after
> >  PJE> previous modifications.  It just must be set after the last
> >  PJE> modification, regardless of how many other times it's set.
> >
> >  PJE> This requirement on my part has strictly to do with data
> >  PJE> managers that write to other data managers, in the context of
> >  PJE> the transaction API I proposed.
> >
> >Can you explain how _p_changed is used outside of transaction control?
> >I still don't understand how the timing of _p_changed affects things.
> >
>
> This has to do with the "write-through mode" phase between
> "prepareToCommit()" and "voteOnCommit()" messages (whatever you call them).
>  During this phase, to support cascaded storage (one data manager writes to
> another), all data managers must "write through" any changes that occur
> *immediately*.  They can't wait for "prepareToCommit()", because they've
> already received it.  Basically, when the object says, "I've changed"
> (i.e. via "register" or "notify" or whatever you call it), the data manager
> must write it out right then.

I'm having trouble understanding this.  Is prepareToCommit() the first
phase, and voteOnCommit() the second phase?  Can't the data manager commit
the data on the second phase?

> But, if the _p_changed flag is set *before* the change, the data manager
> has no way to know what the change was and write it.  It can't wait for
> "voteOnCommit()", because then the DM it writes to might have already
> voted, for example.  It *must* know about the change as soon as the change
> has occurred.  Thus, the change message must *follow* a change.  It's okay
> if there are multiple change messages, as long as there's at least one
> *after* a set of changes.

For ZODB 3 I've realized that sometimes application code needs to set
_p_changed *before* making a change.  Here is an example of potentially
broken code:

def addDate(self, date):
    self.dates.append(date)  # self.dates is a simple list
    self.dates.sort()
    self._p_changed = 1

Let's say self.dates.sort() raises some exception that leads to an aborted
transaction.  Objects are supposed to be reverted on transaction abort,
but that won't happen here!  The connection was never notified that there
were changes, so self.dates keeps the appended date and is now out of sync
with the database.  But if the application sets _p_changed just *before*
the change, the connection already knows about the object and aborting
will revert it.
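
Here is a rough sketch of the difference, using a stand-in for the
connection so the abort behaviour is visible.  FakeConnection, DateBook,
mark_changed() and try_add() are hypothetical names made up for this
illustration, not the ZODB API; mark_changed() only plays the role of
"self._p_changed = 1".

```python
import copy


class FakeConnection:
    """Hypothetical stand-in for a connection; not the real ZODB API."""

    def __init__(self):
        self.stored = {}      # id(obj) -> last committed state
        self.registered = []  # objects that reported a change

    def add(self, obj):
        obj._conn = self
        self.stored[id(obj)] = copy.deepcopy(obj.get_state())

    def register(self, obj):
        if obj not in self.registered:
            self.registered.append(obj)

    def abort(self):
        # Only objects that announced a change get reverted.
        for obj in self.registered:
            obj.set_state(copy.deepcopy(self.stored[id(obj)]))
        self.registered = []


class DateBook:
    def __init__(self):
        self.dates = []
        self._conn = None

    def get_state(self):
        return {"dates": self.dates}

    def set_state(self, state):
        self.dates = state["dates"]

    def mark_changed(self):
        # Plays the role of `self._p_changed = 1`.
        self._conn.register(self)

    def add_date_late(self, date):
        # Flag set after the change: never reached if sort() raises.
        self.dates.append(date)
        self.dates.sort()
        self.mark_changed()

    def add_date_early(self, date):
        # Flag set before the change: the connection already knows.
        self.mark_changed()
        self.dates.append(date)
        self.dates.sort()


def try_add(method_name):
    conn = FakeConnection()
    book = DateBook()
    book.dates = [1, 2]
    conn.add(book)
    try:
        # Mixing str and int makes sort() raise TypeError.
        getattr(book, method_name)("not a date")
    except TypeError:
        conn.abort()
    return book.dates
```

With add_date_late the bad entry is still in the list after the abort;
with add_date_early the list goes back to [1, 2].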

> Now, you may say that there are other ways to address dependencies between
> participants than having "write-through mode" during the prepare->vote
> phase.  And you're right.  ZPatterns certainly manages to work around this,
> as does Steve Alexander's TransactionAgents.  TransactionAgents, however,
> is actually a partial rewrite of the Zope transaction machinery, and there
> are some holes in how ZPatterns addresses the issue as well.  (ZPatterns
> addresses it by adding more objects to the transaction during the
> "commit()" calls to the data managers, that are roughly equivalent to the
> current "prepare()" message concept.)
>
> We could address this by having transaction participants declare their
> dependencies to other participants, and have the transaction do a
> topological sort, and send all messages in dependency order.  It could then
> be an error to have a circular dependency, and data managers could raise an
> error if they received an object change message once they were done with
> the prepare() call.  It would make the Transaction API and implementation a
> bit more complex, leave data managers about the same in complexity as they
> would have been before, and it would mean that persistent objects wouldn't
> need to worry about whether _p_changed was flagged before or after a change.

Are you alluding to "indexing agents" and "rule agents" like we talked
about before?  I think we do need some kind of transaction participant
ordering to support those concepts.  I had in mind a simple numerical
prioritization scheme.  Is the need complex enough to require topological
sorting?
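
To make the comparison concrete, here is a sketch of both orderings.  The
participant names, the priority numbers and the dependency table are all
made up for illustration, and graphlib is just Python's standard library
module (3.9+), not part of any transaction machinery.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical participants with hand-assigned priorities.
priorities = {"storage": 30, "rule_agent": 10, "indexing_agent": 20}

# Hypothetical dependency table: each participant maps to the set of
# participants that must receive their messages before it does.
depends_on = {
    "indexing_agent": {"rule_agent"},
    "storage": {"indexing_agent", "rule_agent"},
    "rule_agent": set(),
}


def order_by_priority(priorities):
    # Lower number goes first; ties are broken by name.
    return sorted(priorities, key=lambda name: (priorities[name], name))


def order_by_dependency(depends_on):
    # Raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(depends_on).static_order())
```

The numeric scheme gives the right order only as long as whoever assigns
the numbers keeps them consistent with the real dependencies.  The
topological sort derives the order from the declarations and reports a
circular dependency as an error.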

Shane