[Persistence-sig] join the SIG

Ernesto Revilla aerd@retemail.es
Fri, 16 Aug 2002 02:25:51 +0200


Dear all,

My name is Ernesto Revilla (Spain) and I'm also very interested, because we
are designing a new ERP system (for small and medium-sized firms) which will
have to use a lot of business objects.

I would be grateful if anyone could summarize the results so far.

I'm currently working on a new general-purpose persistence framework,
because the one included in Webware, named MiddleKit
(http://webware.sourceforge.net/Webware-0.7/MiddleKit/Docs/index.html), does
not provide transactions, and there are no frameworks for Python comparable
to Java Data Objects
(http://java.sun.com/aboutJava/communityprocess/review/jsr012/JDO_0_8.pdf).

I started on this topic about 6 months ago but do not have much experience.
After studying the persistence layer white paper written by Scott Ambler
(http://www.ambysoft.com/persistenceLayer.pdf) I looked through some
implementations for Java like Castor (http://www.castor.org) and
persistence-layer (http://player.sourceforge.net).

I propose dividing the SIG's purpose into levels. I would like to see a
minimal level 1 specification, especially of the API, because agreeing on
anything larger might be difficult. I would drop the savepoint discussion
but perhaps allow nested transactions inside the persistence layer (not the
persistence mechanism). In any case, I would try to keep things very simple,
so we could get an initial level 1 implementation soon (by the end of the
year). I'm willing to spend something like 15-20 hours a week on this
(depending on whether the proposed solution goes in the same direction as
what our company needs for the new project).

I thought of something like this:
There are one or more class mapping files which specify which classes there
are, which attributes they have, the types of those attributes, and which
persistence mechanism each should map to. Although the map file could
specify just one default persistence mechanism, the classes and the
attributes can override it. For each persistence mechanism there could be
additional information, e.g. a relational database would specify connection
info, table names, field names, primary keys and foreign keys, while a file
storage would use 'directory' and 'filename'. The supplied information
should be specific to each type of persistence mechanism (relational
databases, files, BSD-style stores, and memory-only storage
(http://www.prevayler.org/)).

In MiddleKit, after specifying a class map, a batch command generates code
for the default persistent classes of the class map; a user can then
override them by inheriting from these generated classes. This is because
the mapping file only specifies the data attributes, not the code. Maybe the
generation step isn't necessary if we use metaclasses. Perhaps it could be
the other way round: just read the class definitions and, when storing, look
up the class map.
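
As a rough sketch of the metaclass idea (not MiddleKit's actual behaviour),
classes could be built from the class map at load time with the three-argument
form of type(), so no code-generation step is needed. The CLASS_MAP dict,
typed_property and build_classes below are hypothetical names for
illustration, and the syntax is present-day Python:

```python
# Hypothetical class map: class name -> {attribute name: expected type}.
# The real map format is deliberately left open in this proposal.
CLASS_MAP = {
    "Customer": {"name": str},
    "Invoice": {"reference": str, "amount": float},
}


class PersistentObject:
    """Assumed common base class for all generated classes."""


def typed_property(name, expected_type):
    """Return a property whose setter type-checks with isinstance."""
    slot = "_" + name

    def getter(self):
        return getattr(self, slot, None)

    def setter(self, value):
        if not isinstance(value, expected_type):
            raise TypeError(
                "%s must be %s, got %s"
                % (name, expected_type.__name__, type(value).__name__)
            )
        setattr(self, slot, value)

    return property(getter, setter)


def build_classes(class_map, base=PersistentObject):
    """Create one class per class-map entry at run time."""
    classes = {}
    for class_name, attributes in class_map.items():
        namespace = {
            attr: typed_property(attr, attr_type)
            for attr, attr_type in attributes.items()
        }
        classes[class_name] = type(class_name, (base,), namespace)
    return classes


classes = build_classes(CLASS_MAP)


class Invoice(classes["Invoice"]):
    """User subclass adding business rules on top of the generated class."""

    def is_large(self):
        return self.amount > 1000.0
```

The user still subclasses the generated class to add behaviour, as in
MiddleKit, but the base class only exists in memory.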

A minimal API could be something like this:
Say the class map specifies that class Invoice has the attributes
'reference' of type string, 'customer' of type Customer, and 'lines' of type
'List of ArticleLine'.

from Persistence import PersistenceManager
pm = PersistenceManager()
# Loading the class map automatically defines the classes, with all their
# attributes as properties.
# The base 'set' method defined in the basic 'PersistentObject' class will
# do type checking and other validation.
pm.loadClassMap('/homer/erny/classmap')

# Assuming ArticleLine is also defined in the class map:
ArticleLine = pm.classes['ArticleLine']

# Just a class definition with user stuff such as business rules,
# updating attributes, and accessing related information.
class Invoice(pm.classes['Invoice']):
    def _totalAmount(self):
        amount = 0
        for l in self.lines:
            amount += l.amount
        return amount
    totalAmount = property(_totalAmount, None)

# How to work with the objects:
# All retrieves, updates, and lookups are done inside a transaction.
# This isolates the modifications from other users. Optimistic locking is
# used implicitly.
tr = pm.Transaction()
result = tr.retrieve(
    oql="SELECT i FROM Invoice i WHERE i.customer.name LIKE 'Thomson*'")
amount = 0
while result.hasMore():
    inv = result.next()
    inv.lines.append(ArticleLine(ref='BOOK1', qty=1))    # modifying attributes
    amount += inv.totalAmount

# Of course you could also set any other attributes. Note that this is all
# done in a transaction.
# Delete the last accessed invoice:
del inv    # Better inv.delete()? ('del' only unbinds the local name.)

# Adding objects:
inv = Invoice()
# We have to add it explicitly, because otherwise we would not know
# which transaction it belongs to.
tr.add(inv)
inv.customer = tr.retrieve(oid=45323)
inv.lines.append(ArticleLine(ref='TOY2', qty=3))

# After finishing everything, do:
tr.commit()
# Note that the transaction began automatically; no tr.begin() was needed.
# Transactions can have nested transactions:
tr2 = tr.Transaction()
# Metaclass information could be available like this
# (other properties are name, description, store information, etc.):
attrtype = inv._class.customer.type


=======================
* During commit, the persistence layer would check that nobody else has
changed the same objects, throwing a TransactionError if needed.
* The class map specifies which classes should use optimistic locking and
which ones pessimistic locking.
* An exception in the code or the deletion of a transaction does a rollback
automatically.
* For some classes, there should be a retry mechanism so that the object is
re-read and the changes reapplied.
* Note that nothing so far says anything about the type of storage or
whether it supports transactions or inheritance.
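
The commit-time check in the first point could look roughly like the sketch
below. It assumes a per-object version counter, which is only one possible
way to detect concurrent changes; the Store class and its layout are
hypothetical, and only TransactionError is a name taken from the list above:

```python
class TransactionError(Exception):
    """Raised when another transaction changed an object first."""


class Store:
    """Toy in-memory store (hypothetical): oid -> (version, state dict)."""

    def __init__(self):
        self.rows = {}

    def read(self, oid):
        version, state = self.rows[oid]
        return version, dict(state)  # hand out a private copy


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # oid -> version seen at first read
        self.pending = {}        # oid -> working copy, kept in memory

    def retrieve(self, oid):
        if oid not in self.pending:
            version, state = self.store.read(oid)
            self.read_versions[oid] = version
            self.pending[oid] = state
        return self.pending[oid]

    def commit(self):
        # First pass: make sure nobody committed a newer version meanwhile.
        for oid, seen in self.read_versions.items():
            current, _ = self.store.rows[oid]
            if current != seen:
                self.rollback()
                raise TransactionError("object %r changed concurrently" % oid)
        # Second pass: write the new states and bump the versions.
        for oid, state in self.pending.items():
            self.store.rows[oid] = (self.read_versions[oid] + 1, dict(state))
        self._reset()

    def rollback(self):
        # Changes only ever lived in memory, so discarding them is enough.
        self._reset()

    def _reset(self):
        self.read_versions.clear()
        self.pending.clear()
```

A real implementation would have to make the two passes atomic (for example
by locking the objects during commit); the sketch ignores that.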

=======================
Implementation hints:
* Class definitions will be created through metaclasses, with properties
created automatically and inheriting either from other persistent classes of
the class map or from a PersistentObject base class (another base class
could be specified).
* The property set function should do type checking with 'isinstance'.
* Whenever a user accesses an object, the object is read into a system-wide
cache and a 'proxy' object is returned.
* All attribute changes will be recorded in the 'proxy' object. All methods
called on a PersistentObject will be tracked, so the changes can be
reapplied if necessary. The transaction is a container of the new object
states and the actions applied.
* The method-call tracking can be implemented through metaclasses which scan
the user's persistent class at definition time and reroute the calls to a
tracking procedure which in turn calls the user method.
(Sadly, I can't override MethodType or FunctionType and tell the
interpreter to use them instead of the defaults. In 'object' terms, we need
the calls to be 'serializable'.)
* All changes are done in memory (transaction space) until the whole
transaction is committed, at which point it would start transactions in the
storages used (if supported), lock all objects used, update them and
thereafter unlock them. This would also update the cache.
* Like PostgreSQL's multi-version concurrency control, we could have
several versions of the same object in the cache. So with in-memory changes,
readers don't block writers, nor the other way round. The important thing is
that a user has a consistent image of an object, even if it is out of date.
* Objects should have backpointers to their containers, which is especially
helpful for query optimization.
* I would like to see that old object versions could be written out to
another storage system.
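
The method-call tracking hint above could be sketched as follows, written in
present-day Python syntax. TrackingMeta, the _call_log attribute and replay
are hypothetical names; the point is only that a metaclass can wrap the
user's methods at class-definition time and record each call as plain data
that can be reapplied later:

```python
import functools
import types


class TrackingMeta(type):
    """Wrap every public method so that each call is logged on the instance."""

    def __new__(mcls, name, bases, namespace):
        for attr, value in list(namespace.items()):
            # Only plain functions defined in the class body are rerouted.
            if isinstance(value, types.FunctionType) and not attr.startswith("_"):
                namespace[attr] = mcls._tracked(attr, value)
        return super().__new__(mcls, name, bases, namespace)

    @staticmethod
    def _tracked(method_name, func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            # Record the call as (name, args, kwargs) so it can be replayed.
            log = self.__dict__.setdefault("_call_log", [])
            log.append((method_name, args, kwargs))
            return func(self, *args, **kwargs)

        return wrapper


class PersistentObject(metaclass=TrackingMeta):
    """Assumed base class; subclasses inherit the tracking metaclass."""


class Counter(PersistentObject):
    def __init__(self):
        self.value = 0

    def increment(self, by=1):
        self.value += by


def replay(call_log, target):
    """Reapply recorded calls to a freshly re-read object."""
    for method_name, args, kwargs in call_log:
        getattr(target, method_name)(*args, **kwargs)
```

One known limitation of this sketch: a tracked method that calls another
tracked method logs both calls, so a replay would apply the inner one twice.
A real implementation would need to record only the outermost call.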

I hope that this is not too far off track. As said before, I would like to
see a minimal API spec covering:
* start, commit, rollback transactions (exclude nested transactions
initially?)
* retrieve, update, create and delete objects (I borrowed the Castor OQL
implementation in order to port it to Python)
* access to meta-data
* something about retry operations for very frequently updated objects,
such as global counters or total amounts per period
* loading class maps
* minimal features of a class map (the format can come later): classes with
class names, superclasses, abstract flag, etc.; attribute names and types;
and the type of relations (for example, 'embedded' for UML composition and
'linked' for UML aggregation and association, or, at a slightly lower level,
'on delete cascade', 'on delete detach', and so on)

With best regards,
Erny