[Persistence-sig] "Straw Baby" Persistence API

Phillip J. Eby pje@telecommunity.com
Fri, 19 Jul 2002 18:12:35 -0400


At 04:03 PM 7/19/02 -0400, Guido van Rossum wrote:
>> * I do think we should keep PersistentList and PersistentMapping in the
>> core; they're useful for almost any kind of application, and any kind of
>> back-end storage.  They don't introduce policy or data format dependencies
>> into users' code, either.
>
>But perhaps these should be rewritten to derive from dict and list
>instead of UserDict and UserList?  

Perhaps.  What are the implications for pickling?
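
One way to poke at that question, as a rough sketch in a current Python 3 rather than anything from the actual package: ask a dict subclass what it would hand to pickle. The class below is a hypothetical stand-in, not the real PersistentMapping. The point is that the mapping's contents come out as a separate stream of items, apart from the instance state, whereas a UserDict-based class keeps its contents in an ordinary "data" attribute inside the instance state.

```python
class PersistentMapping(dict):
    """Hypothetical dict-based mapping; not the real implementation."""

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._p_changed = True  # flag lives in the instance __dict__


m = PersistentMapping()
m["a"] = 1

# Protocol 2 reduce value: (callable, args, state, listitems, dictitems)
reduced = m.__reduce_ex__(2)
state = reduced[2]            # instance attributes only
items = list(reduced[4])      # the mapping contents, pickled item by item
```

When such an object is rebuilt, the items are stored back one at a time through the subclass's own __setitem__, so any change flagging done there also runs during loading.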


>Also, the module names are
>inconsistent -- PersistentMapping is defined in _persistentMapping.py
>but PersistentList is defined in PersistentList.py.  Both are then
>"pulled up" one level by __init__.py and their __module__ attribute
>modified.  I find all that hideous and tricky, and I propose to clean
>this up before making it a standard Python package.

+1


>> * Make _p_dm a synonym for _p_jar, and deprecate _p_jar.  This could be
>> done by making a _p_jar descriptor that read/wrote through to _p_dm, and
>> issued a deprecation warning.  I don't personally have a problem with
>> _p_jar, but I've heard rumblings from other people (ZC folks?) that it's
>> confusing or that they want to get rid of it.  So if we're doing it, now
>> seems like the time.
>
>It's just that "jar" makes no sense (except in the "cutesy" sense of a
>jar full of pickles).  But "dm" is a little obscure too.  Maybe write
>it out in full as _p_datamanager?

Sure, whatever.  Maybe just _p_manager.
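
For concreteness, a minimal sketch of the read/write-through descriptor idea, written as a plain Python property. The Persistent class here is a toy stand-in, and _p_manager is just one of the candidate names from this thread, not a settled spelling.

```python
import warnings


class Persistent:
    """Toy stand-in for the persistence base class (assumption)."""

    def __init__(self):
        self._p_manager = None  # new name; the final spelling is undecided

    @property
    def _p_jar(self):
        # Old name still works, but reads through to the new slot.
        warnings.warn("_p_jar is deprecated; use _p_manager",
                      DeprecationWarning, stacklevel=2)
        return self._p_manager

    @_p_jar.setter
    def _p_jar(self, value):
        # Writes go through to the new slot as well.
        warnings.warn("_p_jar is deprecated; use _p_manager",
                      DeprecationWarning, stacklevel=2)
        self._p_manager = value
```

Old code that touches obj._p_jar keeps working and sees the same object as obj._p_manager, with a DeprecationWarning on each access.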


>> * Keep the _p_atime slot, but don't fill it with anything by default.
>> Instead, have a _p_getattr_hook(persistentObj,attrName,retrievedValue) slot
>> at C level that's called after the getattribute completes.  A data manager
>> can then set the hook to point to a _p_atime update function, *or* it can
>> introduce postprocessing for "proxy" attributes.  That is, a data manager
>> could set the hook to handle "lazy" loading of certain attributes which
>> would otherwise be costly to retrieve, by placing a dummy value in the
>> object's dictionary, and then having the post-call hook return a
>> replacement value.
>> 
>> For speed, this will generally want to be a C function; let the base
>> package include a simple hook that updates _p_atime, and another which
>> checks whether the retrievedValue is an instance of a LazyValue base class,
>> and if so, calls the object.  This will probably cover the basics.  A data
>> manager that uses ZODB caching will use the atime function, and non-ZODB
>> data managers will probably want the other hook.  I also have an idea about
>> using the transaction's timestamp() plus a counter to supply a "time" value
>> that minimizes system calls, but I'm not sure it would actually improve
>> performance any, so I'm fine with not trying to push that into the initial
>> package.  As long as the hook slot is present in the base package, I or
>> anyone else are free to make up and try our own hooks to put in it.
>
>Shouldn't there be a setattr hook too?

Hm.  Seems like a YAGNI to me, unless you're saying that it's so that
_p_atime can be updated on setattr, in which case, sure, add a
_p_setattr_hook(obj,attrname,setval) that's called after successful
setattr.  Otherwise, I can't think of a use case that isn't already covered
by the objectChanged() (formerly register()) message.
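
To make the getattr hook idea concrete, here is a pure-Python sketch of what the proposal would do at C level. Everything in it is illustrative: the Persistent class is a toy, LazyValue is the hypothetical base class mentioned above, and skipping underscore-prefixed names is my own simplification.

```python
import time


class LazyValue:
    """Hypothetical placeholder; calling it produces the real value."""

    def __init__(self, loader):
        self.loader = loader

    def __call__(self):
        return self.loader()


def atime_hook(obj, name, value):
    """Record the access time, then pass the value through unchanged."""
    object.__setattr__(obj, "_p_atime", time.time())
    return value


def lazy_hook(obj, name, value):
    """Swap a LazyValue placeholder for the value it loads."""
    if isinstance(value, LazyValue):
        return value()
    return value


class Persistent:
    _p_atime = None
    _p_getattr_hook = None  # a data manager sets this per instance

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        if name.startswith("_"):
            return value  # bookkeeping and special names skip the hook
        hook = object.__getattribute__(self, "_p_getattr_hook")
        if hook is None:
            return value
        # The hook runs after the normal lookup has completed.
        return hook(self, name, value)


doc = Persistent()
doc.__dict__["body"] = LazyValue(lambda: "expensive text")
doc._p_getattr_hook = lazy_hook   # non-ZODB style: resolve placeholders
body = doc.body                   # the loader runs here, on first access
doc._p_getattr_hook = atime_hook  # ZODB style: track access times
```

A setattr hook would be the mirror image: call hook(obj, name, value) from __setattr__ after the assignment succeeds.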


>I've often thought that it's ugly that you have to set _p_state and
>_p_changed, rather than do these things with method calls.  What do
>you think about that?  Especially the conventions for _p_state look
>confusing to me.

I've never used _p_state for anything; I thought that was something purely
private/internal to the implementation.  So I'm not sure what you're
talking about, there.

For _p_changed, I don't have any objections to a method or methods, but it
seems to me that it *was* a method at one time and Jim changed it to an
attribute, so it might be good to ask him why.  :)

Of course, I've also seen people using ZODB write code like this:

self.foo = self.foo

to flag things as changed, without setting _p_changed explicitly.  On a
mental level, it has a certain appeal, because it's like saying, hey, I'm
changing *this* attribute.  :)
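
A small sketch of why that idiom works, assuming a base class whose __setattr__ sets the changed flag. Persistent and Folder here are toys for illustration, not the real classes.

```python
class Persistent:
    _p_changed = False

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if not name.startswith("_p_"):
            # Any ordinary attribute assignment marks the object changed.
            object.__setattr__(self, "_p_changed", True)


class Folder(Persistent):
    def __init__(self):
        self.items = []
        self._p_changed = False  # pretend the object was just saved

    def add_silently(self, item):
        self.items.append(item)  # in-place mutation: __setattr__ never runs

    def add_flagged(self, item):
        self.items.append(item)
        self.items = self.items  # rebinding runs __setattr__, setting the flag
```

Mutating a list attribute in place is invisible to the object, so the self-assignment is just a way of forcing the attribute write that the base class can see.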

But I don't have a strong preference for or against any of these three
broad categories of change signalling.