[Patches] [ python-Patches-553171 ] optionally make shelve less surprising
noreply@sourceforge.net
Thu, 09 May 2002 17:47:40 -0700
Patches item #553171, was opened at 2002-05-07 08:13
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470
Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising
Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
import shelve
s = shelve.open('she.dat','c')
s['ciao'] = range(3)
s['ciao'].append(4) # doesn't "TAKE"!
Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed. It
also makes shelve far less convenient than it could
be (whenever modifiable values must be shelved).
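A minimal sketch of the surprise, written for a current
Python 3 (where range() has to be wrapped in list()). The
helper name demo_lost_update and the temporary directory are
illustrative only. Each s[key] lookup unpickles a fresh
object, so the append lands on a copy that is never stored:

```python
import os
import shelve
import tempfile


def demo_lost_update():
    """Return what the shelf holds after an in-place append on a fetched value."""
    with tempfile.TemporaryDirectory() as tmpdir:
        s = shelve.open(os.path.join(tmpdir, "she.dat"), "c")
        try:
            s["ciao"] = list(range(3))
            # s["ciao"] unpickles a new list; append() mutates only that copy.
            s["ciao"].append(4)
            # A second lookup unpickles the stored bytes again.
            return s["ciao"]
        finally:
            s.close()
```

With the default settings the returned list should still be
the three original elements.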
Having s keep track of all values it has returned may
perhaps break some existing program (due to extra
memory consumption and/or to lack of "implicit
copy"/"snapshot" behavior) so I've made the 'caching'
change optional and by default off. However it's now
at least possible to obtain nonsurprising behavior:
s = shelve.open('she.dat','c',smart=1)
s['ciao'] = range(3)
s['ciao'].append(4) # no surprises any more
I suspect the 'smart=1' should be made the default,
but, if we at least put it in now, then perhaps we
can migrate to having it as the default very slowly
and gradually.
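The caching idea can be sketched roughly as follows. This is
not the code from the patch: CachingShelf, PickleStore and
sync() are hypothetical names, and PickleStore is only an
in-memory stand-in for a shelf (it pickles on write and
unpickles on read, so every read returns a copy).

```python
import pickle


class PickleStore:
    """In-memory stand-in for a shelf: values are stored as pickled bytes."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return pickle.loads(self._data[key])

    def __setitem__(self, key, value):
        self._data[key] = pickle.dumps(value)


class CachingShelf:
    """Hypothetical wrapper that remembers every value it hands out."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def __getitem__(self, key):
        try:
            return self.cache[key]
        except KeyError:
            value = self.cache[key] = self.store[key]
            return value

    def __setitem__(self, key, value):
        self.cache[key] = value
        self.store[key] = value

    def sync(self):
        # Every cached entry is rewritten, modified or not.
        for key, value in self.cache.items():
            self.store[key] = value
        self.cache.clear()
```

Because the same object is returned on every lookup,
in-place changes stay visible, at the price of keeping the
values in memory and rewriting all of them in sync().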
Alex
----------------------------------------------------------------------
Comment By: H.P.K. (dannu)
Date: 2002-05-10 00:47
Message:
Logged In: YES
user_id=83092
I'd suggest not changing shelve at all but providing
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour.
Some proof-of-concept code with the following
properties is available at
http://home.trillke.net/~hpk/ccdict.py
Current properties are:
- ccdict wraps a dictionary-like object which
in turn only needs to provide
__getitem__, __setitem__, __delitem__, has_key
- on first access of an element
ccdict makes a lookup on the underlying
dict and caches the item.
- the next accesses work with the cached thing.
Unsurprising dict-semantics are provided.
- deleting an item is deferred and actually happens
at commit() time. deleting an item and later on
assigning to it works as expected (i.e. the assignment
takes precedence).
- commit() transfers the items in the
cache to the underlying dict and clears
the cache. Prior to issuing commit
no writeback to the underlying dict happens.
- deleting a ccdict instance does *not* commit any
changes. You have to explicitly call commit().
If you want to work readonly, don't call commit.
- clear() only clears the cache and not the underlying
dict
- you can explicitly prune the cache (via cache.keys()
etc.) before calling commit(). This lets you
avoid writing back unmodified objects if this
is an issue.
It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).
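The properties listed above could be sketched along these
lines. This is not the ccdict.py from the URL; CCDict and
its attribute names are hypothetical, it uses the "in"
operator where the original relies on has_key, and it
assumes that clear() also forgets pending deletions.

```python
class CCDict:
    """Hypothetical cache-commit wrapper around a dict-like object."""

    def __init__(self, backing):
        self.backing = backing
        self.cache = {}
        self.deleted = set()

    def __getitem__(self, key):
        if key in self.deleted:
            raise KeyError(key)
        if key not in self.cache:
            # First access: look up the backing dict and cache the item.
            self.cache[key] = self.backing[key]
        return self.cache[key]

    def __setitem__(self, key, value):
        # An assignment after a delete takes precedence.
        self.deleted.discard(key)
        self.cache[key] = value

    def __delitem__(self, key):
        known = key in self.cache or key in self.backing
        if key in self.deleted or not known:
            raise KeyError(key)
        self.cache.pop(key, None)
        self.deleted.add(key)  # the real delete waits for commit()

    def clear(self):
        # Drops pending state only; the backing dict is untouched.
        self.cache.clear()
        self.deleted.clear()

    def commit(self):
        for key in self.deleted:
            if key in self.backing:
                del self.backing[key]
        for key, value in self.cache.items():
            self.backing[key] = value
        self.cache.clear()
        self.deleted.clear()
```

Pruning self.cache by hand before commit() is what keeps
unmodified entries from being written back.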
holger
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:55
Message:
Logged In: YES
user_id=80475
A few more thoughts:
Please change the "except:" lines to specify the exception
being caught.
Also, if GvR shows interest in the patch, we should update
the library reference and add unittests.
The docstring should also mention that the cache is kept in
memory -- besides persistence, one of the forces for
shelving is memory conservation.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 18:43
Message:
Logged In: YES
user_id=80475
Nicely done! The code is clean and runs in the smart mode
without problems on my existing programs. I agree that the
patch solves a real world problem. The solution is clean,
but a little expensive.
If there were a way to be able to tell if an entry had been
altered, it would save the 100% writeback. Unfortunately,
I can't think of a way.
The docstring could read more smoothly and plainly. Also,
it should be clear that the cost of setting smart=1 is that
100% of the entries get rewritten on close.
Two microscopically minor thoughts on the coding (feel free
to disregard). Can some of the try/except blocks be
replaced by something akin to 'if self.smart:'? For the
writeback loop, consider 'for k,v in cache.iteritems()' as
it takes less memory and saves a lookup.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 16:38
Message:
Logged In: YES
user_id=21627
Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.
So I think the option name could be also 'slow'; although
'writeback' might be more technical.
Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.
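One way the read-only case might be handled is to skip the
write-back entirely when the flag is 'r'. The sketch below
is an assumption, not the patch's behaviour; WritebackCache
and its attributes are hypothetical names.

```python
class WritebackCache:
    """Hypothetical cache that writes back on close unless opened read-only."""

    def __init__(self, store, flag="c"):
        self.store = store
        self.readonly = (flag == "r")
        self.cache = {}

    def __getitem__(self, key):
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def close(self):
        if not self.readonly:
            # Write back every accessed entry, changed or not.
            for key, value in self.cache.items():
                self.store[key] = value
        self.cache.clear()
```

With flag 'r' the many read-only accesses cost memory for
the cache but no writes at close.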
----------------------------------------------------------------------