Request for comments on a design

TomF tomf.sessile at gmail.com
Sat Oct 23 14:50:03 EDT 2010


On 2010-10-23 01:50:53 -0700, Peter Otten said:

> TomF wrote:
> 
>> I have a program that manipulates lots of very large indices, which I
>> implement as bit vectors (via the bitarray module).  These are too
>> large to keep all of them in memory so I have to come up with a way to
>> cache and load them from disk as necessary.  I've been reading about
>> weak references and it looks like they may be what I want.
>> 
>> My idea is to use a WeakValueDictionary to hold references to these
>> bitarrays, so Python can decide when to garbage collect them.  I then
>> keep a key-value database of them (via bsddb) on disk and load them
>> when necessary.  The basic idea for accessing one of these indexes is:
>> 
>> _idx_to_bitvector_dict = weakref.WeakValueDictionary()
> 
> In a well written script that cache will be almost empty. You should compare
> the weakref approach against a least-recently-used caching strategy. In
> newer Pythons you can use collections.OrderedDict to implement an LRU cache
> or use the functools.lru_cache decorator.

I don't know what your first sentence means, but thanks for the
pointers to LRU caching.  Maintaining my own LRU cache might be a
better way to go.  At least I'll have more control over what stays in
memory.
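
A minimal sketch of what a hand-maintained LRU cache built on
collections.OrderedDict could look like. The class name LRUCache, the
load callable (standing in for a read from the on-disk store) and the
capacity parameter are all hypothetical, chosen only for illustration;
this is not a tested design:

```python
from collections import OrderedDict


class LRUCache(object):
    """Keep at most `capacity` values in memory; evict the least recently used."""

    def __init__(self, load, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._load = load            # hypothetical loader, e.g. a lookup in the disk store
        self._capacity = capacity
        self._data = OrderedDict()   # oldest entry first, newest entry last

    def __getitem__(self, key):
        try:
            # Hit: remove the entry so that re-inserting it below makes it the newest.
            value = self._data.pop(key)
        except KeyError:
            # Miss: load from disk, evicting the oldest entry if the cache is full.
            value = self._load(key)
            if len(self._data) >= self._capacity:
                self._data.popitem(last=False)
        self._data[key] = value
        return value

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)
```

Usage would be something like cache = LRUCache(load_bitvector, 100)
followed by cache[idx], where load_bitvector is whatever function
reads one bitarray from disk. Unlike the WeakValueDictionary version,
the cache itself holds strong references, so an entry stays in memory
until capacity pressure pushes it out.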

Thanks,
-Tom



