weird dict problem, how can this even happen?

Joel Hedlund joel.hedlund at gmail.com
Wed Dec 17 08:17:28 EST 2008


Steven D'Aprano wrote:
> On Tue, 16 Dec 2008 14:32:39 +0100, Joel Hedlund wrote:
>> Duncan Booth wrote:
>>> Alternatively give up on defining hash and __eq__ for FragmentInfo and
>>> rely on object identity instead.
>> Object identity wouldn't work so well for caching. Objects would always
>> be drawn as they appeared for the first time. No updates would be shown
>> until the objects were flushed from the cache.
> 
> Perhaps I don't understand your program structure, but I don't see how 
> that follows.

First off, please note that I consider my problem to be solved, many 
thanks to c.l.p and especially Duncan Booth. But of course continued 
discussion on this topic can be both enlightening and entertaining as 
long as people are interested. So here goes:

I'm making a scientific program that visualizes data. It can be thought 
of as a freely zoomable map viewer with a configurable stack of data 
feature renderers, themselves also configurable to enhance different 
aspects of the individual features. As you zoom in, it becomes possible 
to render more types of features in a visually recognizable manner, so 
consequentially more feature renderers become enabled. The determining 
properties for the final image are then the coordinates and zoom level 
of the view, and the current renderer stack configuration. Rendering may 
be time consuming, so I cache the resulting bitmap fragments using a 
"key object" called FragmentInfo that holds this information.

Some renderers may take a very long time to do their work, so in order 
to keep the gui nice and responsive, I interrupt the rendering chain at 
that point, put the remainder of the rendering chain in a job pool, and 
postpone finishing up the rendering until there are cpu cycles to spare. 
At that time, I go through the job pool and ask the cache which fragment 
was most recently accessed. This fragment is the most likely to actually 
be in the current view, and thus the most interesting for the user to 
have finallized.

Now, when the user navigates the view to any given point, the gui asks 
the cache if the bitmap fragments necessary to tile up the current view 
have already been rendered, and if so, retrieves them straight from the 
cache and paints them to the screen. And that's why object identity 
wouldn't work. If the user changes something in the config for the 
current renderer stack (=mutates the objects), the renderers still 
retain the same object identities and therefore the old versions would 
be retrieved from the cache, and no updates would be shown until the 
fragments are flushed from the cache, and the fragment subsequently 
redrawn.

I guess you could delve deep into the data members and pull out all 
their object identities and hash wrt that if you'd really want to, but I 
don't really see the point.

The stupid cache implementation that I originally employed used a dict 
to store the FragmentInfo:BitmapFragment items plus a use tracker (list) 
on the side. This obviously doesn't work, because as soon as the 
renderer stack mutates, list and dict go out of sync and I can no longer 
selectively flush old items from the dict, because it's not readily 
apparent how to reconstruct the old keys. Purely using lists here is 
vastly superior because I can just .pop() the least recently used items 
from the tail and be done with them. Also for lists as small as this, 
the cost in performance is at most negligible.

>> I've been experimenting with a list cache now and I can't say I'm
>> noticing any change in performance for a cache of 100 items. 
> 
> 100 items isn't very big though. If you have 50,000 items you may notice 
> significant slow down :)
> 
> If having many items in the cache is possible, you should consider using 
> a binary search instead of a linear search through the cache. See the 
> bisect module.

Thanks for the tip, but I don't forsee this cache ever needing to be 
that big. ~100 is quite enough for keeping the gui reasonably responsive 
in this case.

/Joel



More information about the Python-list mailing list