Creating a dict-like class that counts successful and failed key matches

Mon Jun 30 10:28:56 EDT 2014

On Mon, Jun 30, 2014 at 11:43 PM,  <python at bdurham.com> wrote:
> As a diagnostic tool, I would like to create a dict-like class that counts
> successful and failed key matches by key. By failed I mean in the sense that
> a default value was returned vs. an exception raised. By count, I mean by
> tracking counts for individual keys vs. just total success/failure counts.
> The class needs to support setting values, retrieving values, and retrieving
> keys, items, and key/item pairs. Basically, anything that a regular dict
> does, I'd like my modified class to do as well.

Sounds like you want to subclass dict, then. Something like this:

from collections import defaultdict

class StatsDict(dict):
    def __init__(self, *a, **ka):
        super().__init__(*a, **ka)
        # Per-key counters for successful and failed lookups
        self.success = defaultdict(int)
        self.fail = defaultdict(int)
    def __getitem__(self, item):
        try:
            ret = super().__getitem__(item)
            self.success[item] += 1
            return ret
        except KeyError:
            # Count the miss, then let the KeyError propagate
            self.fail[item] += 1
            raise

On initialization, set up some places for keeping track of stats. On
item retrieval (I presume you're not also looking for stats on item
assignment - for that, you'd want to also override __setitem__),
increment either the success marker or the fail marker for that key,
based exactly on what you say: was something returned, or was an
exception raised.
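
If you do want stats on assignment, a rough sketch of the __setitem__
override might look like the following. The class name WriteStatsDict
and the attribute name writes are made up for illustration; they are
not part of the class above.

```python
from collections import defaultdict

class WriteStatsDict(dict):
    """Hypothetical variant that counts assignments per key."""
    def __init__(self, *a, **ka):
        super().__init__(*a, **ka)
        self.writes = defaultdict(int)  # hypothetical attribute name

    def __setitem__(self, key, value):
        # Count the assignment, then store the value as usual
        self.writes[key] += 1
        super().__setitem__(key, value)

d = WriteStatsDict()
d["foo"] = 1
d["foo"] = 2
d["bar"] = 3
print(dict(d.writes))
```

Bear in mind that, in CPython at least, the constructor and update()
don't go through an overridden __setitem__, so values stored that way
would not be counted by this sketch.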

To get the stats, just look at the success and fail members:

>>> d = StatsDict()
>>> d["foo"]=1234
>>> d["foo"]
1234
>>> d["spam"]
(chomp)
KeyError: 'spam'
>>> d["foo"]
1234
>>> d["foo"]
1234
>>> d["test"]
(chomp)
KeyError: 'test'
>>> len(d.success) # Unique successful keys
1
>>> len(d.fail) # Unique failed keys
2
>>> sum(d.success.values()) # Total successful lookups
3
>>> sum(d.fail.values()) # Total unsuccessful lookups
2

You can also interrogate the actual defaultdicts, e.g. to find the hottest N keys.
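
One way to get the hottest N keys is to hand the counts to
collections.Counter and ask for most_common(). This is only a sketch:
the helper name hottest is invented here, and the sample counts stand
in for what d.success would hold after some real lookups.

```python
from collections import Counter, defaultdict

# Stand-in for d.success from a StatsDict; these counts are made up
success = defaultdict(int)
for key in ["foo", "bar", "foo", "baz", "foo", "bar"]:
    success[key] += 1

def hottest(counts, n):
    """Return the n most frequently counted keys as (key, count) pairs."""
    return Counter(counts).most_common(n)

top_two = hottest(success, 2)
print(top_two)
```

The same call works on d.fail if you want the most-missed keys instead.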

For everything other than simple key retrieval, this should function
identically to a regular dict. Its repr will be a dict's repr, its
iteration will be its keys, all its other methods will be available
and won't be affected by this change. Notably, the .get() method isn't
logged; if you use that and want to get stats for it, you'll have to
reimplement it - something like this:

    def get(self, k, d=None):
        try:
            return self[k]
        except KeyError:
            return d

The lookup self[k] handles the statisticking. If you let the call go
through to the dict implementation of get() instead, it won't be
counted, because the built-in get() (in CPython, at least) doesn't
call an overridden __getitem__.
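
To see the difference, here is a self-contained sketch that puts the
class and the get() override together and compares the built-in
dict.get() with the overridden one. The variable names are just for
illustration, and the behaviour shown is what I'd expect on CPython.

```python
from collections import defaultdict

class StatsDict(dict):
    def __init__(self, *a, **ka):
        super().__init__(*a, **ka)
        self.success = defaultdict(int)
        self.fail = defaultdict(int)

    def __getitem__(self, item):
        try:
            ret = super().__getitem__(item)
        except KeyError:
            self.fail[item] += 1
            raise
        self.success[item] += 1
        return ret

    def get(self, k, d=None):
        # Route through self[k] so the lookup is counted
        try:
            return self[k]
        except KeyError:
            return d

d = StatsDict(foo=1234)

# Calling the built-in implementation directly skips __getitem__,
# so nothing is recorded for this miss
builtin_result = dict.get(d, "spam", "n/a")
fails_after_builtin = dict(d.fail)

# The override goes through self[k], so this miss is recorded
override_result = d.get("spam", "n/a")
fails_after_override = dict(d.fail)

print(fails_after_builtin, fails_after_override)
```

Both calls return the default, but only the overridden get() leaves a
mark in d.fail.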

This probably isn't exactly what you want, but it's a start, at least,
and something to tweak into submission :)

ChrisA