ANNOUNCE: Thesaurus - a recursive dictionary subclass using attributes

Tue Dec 11 01:41:38 EST 2012

On Mon, Dec 10, 2012 at 8:48 PM, Dave Cinege <dave at cinege.com> wrote:
> Thesaurus: A different way to call a dictionary.
>
> Thesaurus is a new a dictionary subclass which allows calling keys as
> if they are class attributes and will search through nested objects
> recursively when __getitem__ is called.
>
> You will notice that the code is disgusting simple. However I have found that
> this has completely changed the way I program in python. I've re-written some
> exiting programs using Thesaurus, and often realized 15-30% code reduction.
> Additionally I find the new code much easier to read.
>
> If you find yourself programing with nested dictionaries often, fighting to
> generate output or command lines for external programs, or wish you had
> a dictionary that could act (sort of) like a class, Thesaurus may be for you.

I have a few critiques on the code.  First, you might want to use
__getattribute__ instead of __getattr__.  Otherwise you'll end up
running into bugs like this:

>>> thes = Thesaurus()
>>> thes.update = 'now'
>>> thes.update
<built-in method update of Thesaurus object at 0x01DB30C8>

Hey, where'd my data go?  The answer is that it is in the Thesaurus:

>>> thes['update']
42

But it's not visible as an attribute because it is shadowed by the
dict methods.  Using __getattribute__ instead of __getattr__ would
mean that those non-special methods simply wouldn't be visible at all.

Second, in __getitem__ you start a loop with "for i in
range(len(l)):", and then you use i as an index into l several times.
It would be cleaner and more Pythonic to do "for i, part in
enumerate(l):", and then you can replace every occurrence of "l[i]"
with "part" (or whatever you want to call that variable).

Third, also in __getitem__ you have this code:

"""
if '.' not in key:
	return dict.__getitem__(self, key)
l = key.split('.')
if isinstance(l[0], (dict, Thesaurus)):
	a = self.data
else:
	a = self
"""

It's not clear to me what the isinstance call here is meant to be
testing for.  The prior statements require key to be a string.  If key
is a string, then by construction l[0] is also a string.  So it seems
to me that the isinstance check here will always be False.

In any case, the key splitting here seems to be done primarily to
support the use of formatting placeholders like "%(L.l.1)s" in the
examples.  I want to point out that this use case is already well
supported (I might even say "better" supported since it cleanly
distinguishes index elements from attributes with syntax) by the
str.format style of string formatting:

>>> L = {'l': ['zero', 'one']}
>>> "There should be {L[l][1]}-- and preferably only {L[l][1]} --obvious way to do it.".format(L=L)
'There should be one-- and preferably only one --obvious way to do it.'

Lastly, you have several bare "except" clauses in the code.  Bare
excepts are almost always incorrect.  I appreciate that it's not easy
to predict exactly what exceptions might turn up here (although I
posit that for all of these, subsets of (TypeError, KeyError,
AttributeError, IndexError) are sufficient), but at the very minimum
you should specify "except Exception", so that you're not
inadvertently catching things like SystemExit and KeyboardInterrupt.

Cheers and hope this is helpful,
Ian