python bijection

Raymond Hettinger python at rcn.com
Sat Dec 5 03:32:25 EST 2009


[Me]
> > * we've already got one (actually two).
> >   The two dictionary approach...

[Francis Carr]
> Solutions such as bidict just automate the two-dict approach.

They do so at the expense of implementing a new API to support it and
at the expense with having non-obvious behaviors (i.e. how it handles
collapsing two records into one, which exceptions can be raised, how
the bijection invariants are maintained, which of the two symmetric
accessors is the primary, etc). This is not a cost-free choice of
simply "automating something".

IMO, "just automating" something that is already clear is not
necessarily a net win.  Each new class has a learning curve and it is
sometimes simpler to use the basic dictionary API instead of inventing
a new one.  I would *much* rather debug code written by someone using
two dictionaries than code using any of the APIs discussed so far --
all of those hide important design decisions behind a layer of
abstraction.  The API must be easy to learn and remember (including
all of it quirks); otherwise, it is a net mental tax on the programmer
and code reviewers.

Also, I've presented examples of usability problems with APIs that do
not require the user to specify names for the two directions.  It is
too easy to make mistakes that are hard to see.  Which is correct,
phonelist['raymond'] or phonelist[raymonds_phonenumber]?  There is no
way to tell without looking back at the bijection specification.  The
API must be self-documenting.  An API that is error-prone is worse
than having no bijection class at all.

Further, the API needs to be consistent with the rest of the language
(no abusing syntax with the likes of phonelist[:number]).

Unfortunately, Mark Lemburg has thrown gas on this fire, so the
conversation will likely continue for a while.  If so, it would be
helpful if the conversation centered around real-world examples of
code that would be improved with a bijection class.  In my experience,
there are examples where bijections arise in programs but it doesn't
happen often enough to warrant a new class for it.  In cases where it
does arise, people should try-out the proposed APIs to see if in-fact
the code is made more clear or whether simple work is just being
hidden behind a layer of abstraction.  For grins, insert an error into
the code (conflating the primary key with the secondary key) and see
if the error is self-evident or whether it is hidden by the new API.
Also, test the API for flexibility (how well can it adapt to
injections and surjections, can it handle use cases with default
values, is there a meaningful interpretation of dict.fromkeys() in a
bijection, can a user specify how to handle violations of the
bijection invariants by raising exceptions, supplying default values,
collapsing records, etc.)

Even if a proposed API passes those smell tests, demonstrates the
required flexibility, is easy to learn and use, is not error-prone,
and actually improves real-world use cases, it is my opinion that a
recipe for it will not garner a fan club and that it will have limited
uptake.  ISTM that it is more fun to write classes like this than it
is to actually use them.

Raymond


P.S.  It also makes sense to look at other mature languages to see
whether they ever find a need to include a bijection class in their
standard library or builtin collections.  If Smalltalk, Java, Forth,
Go, Io, Haskell and C++ couldn't be sold on it, perhaps the buyer
should beware.  If some language is found that did include a bijection
class, then do a Google code search to see how it fared in the real-
world.  My bet is that it either wasn't used much or that it actually
make code worse by making errors harder to spot.





More information about the Python-list mailing list