trouble subclassing str

Paul McGuire ptmcg at austin.rr.com
Fri Jun 24 09:28:46 EDT 2005


>From purely Python terms, there is a distinction that one of these
classes (PaddedStr) is immutable, while the other is not.  Python only
permits immutable objects to act as dictionary keys, so this would one
thing to differentiate these two approaches.

But on a more abstract, implementation-independent level, this is a
distinction of inheritance vs. composition and delegation.  Inheritance
was one of the darling concepts in the early days of O-O programming,
with promises of reusability and development speed.  But before long,
it turned out that inheritance comes with some unfriendly baggage -
dependencies between subclasses and superclasses made refactoring more
difficult, and modifications to supertypes had unwanted effects on
subclasses.  Sometimes subclasses would use some backdoor knowledge of
the supertype data, thereby limiting flexibility in the superclass -
this phenomenon is often cited as "inheritance breaks encapsulation."

One check for good inheritance design is the Liskov Substitution
Principle (LSP) (Thanks for the Robert Martin link, Kent - you beat me
to it). Borrowing from the Wiki-pedia:
"In general, the principle mandates that at all times objects from a
class can be swapped with objects from an inheriting class, without the
user noticing any other new behaviour. It has effects on the paradigms
of design by contract, especially regarding to specification:
- postconditions for methods in the subclass should be more strict than
those in the superclass
- preconditions for methods in the subclass should be less strict than
those in the superclass
- no new exceptions should be introduced in the subclass"
(http://en.wikipedia.org/wiki/Liskov_substitution_principle)

One thing I like about this concept is that is fairly indepedent of
language or implementation features. I get the feeling that many such
rules/guidelines seem to be inspired by limitations or gimmicks that
are found in programming language X (usually C++ or Java), and then
mistakenly generalized to be universal O-O truths.

Looking back to PaddedStr vs. MyString, you can see that PaddedStr will
substitute for str, and for that matter, the MyString behavior that is
given could be a reasonable subclass of str, although maybe better
named StarredStr.  But let's take a slightly different MyString, one
like this, where we subclass str to represent a person's name:

    class Person(str):
        def __new__(cls,s,data):
            self = str.__new__(cls,s)
            self.age = data
            return self

    p = Person("Bob",10)
    print p,p.age

This is handy enough for printing out a Person and getting their name.
But consider a father and son, both named "Bob".

    p1 = Person("Bob",10)
    p2 = Person("Bob",35)  # p1's dad, also named Bob
    print p1 == p2  # prints 'true', should it?
    print p1 is p2  # prints 'false'


Most often, I see "is-a" confused with "is-implemented-using-a".  A
developer decides that there is some benefit (reduced storage, perhaps)
of modeling a zip code using an integer, and feels the need to define
some class like:

    class ZipCode(int):
        def lookupState(self):
            ...

But zip codes *aren't* integers, they just happen to be numeric - there
is no sense in supporting zip code arithmetic, nor in using zip codes
as slice indices, etc.  And there are other warts, such as printing zip
codes with leading zeroes (like they have in Maine).

So when, about once a month we see on c.l.py "I'm having trouble
sub-classing <built-in class XYZ>," I can't help but wonder if telling
the poster how to sub-class an XYZ is really doing the right thing.

In this thread, the OP wanted to extend str with something that was
constructable with two arguments, a string and an integer, as in s1 =
MyString("some text", 100).  I tried to propose a case that would be a
good example of inheritance, where the integer would be used to define
and/or constrain some str attribute.  A *bad* example of inheritance
would have been one where the 100 had some independent characteristic,
like a font size, or an age value to be associated with a string that
happens to contain a person's name.  In fact, looking at the proposed
MyClass, this seems to be the direction he was headed.

When *should* you use inheritance?  Well, for a while, there was such
backlash that the response was "Never".  Personally, I use inheritance
in cases where I have adopted a design pattern that incorporates it,
such as Strategy; otherwise, I tend not to use it.  (For those of you
who use my pyparsing package, it is loaded with the Strategy pattern.
The base class ParserElement defines an abstract do-nothing parsing
implementation, which is overridden in subclasses such as Literal,
Word, and Group.  All derived instances are treated like the base
ParserElement, with each subclass providing its own specialized
parseImpl or postParse behavior, so any subclass can be substituted for
the base ParserElement, satisfying LSP.)

I think the current conventional wisdom is "prefer composition over
inheritance" - never say "never"! :)

-- Paul




More information about the Python-list mailing list