My Experiences Subclassing String

Fuzzyman michael at foord.net
Mon Jun 7 08:23:19 EDT 2004


I recently went through a bit of a headache trying to subclass
string. This is because a string is immutable, so its value is set by
the mysterious __new__ method rather than by __init__ when it is created.
For those who are new to subclassing the built-in types, my experiences
might prove helpful. Hopefully not too many inaccuracies :-)

I've just spent ages trying to subclass string.... and I'm very proud
to say I finally managed it !

The trouble is that the string type (str) is immutable - which means
that its value has to be set while the instance is being created, in
the mysterious __new__ method, rather than afterwards in __init__ !! :-)
Still following me.... ?

SO :

class newstring(str):
    def __init__(self, value, othervalue):
        str.__init__(self, value)
        self.othervalue = othervalue

astring = newstring('hello', 'othervalue')  # raises TypeError

fails miserably. This is because the __new__ method of str is
called *before* the __init__ method.... and it complains that it has
been given too many arguments. What the __new__ method does is actually
create and return the new instance - for a string the __init__ method
is just a dummy.
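
To watch the failure happen without stopping the script, the call can be wrapped in a try/except. This is only a minimal sketch: the class name BrokenString is made up for illustration, and the syntax assumes a current Python 3 interpreter rather than the Python of 2004. The exact error message differs between versions, so only the exception type is relied on.

```python
class BrokenString(str):  # hypothetical name, mirrors the failing class above
    def __init__(self, value, othervalue):
        # Never reached: str.__new__ rejects the extra argument first.
        self.othervalue = othervalue

try:
    BrokenString('hello', 'othervalue')
except TypeError as exc:
    # str.__new__ received two arguments and refused them.
    error = exc
    print('TypeError:', exc)
else:
    error = None
```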

The bit I couldn't get (and I didn't have access to a Python manual at
the time) - if the __new__ method is responsible for returning the new
instance of the string, surely it wouldn't have a reference to self,
since the 'self' wouldn't be created until after __new__ has been
called......

Actually that's only half right - __new__ doesn't receive an instance,
its first argument is the class (conventionally named cls), and
str.__new__ builds the instance from it. So, a simple string type might
look something like this :

class newstring(str):
    def __new__(cls, value):
        # cls is the class being instantiated, not an instance
        return str.__new__(cls, value)
    def __init__(self, value):
        pass

See how the __new__ method returns the instance and the __init__ is
just a dummy.
If we want to add the extra attribute we can do this :


class newstring(str):
    def __new__(cls, value, othervalue):
        # othervalue is accepted but ignored here; __init__ stores it
        return str.__new__(cls, value)
    def __init__(self, value, othervalue):
        self.othervalue = othervalue
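
The creation order can be checked by recording each call as it happens. The sketch below is an illustration only: TracedString and the calls list are made-up names, and it assumes Python 3. It records which hook ran and what kind of object it was handed.

```python
calls = []  # records the order in which the two hooks run

class TracedString(str):  # hypothetical name, same shape as the class above
    def __new__(cls, value, othervalue):
        # cls is the class itself; no instance exists yet
        calls.append(('__new__', cls.__name__))
        return str.__new__(cls, value)

    def __init__(self, value, othervalue):
        # self is the instance that __new__ has just returned
        calls.append(('__init__', type(self).__name__))
        self.othervalue = othervalue

s = TracedString('hello', 'extra')
print(calls)            # __new__ is listed first, then __init__
print(s, s.othervalue)  # the string value and the extra attribute
```

Note that ordinary string methods such as s.upper() hand back a plain str, so the extra attribute does not travel with their results.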

The order of creation is that the __new__ method is called, which
returns the object, *then* __init__ is called with that object and the
same arguments. Although the __new__ method receives 'othervalue', it
ignores it - and __init__ uses it. In practice __new__ could do all of
this by setting the attribute on the object before returning it - but I
prefer to mess around with __new__ as little as possible ! I was just
glad I got it working..... What it means is that I can create my own
class of objects that in most situations will behave like strings, but
have their own attributes. The only restriction is that the string
value is immutable and must be set when the object is created. See the
excellent path module by Jason Orendorff for another example of an
object that behaves like a string but also has other attributes -
although I think it doesn't define either __new__ or __init__.
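
For comparison, here is a rough sketch of the variant where __new__ does all the work and no __init__ is defined at all. TaggedString is a made-up name and Python 3 is assumed. This is just one way it could be written, not necessarily better than the two-method version.

```python
class TaggedString(str):  # hypothetical name
    def __new__(cls, value, othervalue):
        # Build the immutable string part first...
        obj = str.__new__(cls, value)
        # ...then attach the extra attribute before handing the object back.
        obj.othervalue = othervalue
        return obj

t = TaggedString('hello', 'extra')
print(t, t.othervalue)
```

The string value itself still cannot be changed afterwards; only the extra attribute is an ordinary, rebindable attribute.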

Regards,

Fuzzy

Posted to Voidspace - Techie Blog :
http://www.voidspace.org.uk/voidspace/index.shtml
Experiences used in the python modules at :
http://www.voidspace.org.uk/atlantibots/pythonutils.html


