My Experiences Subclassing String

Paul McGuire ptmcg at austin.rr._bogus_.com
Mon Jun 7 11:41:53 EDT 2004


"Fuzzyman" <michael at foord.net> wrote in message
news:8089854e.0406070423.5d2d1d71 at posting.google.com...
> I recently went through a bit of a headache trying to subclass
> string.... This is because the string is immutable and uses the
> mysterious __new__ method rather than __init__ to 'create' a string.
> To those who are new to subclassing the built-in types, my experiences
> might prove helpful. Hopefully not too many inaccuracies :-)

<snip>

> The bit I couldn't get (and I didn't have access to a python manual at
> the time) - if the __new__ method is responsible for returning the new
> instance of the string, surely it wouldn't have a reference to self;
> since the 'self' wouldn't be created until after __new__ has been
> called......
>
> Actually that's wrong - so, a simple string type might look something
> like this :
>
> class newstring(str):
>     def __new__(self, value):
>         return str.__new__(self, value)
>     def __init__(self, value):
>         pass
>
> See how the __new__ method returns the instance and the __init__ is
> just a dummy.
> If we want to add the extra attribute we can do this :
>
>
> class newstring(str):
>     def __new__(self, value, othervalue):
>         return str.__new__(self, value)
>     def __init__(self, value, othervalue):
>         self.othervalue = othervalue
>
> The order of creation is that the __new__ method is called which
> returns the object *then* __init__ is called. Although the __new__
> method receives the 'othervalue' it is ignored - and __init__ uses it.
<snip>

Fuzzy -

I recently went down this rabbit hole while trying to optimize Literal
handling in pyparsing.  You are close in your description, but there is one
basic concept that I think still needs to be sorted out for you.

Think of __new__ as a class-level factory method, not an instance method.
That first argument that you passed to your example as 'self' is not the
self instance, it is the class being new'ed.  By luck, even though you
called it 'self', you passed it to str.__new__ where the class argument is
supposed to go, so everything still worked.
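
To make the point concrete, here is a minimal sketch of a str subclass
whose __new__ names its first argument cls. It is written in Python 3
syntax; the names TaggedStr and tag are made up for illustration and do
not come from the original example.

```python
# Hypothetical example: TaggedStr and 'tag' are illustrative names only.
class TaggedStr(str):
    def __new__(cls, value, tag):
        # cls is the class being instantiated (TaggedStr or a subclass).
        # No instance exists yet, so there is no 'self' at this point.
        assert isinstance(cls, type)
        # str is immutable, so the string value has to be fixed here.
        obj = str.__new__(cls, value)
        # Extra attributes can be attached before the instance is returned.
        obj.tag = tag
        return obj

s = TaggedStr("hello", "greeting")
print(type(s).__name__, s.upper(), s.tag)
```

Because the extra attribute is set inside __new__, this variant does not
need an __init__ at all. Note that s.upper() returns a plain str, not a
TaggedStr.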

The canonical/do-nothing __new__ method looks like this:

class A(object):
    def __new__(cls,*args):
        return object.__new__(cls)

There's nothing stopping you from looking at the args tuple to see if you
want to do more than this, but in truth that's what __init__ is for.
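
As a rough illustration of how the two methods divide the work, the
sketch below records each call in a list. It assumes Python 3 syntax;
the class name Traced and the list calls are hypothetical helpers used
only for this demonstration.

```python
calls = []  # hypothetical log of which hook ran, in order

class Traced(object):
    def __new__(cls, *args):
        # Receives the class plus the constructor arguments.
        calls.append(("__new__", cls.__name__, args))
        return object.__new__(cls)

    def __init__(self, *args):
        # Receives the new instance plus the same constructor arguments.
        calls.append(("__init__", type(self).__name__, args))
        self.args = args

t = Traced(1, "test")
print(calls)
```

Both methods see the same argument tuple: __new__ runs first and creates
the object, then __init__ runs on the object that __new__ returned.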

Here's a sample of using __new__ to return a different class of object,
depending on the initialization arguments:

class SpecialA(object):
    pass

class A(object):
    def __new__(cls, *args):
        # cls is the class being instantiated; args are the constructor arguments
        print(cls, ":", args)
        if len(args) > 0 and args[0] == 2:
            # hand back an instance of a different class
            return object.__new__(SpecialA)
        return object.__new__(cls)

obj = A()
print(type(obj))
obj = A(1)
print(type(obj))
obj = A(1, "test")
print(type(obj))
obj = A(2, "test")
print(type(obj))

gives the following output:

<class '__main__.A'> : ()
<class '__main__.A'>
<class '__main__.A'> : (1,)
<class '__main__.A'>
<class '__main__.A'> : (1, 'test')
<class '__main__.A'>
<class '__main__.A'> : (2, 'test')
<class '__main__.SpecialA'>
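
One consequence worth knowing when __new__ returns a different class of
object: Python only calls __init__ automatically if the returned object
is an instance of the class that was called. The sketch below, in
Python 3 syntax, shows this; the names Factory, Special, and initialized
are hypothetical and chosen only for the demonstration.

```python
class Special(object):
    def __init__(self, *args):
        self.initialized = True

class Factory(object):
    def __new__(cls, *args):
        if args and args[0] == 2:
            # Not a Factory instance, so no __init__ is run on it
            # when Factory(...) returns.
            return object.__new__(Special)
        return object.__new__(cls)

    def __init__(self, *args):
        self.initialized = True

a = Factory(1)
b = Factory(2)
print(type(a).__name__, hasattr(a, "initialized"))
print(type(b).__name__, hasattr(b, "initialized"))
```

The Special object comes back uninitialized, so a factory written this
way has to do any required setup itself before returning.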


HTH,
-- Paul




