Instance method for converting int to str - str() and __str__()

Steven D'Aprano steve at pearwood.info
Sat Oct 3 07:29:38 EDT 2015


On Sat, 3 Oct 2015 04:35 pm, neubyr wrote:

> I was wondering if there is any resource that explains why certain methods
> like str() and type() were implemented the way they are, rather than
> .to_string() or .type() instance/object methods.

There is a FAQ that might help with this question:

https://docs.python.org/2/faq/design.html

In early Python, some objects (like ints, floats, tuples, None) had no
methods at all, so you couldn't have None.to_string() and expect it to
work. There was no base class for the entire object hierarchy, so no method
they could inherit: every class would have to re-implement its own
to_string method.

But even today, when all objects have methods and all inherit from object,
having functions instead of methods for some tasks makes good sense. It
allows us to write functions that operate by a protocol, rather than purely
by inheritance, and it helps guarantee consistent naming.

For example, let's look at conversion to bool. Should that method be
called .bool, .boolean, .to_bool, .truthify, ... ? Different classes that
don't inherit from each other may make different choices. By having a
built-in function (technically, a class, but the difference doesn't matter
here) called bool, that ensures one standard way to convert to bool.

How does bool() work? As a built-in function, it can (in theory) include
optimizations that aren't available to a method. For example, it could look
like this pseudocode:

def bool(obj):
    if obj is None: return False
    if obj is a number: return obj != 0
    ...

which may be faster than calling obj.bool() since it doesn't have to search
the inheritance chain (remember that method resolution in Python happens at
runtime, not compile-time).
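
To illustrate the point about runtime method resolution, here is a small
sketch (the Greeter class is invented for this example). A method added to
the class after an instance already exists is still found, because the
lookup happens each time the attribute is accessed:

```python
class Greeter:
    """Hypothetical class, used only for this illustration."""
    pass


g = Greeter()  # the instance is created before the method exists
assert not hasattr(g, "hello")


def hello(self):
    return "hello"


# Attach the method to the class afterwards; nothing was fixed at "compile time".
Greeter.hello = hello

greeting = g.hello()  # found by a runtime lookup on type(g)
print(greeting)
```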

Of course, the built-in bool needs to know how to deal with custom classes
that aren't built-in. For that, we have a *protocol* that tells the
built-in functions how to deal with new, unknown classes. The most basic
protocol is to call a specified "dunder" method. Dunder stands for:

    Double leading and trailing UNDERscores

and refers to those methods __str__, __add__, __radd__, etc. that you so
often see. All dunder names are reserved for use by Python, so you should
only define the ones Python documents, and never invent your own.
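
As a rough sketch of the idea (Celsius and Point are made-up classes, not
anything from the standard library): two classes that share no base class
other than object each define __str__, and the one built-in str() works on
both of them, as well as on built-in objects:

```python
class Celsius:
    """Hypothetical class for this example."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return "%s C" % self.degrees


class Point:
    """Another hypothetical class, unrelated to Celsius."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return "(%s, %s)" % (self.x, self.y)


# One spelling, str(), covers custom classes and built-ins alike.
texts = [str(obj) for obj in (Celsius(21), Point(1, 2), 3, None)]
print(texts)
```

Neither class had to agree with the other on a method name such as
.to_string() or .as_text(); they only had to follow the protocol.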

bool() is a good example of a protocol because it is more complex than just
a single method call. It looks something like this:

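# Python 3 version
def bool(x):
    # possible optimizations for built-ins here...
    # now deal with everything else
    T = type(x)
    if hasattr(T, '__bool__'):
        # look the method up on the class, then pass the instance explicitly
        flag = T.__bool__(x)
        # identity test: `flag in (True, False)` would also accept 1 and 0
        if flag is True or flag is False:
            return flag
        else:
            raise TypeError("__bool__ should return bool")
    elif hasattr(T, '__len__'):
        return T.__len__(x) != 0
    else:
        return True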

bool(x) ignores methods defined on the instance x itself, and goes straight
to the class object (saving a runtime lookup). If the dunder method does
not return an actual boolean, it raises an error. If the object doesn't
define a __bool__ method, it falls back on __len__. If it doesn't define
__len__ either, it could in principle have other fallbacks, but in
practice it just returns True.
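
Each of those behaviours can be checked against the real built-in in
Python 3. This is only a sketch, and the four class names are invented for
the purpose:

```python
class Plain:
    """Defines neither __bool__ nor __len__."""


class Empty:
    """No __bool__, so bool() falls back on __len__."""

    def __len__(self):
        return 0


class Liar:
    """__bool__ returns an int rather than an actual bool."""

    def __bool__(self):
        return 1


class Patched:
    def __bool__(self):
        return True


p = Patched()
# Attach a __bool__ to the instance only; bool() looks on the class instead.
p.__bool__ = lambda: False

default_result = bool(Plain())   # no dunder at all: default value
len_result = bool(Empty())       # __len__ fallback
instance_ignored = bool(p)       # the class's __bool__ wins

try:
    bool(Liar())
    liar_error = None
except TypeError as exc:
    liar_error = exc

print(default_result, len_result, instance_ignored, type(liar_error).__name__)
```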

(Of course, the actual built-in bool may be written in C, not Python, but
that's beside the point.)

So you should never call dunder methods such as __str__ yourself; always
call the wrapper, such as str(). Chances are very good that the function
form will contain optimizations that are unavailable to you, will perform
pre-processing or post-processing, catch exceptions, or error-check the
result.
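
Here is a small sketch of the error-checking part, using a deliberately
broken class (BadStr is hypothetical). Calling the dunder method directly
hands back the bad value without complaint, while str() rejects it:

```python
class BadStr:
    """Hypothetical class whose __str__ is deliberately wrong."""

    def __str__(self):
        return 42  # bug: __str__ is supposed to return a string


obj = BadStr()

direct = obj.__str__()  # no check at all: we just get the int back

try:
    str(obj)
    wrapped_error = None
except TypeError as exc:
    # The built-in checks the result and refuses a non-string.
    wrapped_error = exc

print(direct, type(wrapped_error).__name__)
```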




-- 
Steven



