[Python-Dev] Avoid formatting an error message on attribute error

Victor Stinner victor.stinner at gmail.com
Thu Nov 7 12:32:50 CET 2013


2013/11/7 Steven D'Aprano <steve at pearwood.info>:
> My initial instinct here was to say that sounded like premature
> optimization, but to my surprise the overhead of generating the error
> message is actually significant -- at least from pure Python 3.3 code.

I ran a quick and dirty benchmark by replacing the error message with None.

Original:

$ ./python -m timeit 'hasattr(1, "y")'
1000000 loops, best of 3: 0.354 usec per loop
$ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
1000000 loops, best of 3: 0.471 usec per loop

Patched:

$ ./python -m timeit 'hasattr(1, "y")'
10000000 loops, best of 3: 0.106 usec per loop
$ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
10000000 loops, best of 3: 0.191 usec per loop

hasattr() is 3.3x faster and try/except is about 2.5x faster on this micro-benchmark.
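
For anyone who wants to repeat the measurement from a script rather than the command line, here is a minimal sketch using the standard timeit module. The helper names time_hasattr and time_try_except are made up for this illustration, and the absolute numbers will differ from the ones above depending on the machine and the Python build.

```python
import timeit


def time_hasattr(number=200_000, repeat=3):
    """Best per-loop time (microseconds) of hasattr() on a missing attribute."""
    best = min(timeit.repeat('hasattr(1, "y")', number=number, repeat=repeat))
    return best / number * 1e6


def time_try_except(number=200_000, repeat=3):
    """Best per-loop time (microseconds) of try/except on a missing attribute."""
    stmt = "try:\n    s.y\nexcept AttributeError:\n    pass"
    best = min(timeit.repeat(stmt, setup="s = 1", number=number, repeat=repeat))
    return best / number * 1e6


if __name__ == "__main__":
    print("hasattr:    %.3f usec per loop" % time_hasattr())
    print("try/except: %.3f usec per loop" % time_try_except())
```

Running the script once on an unpatched build and once on a patched build gives the two numbers needed for the comparison.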

> Given that, I wonder whether it would be worth coming up with a more
> general solution to the question of lazily generating error messages
> rather than changing AttributeError specifically.

My first question is about keeping strong references to objects (type
object for AttributeError). Is it an issue? If it is an issue, it's
maybe better to not modify the code :-)
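
To make the strong reference concern concrete, here is a small sketch in pure Python. TypeHoldingError is a hypothetical class written for this example, not the proposed patch; it only shows that, on CPython, an exception storing the type object keeps that type alive for as long as the exception itself is alive.

```python
import gc
import weakref


class TypeHoldingError(AttributeError):
    """Hypothetical exception that stores the type object instead of a message."""

    def __init__(self, tp, attr):
        # args becomes (tp, attr), which is a strong reference to tp.
        super().__init__(tp, attr)


def type_survives_while_error_alive():
    tp = type("Temporary", (), {})  # dynamically created class
    ref = weakref.ref(tp)           # lets us observe when the class dies
    err = TypeHoldingError(tp, "y")

    del tp
    gc.collect()
    alive_with_error = ref() is not None   # the exception still holds the class

    del err
    gc.collect()
    alive_after_error = ref() is not None  # nothing holds the class anymore

    return alive_with_error, alive_after_error
```

On CPython the function is expected to return (True, False): the class outlives its last named reference until the exception is released.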


Yes, the lazy formatting idea can be applied to various other
exceptions. For example, the TypeError message is usually built using
PyErr_Format() to mention the name of the invalid type. Example:

        PyErr_Format(PyExc_TypeError, "exec() arg 2 must be a dict, not %.100s",
                     globals->ob_type->tp_name);

But it's not easy to store arbitrary C types for PyUnicode_FromFormat()
parameters. Argument types can be char*, Py_ssize_t, PyObject*, int,
etc.
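
The general idea can be sketched at the Python level, where the difficulty above does not arise because every argument is already an object. LazyMessageError is a hypothetical name used only for this illustration; it stores the format string and its arguments and builds the text the first time str() is called. It says nothing about how the C side would store char* or Py_ssize_t values.

```python
class LazyMessageError(TypeError):
    """Hypothetical TypeError that defers building its message until str()."""

    def __init__(self, fmt, *fmt_args):
        super().__init__(fmt, *fmt_args)
        self._fmt = fmt
        self._fmt_args = fmt_args
        self._message = None  # not formatted yet

    def __str__(self):
        if self._message is None:
            # Formatting happens here, only if somebody asks for the text.
            self._message = self._fmt % self._fmt_args
        return self._message


err = LazyMessageError("exec() arg 2 must be a dict, not %.100s",
                       type([]).__name__)
```

If the exception is caught and discarded without being printed, the % formatting never runs.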

I proposed modifying AttributeError first (or only AttributeError),
because it is more common to ignore an AttributeError than other errors
like TypeError. (TypeError or UnicodeDecodeError are usually unexpected.)

>> It would be nice to only format the message on demand. The
>> AttributeError would keep a reference to the type.
>
> Since only the type name is used, why keep a reference to the type
> instead of just type.__name__?

At the C level, type.__name__ does not exist as an object: the name is
a char* (type->tp_name). If the type object is destroyed, type->tp_name
becomes an invalid pointer. So AttributeError should keep a reference to
the type object.

>> AttributeError.args would be (type, attr) instead of (message,).
>> ImportError was also modified to add a new "name" attribute.
>
> I don't like changing the signature of AttributeError. I've got code
> that raises AttributeError explicitly.

The constructor could support both signatures for backward
compatibility: AttributeError(message: str) and AttributeError(type:
type, attr: str).
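
A rough Python sketch of such a dual-signature constructor follows. LazyAttributeError is a hypothetical subclass written for this example (the real change would live in the C implementation of AttributeError), and the message text is modeled on the one CPython produces for a missing attribute.

```python
class LazyAttributeError(AttributeError):
    """Hypothetical AttributeError accepting (message,) or (type, attr)."""

    def __init__(self, *args):
        super().__init__(*args)
        # New-style call: AttributeError(type, attr). Anything else is
        # treated as the existing AttributeError(message) form.
        self._lazy = (len(args) == 2
                      and isinstance(args[0], type)
                      and isinstance(args[1], str))

    def __str__(self):
        if self._lazy:
            tp, attr = self.args
            # The message is only built when it is actually requested.
            return "%r object has no attribute %r" % (tp.__name__, attr)
        return super().__str__()


new_style = LazyAttributeError(int, "y")
old_style = LazyAttributeError("custom message")
```

Code that raises AttributeError("custom message") explicitly keeps working, while new_style.args is (int, "y"), which is exactly the visible change to the args attribute.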

I'm asking whether anyone relies on the AttributeError.args attribute.

Victor

