[Python-Dev] Impact of Namedtuple on startup time

Steven D'Aprano steve at pearwood.info
Mon Jul 17 12:45:20 EDT 2017


On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote:
> 
> Hello,
> 
> Cost of creating a namedtuple has been identified as a contributor to
> Python startup time.  Not only Python core and the stdlib, but any
> third-party library creating namedtuple classes (there are many of
> them).  An issue was created for this:
> https://bugs.python.org/issue28638

Some time ago, I needed to backport a version of namedtuple to Python 
2.4, so I started with Raymond's recipe on ActiveState and modified it 
to exec only the code needed for __new__. The rest of the class is an 
ordinary inner class:

# a short sketch
def namedtuple(...):
    class Inner(tuple):
        ...
    exec(source, ns)
    Inner.__new__ = ns['__new__']
    return Inner
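
To make the sketch above concrete, here is a minimal runnable version for 
Python 3. It is not the code from the recipe: the name namedtuple_sketch, 
the name validation and the reduced feature set (no _replace, _asdict or 
_make) are all assumptions made for illustration.

```python
from keyword import iskeyword
from operator import itemgetter


def namedtuple_sketch(typename, field_names):
    """Hypothetical cut-down namedtuple: only __new__ is built with exec."""
    if isinstance(field_names, str):
        field_names = field_names.replace(',', ' ').split()
    field_names = tuple(field_names)

    # The names end up inside exec'ed source, so validate them first.
    for name in (typename,) + field_names:
        if not name.isidentifier() or iskeyword(name) or name.startswith('_'):
            raise ValueError('invalid name: %r' % (name,))
    if len(set(field_names)) != len(field_names):
        raise ValueError('duplicate field names')

    # An ordinary inner class: nothing here goes through exec.
    class Inner(tuple):
        __slots__ = ()
        _fields = field_names

        def __repr__(self):
            pairs = ', '.join('%s=%r' % item for item in zip(self._fields, self))
            return '%s(%s)' % (type(self).__name__, pairs)

    # Only __new__ needs generated source, because its signature depends
    # on the field names.
    params = ''.join(', ' + name for name in field_names)
    values = ''.join(name + ', ' for name in field_names)
    source = ('def __new__(_cls%s):\n'
              '    return _tuple_new(_cls, (%s))\n' % (params, values))
    ns = {'_tuple_new': tuple.__new__}
    exec(source, ns)
    # A class body wraps __new__ in staticmethod implicitly; do it by hand.
    Inner.__new__ = staticmethod(ns['__new__'])

    # Field access by name is a plain property, again without exec.
    for index, name in enumerate(field_names):
        setattr(Inner, name, property(itemgetter(index)))

    Inner.__name__ = Inner.__qualname__ = typename
    return Inner


K = namedtuple_sketch('K', 'a b c')
k = K(1, 2, c=3)
print(k, k.b)
```

The generated __new__ keeps the exact positional and keyword signature, 
which is the one part that is awkward to write without exec.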


Here's my fork of Raymond's recipe:

https://code.activestate.com/recipes/578918-yet-another-namedtuple/


Out of curiosity, I took that recipe, updated it to work in Python 3, 
and compared it to the std lib version. Here are some representative 
timings:

[steve at ando ~]$ python3.5 -m timeit -s "from collections import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 1.02 msec per loop

[steve at ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 255 usec per loop


I think that shows this approach is viable and can lead to a big 
speedup: about four times faster here (255 usec versus 1.02 msec).
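
For anyone who wants to repeat the measurement from inside Python rather 
than from the shell, something like the following should do. The helper 
name best_usec is made up for this example, and only the std lib factory 
is timed because nt3 is a local module; any other factory with the same 
(typename, field_names) signature can be passed in its place.

```python
import timeit
from collections import namedtuple


def best_usec(factory, repeat=3, number=1000):
    """Best time, in microseconds, for one call of factory('K', 'a b c')."""
    timings = timeit.repeat(lambda: factory('K', 'a b c'),
                            repeat=repeat, number=number)
    # Like "python -m timeit", report the fastest run, averaged per loop.
    return min(timings) / number * 1e6


print('collections.namedtuple: %.1f usec per class' % best_usec(namedtuple))
```

The absolute numbers will depend on the machine and the Python version, 
so only the ratio between two factories is worth comparing.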

I don't think that merely dropping the _source attribute will save much 
time. It might save a bit of memory, but in my experiments dropping it 
only saves about 10µs more. I think the real bottleneck is the cost of 
exec'ing the entire class.
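
A rough way to check that claim is to time exec on a whole class 
definition against exec on the __new__ function alone. The two source 
strings below are hand-written stand-ins, not the actual template from 
collections, and usec_per_exec is a hypothetical helper.

```python
import timeit

# Stand-in for a fully generated class (not the real std lib template).
CLASS_SOURCE = '''
class K(tuple):
    __slots__ = ()
    _fields = ('a', 'b', 'c')
    def __new__(_cls, a, b, c):
        return tuple.__new__(_cls, (a, b, c))
    def __repr__(self):
        return 'K(a=%r, b=%r, c=%r)' % self
    a = property(lambda self: self[0])
    b = property(lambda self: self[1])
    c = property(lambda self: self[2])
'''

# Stand-in for the only part that has to be generated.
NEW_SOURCE = '''
def __new__(_cls, a, b, c):
    return tuple.__new__(_cls, (a, b, c))
'''


def usec_per_exec(source, repeat=3, number=1000):
    """Best time, in microseconds, to compile and exec the source once."""
    timings = timeit.repeat(lambda: exec(source, {}),
                            repeat=repeat, number=number)
    return min(timings) / number * 1e6


print('whole class:  %.1f usec' % usec_per_exec(CLASS_SOURCE))
print('__new__ only: %.1f usec' % usec_per_exec(NEW_SOURCE))
```

Since exec is handed a string each time, both figures include the cost of 
compiling the source as well as running it.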


 
-- 
Steve

