performance critical Python features

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Jun 23 20:07:43 EDT 2011


On Fri, 24 Jun 2011 04:00:17 +1000, Chris Angelico wrote:

> On Fri, Jun 24, 2011 at 2:58 AM, Eric Snow <ericsnowcurrently at gmail.com>
> wrote:
>> So, which are the other pieces of Python that really need the heavy
>> optimization and which are those that don't?  Thanks.
>>
>>
> Things that are executed once (imports, class/func definitions) and

You can't assume that either of those things is executed only once. 
Consider this toy example:

def outer(a, b):
    def inner(x):
        return (x*a - b)*(x*b - a) - 1
    return inner(b**2 - a**2)

results = [outer(a, b) for (a, b) in coordinate_pairs()]

The function definition for inner gets executed repeatedly, inside a 
tight loop.

Fortunately Python does optimize this case. The heavy lifting (parsing 
the source of inner, compiling a code object) is done once, when the 
source containing outer is compiled, and the only work done on each call 
is assembling the pieces into a function object, which is fast.
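
One way to check this claim in CPython is to look at the code objects 
directly. The sketch below is a variant of the toy example, changed so 
that outer returns inner itself rather than calling it (that change is 
mine, made only so the function object can be inspected). It relies on 
the CPython detail that nested code objects are stored in co_consts.

```python
import types

def outer(a, b):
    # Same nested definition as in the toy example, but returning the
    # function object itself so that it can be inspected.
    def inner(x):
        return (x*a - b)*(x*b - a) - 1
    return inner

# Code objects stored as constants of outer: these were compiled along
# with outer, before outer was ever called.
nested_code = [c for c in outer.__code__.co_consts
               if isinstance(c, types.CodeType)]

f = outer(1, 2)
g = outer(3, 4)

print(f is g)                        # are the function objects shared?
print(f.__code__ is g.__code__)      # is the compiled code shared?
print(f.__code__ is nested_code[0])  # is it the one stored in outer?
```

Each call to outer builds a new function object (with its own closure 
over a and b), but every one of them wraps the same precompiled code 
object.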

Similarly, imports are so expensive that it makes sense to optimize them. 
A single line like "import module" requires the following work:

- expensive searches of the file system, looking for a module.py file 
  or a module/__init__.py package, possibly over a slow network or 
  inside zip files;
- once found, parse the file;
- compile it;
- execute it, which could be arbitrarily expensive;
- and which may require any number of new imports.

Again, imports are already optimized in Python. Firstly, once a module 
has been imported, the module object is cached in sys.modules so that 
subsequent imports of that same module are much faster: they become 
little more than a key lookup in a dict. Only if that lookup fails does 
Python fall back on the expensive import from disk.
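
The sys.modules cache is easy to observe. The following is a minimal 
sketch, assuming Python 3; the module name demo_cached_mod and the 
temporary directory are made up for the illustration. It writes a 
throwaway module to disk, imports it twice, and checks that the second 
import hands back the cached object instead of re-executing the file.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical throwaway module, created only for this demonstration.
    with open(os.path.join(tmpdir, "demo_cached_mod.py"), "w") as fh:
        fh.write("VALUE = 1\n")

    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()  # make sure the new file is noticed
    try:
        first = importlib.import_module("demo_cached_mod")
        first.VALUE = 99           # mutate the module object in place
        second = importlib.import_module("demo_cached_mod")
    finally:
        sys.path.remove(tmpdir)

print(first is second)                           # same module object?
print(sys.modules["demo_cached_mod"] is first)   # is it the cached one?
print(second.VALUE)                              # was the file re-run?
```

If the second import had gone back to disk and executed the file again, 
VALUE would have been reset; because it comes from sys.modules, the 
mutation is still visible.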

Secondly, Python tries to cache the compiled code in a .pyc or .pyo file, 
so that parsing and compiling can be skipped next time you import from 
disk (unless the source code changes, naturally).
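
The bytecode cache can be produced by hand, which shows where it lives. 
This is a sketch assuming a current Python 3, where the cache file goes 
into a __pycache__ directory next to the source (older versions wrote 
the .pyc alongside the .py file). The file name demo_mod.py is made up 
for the illustration.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical source file, created only for this demonstration.
    src = os.path.join(tmpdir, "demo_mod.py")
    with open(src, "w") as fh:
        fh.write("X = 1\n")

    # Compile to bytecode explicitly; a normal import does this for you.
    pyc_path = py_compile.compile(src, doraise=True)

    # Where the import system would look for the cached bytecode.
    expected_path = importlib.util.cache_from_source(src)

    pyc_exists = os.path.exists(pyc_path)

print(pyc_path)
print(expected_path)
print(pyc_exists)
```

On a later import, Python compares the cached file against the source 
and skips the parse and compile steps if the cache is still valid.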

And even so, importing is still slow. That's the primary reason why 
Python is not suitable for applications where you need to execute lots of 
tiny scripts really fast: each invocation of the interpreter requires a 
whole lot of imports, which are slow the first time.
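
To get a rough feel for how much importing happens before a script even 
starts, you can ask a fresh interpreter how many modules it has already 
loaded. This is only a sketch, assuming Python 3.7 or later; the number 
varies by version, platform and site configuration, so none is quoted 
here.

```python
import subprocess
import sys

# Start a brand-new interpreter and have it report how many modules
# are already in sys.modules before any user code has imported anything
# beyond sys itself.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(len(sys.modules))"],
    capture_output=True,
    text=True,
    check=True,
)
startup_module_count = int(result.stdout)

print(startup_module_count)
```

Every one of those modules has to be located and loaded on each 
invocation, which is the cost being paid when many tiny scripts are run 
one after another.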

(Still, Python's overhead at startup time is nowhere near as expensive as 
that of Java... but Java is faster once started up.)


-- 
Steven


