[Numpy-discussion] NumPy 1.2.0b2 released

Andrew Dalke dalke at dalkescientific.com
Fri Aug 15 12:41:26 EDT 2008


On Aug 15, 2008, at 4:38 PM, Pauli Virtanen wrote:
> I think you can still do something evil, like this:
>
>     import os
>     if os.environ.get('NUMPY_VIA_API', '0') != '0':
>         from numpy.lib.fromnumeric import *
>         ...
>
> But I'm not sure how many milliseconds must be gained to justify  
> this...

I don't think it's enough.  I don't like environment
variable tricks like that.  My import-time tests suggest:
   current SVN: 0.12 seconds
   my patch: 0.10 seconds
   removing some top-level imports: 0.09 seconds
   my patch and removing some
      additional top-level imports: 0.08 seconds (this is a guess)


First, I reverted my patch, so my import time went from
0.10 seconds to 0.12 seconds.
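
The post does not say how these numbers were measured.  A minimal
sketch of one way to get comparable figures, assuming you time a fresh
interpreter for each run so that nothing is already cached in
sys.modules (the function names here are made up for illustration):

```python
import subprocess
import sys
import time


def time_fresh_import(module_name, repeats=5):
    """Return the best wall-clock time, in seconds, needed to start a
    new interpreter and import module_name."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process guarantees the module is not already imported.
        subprocess.run(
            [sys.executable, "-c", "import " + module_name],
            check=True,
        )
        best = min(best, time.perf_counter() - start)
    return best


def import_cost(module_name, repeats=5):
    """Estimate the import cost by subtracting bare interpreter startup.

    "sys" is always loaded at startup, so importing it costs almost
    nothing and serves as the baseline.
    """
    baseline = time_fresh_import("sys", repeats)
    return time_fresh_import(module_name, repeats) - baseline
```

Calling import_cost("numpy") before and after a change gives a rough
comparison.  Differences of 0.01 seconds are close to the noise level,
so taking the best of several runs matters.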

Second, I commented out the pure module imports from numpy/__init__.py

     import linalg
     import fft
     import random
     import ctypeslib
     import ma
     import doc

The import time dropped to 0.089 seconds.  Note that my patch also
gets rid of "import doc" and "import ctypeslib", which
take up a good chunk of time.  The fft, linalg, and
random libraries take 0.002 seconds each, and ma takes 0.007.


Skipping these imports makes "import numpy" about 0.01 seconds
faster than my patch does, and my patch shaved off 0.02 seconds.
That extra 0.01 seconds comes from not importing the
fft, linalg, and ma modules.

My patch does improve things in a few other places, so
perhaps those other places save another 0.01 seconds.


Why can't things be better?  Take a look at the slowest
imports.  (Note: each time includes the module's children.)

== Slowest (including children) ==
0.089 numpy (None)
0.085 add_newdocs (numpy)
0.079 lib (add_newdocs)
0.041 type_check (lib)
0.040 numpy.core.numeric (type_check)
0.015 _internal (numpy.core.numeric)
0.014 numpy.testing (lib)
0.014 re (_internal)
0.010 unittest (numpy.testing)
0.010 numeric (numpy.core.numeric)
0.009 io (lib)

Most of the time is spent importing 'lib'.
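
The tool that produced the listing above is not shown.  A rough sketch
of how an inclusive timing like it could be collected, assuming you
temporarily wrap the built-in __import__ (profile_imports is a
hypothetical helper, not part of NumPy):

```python
import builtins
import time


def profile_imports(statement):
    """Execute an import statement and return a list of
    (seconds, name, parent) tuples, slowest first.

    Each time is inclusive: it covers everything the module imports in
    turn.  Modules already in sys.modules show up with near-zero times.
    """
    original_import = builtins.__import__
    records = []
    stack = []  # names of the imports currently in progress

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        parent = stack[-1] if stack else None
        stack.append(name)
        start = time.perf_counter()
        try:
            return original_import(name, globals, locals, fromlist, level)
        finally:
            elapsed = time.perf_counter() - start
            stack.pop()
            records.append((elapsed, name, parent))

    builtins.__import__ = timed_import
    try:
        exec(statement, {})
    finally:
        # Always restore the real import function.
        builtins.__import__ = original_import
    return sorted(records, key=lambda record: record[0], reverse=True)
```

Run in a fresh interpreter, profile_imports("import numpy")[:10] would
give the ten slowest entries; printing each as
"%.3f %s (%s)" % (seconds, name, parent) mimics the layout above.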

Can that be made quicker?  Not easily.  "lib" is
first imported in "add_newdocs".  Personally, I
want to get rid of add_newdocs and move the
docstrings into the correct locations.

Stubbing the function out by adding

def add_newdoc(*args): pass

to the top of add_newdocs.py saves 0.005 seconds,
but if you try it out and remove the "import lib"
from add_newdocs.py, then you'll have to fix a
cyclical dependency.

numpy/__init__.py: import core
numpy/core/__init__.py: from defmatrix import *
numpy/core/defmatrix.py: from numpy.lib.utils import issubdtype
numpy/lib/__init__.py: from type_check import *
numpy/lib/type_check.py: import numpy.core.numeric as _nx
AttributeError: 'module' object has no attribute 'core'

The only way out of the loop is to have numpy/__init__.py
import lib before importing core.
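
To make the loop explicit, here is a small sketch that encodes the
chain above as a graph and searches it for a cycle.  The edges are a
simplification taken from the five lines listed: importing
numpy.lib.utils is treated as importing the numpy.lib package, and
importing numpy.core.numeric as needing numpy.core.  IMPORTS and
find_cycle are names invented for this example.

```python
# module -> modules it imports, as read from the chain above
IMPORTS = {
    "numpy": ["numpy.core"],
    "numpy.core": ["numpy.core.defmatrix"],
    "numpy.core.defmatrix": ["numpy.lib"],      # via numpy.lib.utils
    "numpy.lib": ["numpy.lib.type_check"],
    "numpy.lib.type_check": ["numpy.core"],     # via numpy.core.numeric
}


def find_cycle(graph, start):
    """Return the first import cycle reachable from start as a list of
    module names (the first name is repeated at the end), or None."""
    path = []     # modules whose import is still in progress
    done = set()  # modules fully explored without finding a cycle

    def visit(node):
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for child in graph.get(node, []):
            cycle = visit(child)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    return visit(start)


print(" -> ".join(find_cycle(IMPORTS, "numpy")))
```

The cycle starts and ends at numpy.core: it is still being initialized
when type_check asks for it, which is where the AttributeError comes
from.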

It's possible to clean up the code so this loop
doesn't exist, and fix things so that fewer things
are imported when some environment variable is set,
but it doesn't look easy.  Modules depend on other
modules a bit too much to make me happy.


				Andrew
				dalke at dalkescientific.com




