[SciPy-Dev] Breaking up scipy.stats or How to avoid importing the kitchen sink (when we are not in the kitchen)

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Feb 16 10:23:11 EST 2011

Warren's thread on scipy's subpackages made me realize that we can
break up the imports in scipy.stats in a backward compatible way.

"from scipy import stats" is slow unless scipy is already in the disk cache

len(beforenp), len(beforesp), len(beforestats), len(after)
125 261 341 569
>>> 569 - 341
228

e.g. who imports scipy.sparse, if there is no sparse code in scipy.stats?
If I only want to use some tests, then all I need is scipy.stats.stats
and scipy.special
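
To check what a given import really costs, it helps to run it in a fresh
interpreter so that nothing is already loaded. A minimal sketch, assuming a
current Python 3; the helper names count_new_modules and time_import are
made up for illustration, and the timing includes interpreter startup:

```python
import subprocess
import sys
import time


def count_new_modules(statement):
    """Run `statement` in a fresh interpreter; return how many modules it added."""
    probe = (
        "import sys\n"
        "before = set(sys.modules)\n"
        "exec({stmt!r})\n"
        "print(len(set(sys.modules) - before))\n"
    ).format(stmt=statement)
    out = subprocess.check_output([sys.executable, "-c", probe])
    # The count is the last line printed, in case the statement prints too.
    return int(out.decode().strip().splitlines()[-1])


def time_import(statement):
    """Wall-clock seconds to start a fresh interpreter and run `statement`."""
    start = time.time()
    subprocess.check_call([sys.executable, "-c", statement])
    return time.time() - start
```

With scipy installed, count_new_modules("from scipy import stats") can be
compared with count_new_modules("import scipy.special"). The first
time_import call after a reboot should show the cold disk cache effect.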


Proposal: keep scipy.stats as the public API subpackage that imports
everything, especially for interactive work

move all modules from scipy.stats into another directory, scipy.stats_
or scipy.statslib or something:
  keep its __init__.py empty
  create API one level down
  - stats_basic (name?): current stats.stats plus tests from morestats;
    imports only scipy.special
  - stats_other (name?): rest of morestats and other extras (plots,
    boxcox, ...)
  - mstats
  - kde
  - distributions: imports the kitchen sink
    no lazy imports possible because distributions are instances and
not just classes

then we can do
"from scipy.statslib import stats_basic"
and we get the ttests with an import of scipy.special plus one module
instead of plus 215 modules.
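
The layout can be tried out on a throwaway package. This is only a sketch
assuming a current Python 3: demo_stats, demo_statslib and ttest_placeholder
are invented stand-ins, not real scipy names. It shows that an empty
__init__.py in the implementation package lets a single module be imported on
its own, while the API package still pulls in everything:

```python
import importlib
import os
import sys
import tempfile
import textwrap


def write(path, source=""):
    """Create parent directories as needed and write a small source file."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(textwrap.dedent(source))


root = tempfile.mkdtemp()

# Implementation package: empty __init__.py, so nothing is imported implicitly.
write(os.path.join(root, "demo_statslib", "__init__.py"))
write(os.path.join(root, "demo_statslib", "stats_basic.py"), """
    def ttest_placeholder():
        return "basic"
""")
write(os.path.join(root, "demo_statslib", "distributions.py"), """
    # Stand-in for the heavy module: instances are created at import time.
    norm = object()
""")

# API package: re-exports everything, as scipy.stats would keep doing.
write(os.path.join(root, "demo_stats", "__init__.py"), """
    from demo_statslib.stats_basic import *
    from demo_statslib.distributions import *
""")

sys.path.insert(0, root)
importlib.invalidate_caches()

from demo_statslib import stats_basic
# Only the one module was loaded; distributions is still untouched.
light = "demo_statslib.distributions" in sys.modules

import demo_stats
# The API package imports the kitchen sink.
heavy = "demo_statslib.distributions" in sys.modules
```

Here light is False and heavy is True, and demo_stats.ttest_placeholder is
still reachable through the API package, so existing user code would not
have to change.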

This is currently just an idea, and I won't pursue it further if we
don't want to go this way.


I don't understand some things about the imports: why do I get some
distutils and enthought modules with the stats import? (I don't
understand the lazy import machinery.)
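
One way to find out which module drags in an unexpected import is to wrap
the builtin __import__ temporarily and record who asked for each newly
loaded module. A rough sketch, assuming a current Python 3; trace_imports is
a made-up helper, and it only sees imports that go through import
statements:

```python
import builtins
import sys


def trace_imports(statement):
    """Execute `statement`; map each newly loaded module to the name of the
    module whose import statement first caused it to load."""
    original_import = builtins.__import__
    first_importer = {}

    def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
        importer = (globals or {}).get("__name__", "?")
        before = set(sys.modules)
        module = original_import(name, globals, locals, fromlist, level)
        # Nested imports return first, so the innermost importer wins.
        for new in set(sys.modules) - before:
            first_importer.setdefault(new, importer)
        return module

    builtins.__import__ = recording_import
    try:
        exec(statement, {"__name__": "__main__"})
    finally:
        # Always restore the real import function.
        builtins.__import__ = original_import
    return first_importer
```

Running trace_imports("from scipy import stats") in a fresh interpreter and
looking up the distutils or enthought entries should name the module that
requested them.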

statsmodels just switched to separating API from package imports.

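import sys, copy

# Snapshot sys.modules before and after each import to see what it pulls in.
beforenp = copy.copy(sys.modules)
import numpy
beforesp = copy.copy(sys.modules)
import scipy
beforestats = copy.copy(sys.modules)
from scipy import stats
after = copy.copy(sys.modules)

print('len(beforenp), len(beforesp), len(beforestats), len(after)')
print('%d %d %d %d' % (len(beforenp), len(beforesp), len(beforestats),
                       len(after)))

from pprint import pprint
# Top-level packages and subpackages added by "from scipy import stats"
pprint(sorted(set(('.'.join(i.split('.')[:2]) for i in
                   set(after)-set(beforestats)))))

# Same listing for what "import scipy" adds on top of numpy
##pprint(sorted(set(('.'.join(i.split('.')[:2]) for i in
##                   set(beforestats)-set(beforesp)))))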

>python -i stats_imports.py
len(beforenp), len(beforesp), len(beforestats), len(after)
125 261 341 569

