[SciPy-Dev] Question about subpackage/submodule API

Ralf Gommers ralf.gommers at googlemail.com
Wed Mar 9 00:17:58 EST 2011


On Sun, Feb 20, 2011 at 3:33 PM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
> On Tue, Feb 15, 2011 at 7:53 AM, Warren Weckesser
> <warren.weckesser at enthought.com> wrote:
>>
>>
>> On Sat, Feb 12, 2011 at 7:28 PM, Ralf Gommers <ralf.gommers at googlemail.com>
>> wrote:
>>>
>>>
>>> On Sun, Feb 13, 2011 at 5:05 AM, Pauli Virtanen <pav at iki.fi> wrote:
>>>>
>>>> One wild idea to make this clearer could be to prefix all internal sub-
>>>> package names with the usual '_'. In the long run, it probably wouldn't
>>>> be as bad as it initially sounds like.
>>>>
>>> This is not a wild idea at all; I think it should be done. I consider
>>> all modules without a '_' prefix public API.
>>>
>>
>>
>> Agreed (despite what I said in my initial post).
>>
>> To actually do this, we'll need to check which packages have modules that
>> should be private.  These can be renamed in 0.10 to have an underscore,  and
>> new public versions created that contain a deprecation warning and that
>> import everything from the private version.   The deprecated public modules
>> can be removed in 0.11.
>>
>> Some modules will require almost no changes.  For example, scipy.cluster
>> *only* exposes two modules, vq and hierarchy, so no changes are needed.
>> (Well, there is also the module info.py that all packages have.  That should
>> become _info.py--there's no need for that to be public, is there?)
>
> Agreed, rename to _info.py

This can't be done very easily, because the info.py name is
hardcoded in PackageLoader in numpy/_import_tools.py. So if we want
this, it has to be changed in numpy first.


>> Other packages will probably require some discussion about what modules should be
>> public.
>>
>> Consider the above a proposed change for 0.10 and 0.11--what do you think?
>>
> Sounds good. Attached is a file that goes through scipy sub-packages
> and checks their __all__ for modules. Those are public by definition
> (but this doesn't give you the whole API). It's pretty messy, for
> example:
>
> signal
> ======
> bsplines
> filter_design
> fir_filter_design
> integrate
> interpolate
> linalg
> ltisys
> np
> numpy
> optimize
> scipy
> signaltools
> sigtools
> special
> spline
> types
> warnings
> waveforms
> wavelets
> windows
>
> That should be cleaned up. Then there are also public modules that
> don't show up of course (for example odr.models).
>
> How about doing the following? :
> 1. Start a doc, perhaps on the wiki, with a full list of public modules.
> 2. Put that doc at the beginning of the reference guide, as well as
> the relevant part in the docstring for each sub-package.
> 3. Clean up existing __all__, and add __all__ to sub-packages that
> don't have them yet.
> 4. Rename private modules, with suitable deprecation warning.

I've done the uncontroversial part (3):
https://github.com/rgommers/scipy/tree/refactor-private-modules
This cleans up the sub-package namespaces quite a bit, which also
makes tab-completion in IPython, for example, easier to use:
sp.signal.<TAB> now gives 147 results instead of 262.
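
To illustrate the kind of cleanup meant in (3), here is a minimal sketch,
not the code from the branch. It builds a stand-in namespace (`fakepkg`,
`hann`, and both helper functions are made-up names for illustration),
lists the module objects that leak into it, and shows how an explicit
__all__ narrows what a star import would pick up.

```python
import types
import warnings


def leaked_modules(namespace):
    """Return sorted public names in `namespace` that are module objects."""
    return sorted(
        name for name, obj in vars(namespace).items()
        if isinstance(obj, types.ModuleType) and not name.startswith('_')
    )


def exported_names(namespace):
    """Return the names that `from namespace import *` would bring in."""
    if hasattr(namespace, '__all__'):
        return sorted(namespace.__all__)
    return sorted(n for n in vars(namespace) if not n.startswith('_'))


# Hypothetical stand-in for a sub-package such as scipy.signal.
pkg = types.ModuleType('fakepkg')
pkg.warnings = warnings      # stdlib module imported for internal use
pkg.types = types            # another leaked import
pkg.hann = lambda n: n       # the only thing meant to be public

leaked = leaked_modules(pkg)            # the imports that leaked
before = exported_names(pkg)            # no __all__: every public name

# Step (3): declare the public names explicitly.
pkg.__all__ = ['hann']
after = exported_names(pkg)             # only what __all__ lists

print(leaked, before, after)
```

Note that __all__ only controls star imports; names such as `warnings`
are still reachable as attributes unless the import itself is removed or
renamed with an underscore.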

There is one thing I wasn't sure about: should
arccos/arccosh/arcsinh/... stay exposed in the scipy.special
namespace, even though they are numpy functions?

Here is a complete list of modules that I think are part (or should be
part) of the public API. I included a module if it is documented as
public, if it contains useful functions/objects that are not exposed
one level up, or if the sub-package namespace is very large and could
benefit from a subdivision.

cluster
=======
vq
hierarchy

constants
=========

fftpack
=======

integrate
=========
vode

interpolate
===========
dfitpack

io
==
arff
idl
matlab
mmio
netcdf
wavfile

linalg
======
calc_lwork
cblas
clapack
fblas
flapack
flinalg
lapack
special_matrices

maxentropy
==========
<deprecate this module>

misc
====
doccer
pilutil

ndimage
=======
filters
fourier
interpolation
io
measurements
morphology

odr
===
models
odrpack

optimize
========

signal
======
bsplines
filter_design
fir_filter_design
ltisys
spectral
spline
waveforms
wavelets
windows

sparse
======

sparse.linalg
=============
umfpack

spatial
=======
distance

special
=======

stats
=====
distributions
mstats
<add more after refactoring; Josef wants to break up the module. I agree
it's too large and slow to import.>

weave
=====
<didn't look at weave yet>
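
For modules that do not end up on this list, step (4) follows the scheme
Warren proposed: rename the module with a leading underscore, and keep the
old name for one release as a thin module that warns and re-exports
everything. Below is a rough sketch of that shim, assuming the private
module defines __all__. The names `fakepkg`, `_helpers`, `double`, and
`make_deprecated_shim` are invented for the example. In a real package the
shim would simply be the body of the old public .py file (a warnings.warn
call followed by a star import from the underscore module).

```python
import types
import warnings

# Stand-in for the renamed private module (hypothetical names).
_helpers = types.ModuleType('fakepkg._helpers')
_helpers.double = lambda x: 2 * x
_helpers.__all__ = ['double']


def make_deprecated_shim(old_name, private_module):
    """Build the old public module: warn, then re-export the private names."""
    warnings.warn(
        "%s is deprecated and will be removed in a later release" % old_name,
        DeprecationWarning,
        stacklevel=2,
    )
    shim = types.ModuleType(old_name)
    for name in private_module.__all__:
        setattr(shim, name, getattr(private_module, name))
    shim.__all__ = list(private_module.__all__)
    return shim


# Capture the warning so the example shows that it is actually raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    helpers = make_deprecated_shim('fakepkg.helpers', _helpers)

print(caught[0].category.__name__, helpers.double(3))
```

Old code that imports the public name keeps working for one release and
sees a DeprecationWarning; removing the shim afterwards is a one-file
deletion.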


Cheers,
Ralf


