[SciPy-dev] Technology Previews and Sphinx Docstring Annotations

Tue Nov 4 20:57:18 EST 2008

2008/11/4 David Cournapeau <cournape at gmail.com>:
> On Wed, Nov 5, 2008 at 7:34 AM, Anne Archibald
> <aarchiba at physics.mcgill.ca> wrote:
>> 2008/11/4 Travis E. Oliphant <oliphant at enthought.com>:
>>> Jarrod Millman wrote:
>>>> I absolutely agree with the ideas presented about scikits and look
>>>> forward to seeing the numerous scikits improvements.  I feel that I
>>>> have gotten into a discussion where the counterargument to what I am
>>>> proposing is something I strongly support.  I also feel that the
>>>> counterargument doesn't directly address my concern; but it may be
>>>> that I am simply perceiving a problem that no one else believes
>>>> exists.
>>>>
>>> Let me make my point again.   I'm arguing that instead of scipy.preview,
>>> let's just make a *single* scikit called scikit.preview or
>>> scikit.forscipy or scikit.future_scipy or whatever.    This will create
>>> some incentive to make scikits easier to install generally as we want to
>>> get the future_scipy out there being used.
>>>
>>> I'm very interested, though, to hear what developers think who have
>>> modules they would like to see in SciPy that have not made it there yet.
>>>
>>> I'm very interested in the question of how we make it easier to
>>> contribute to SciPy.
>>
>> As a developer who has written the module that is sparking this
>> discussion, if the route to inclusion in scipy were "make a scikit,
>> maintain and distribute it until you get enough user feedback to judge
>> whether the API is optimal, then move it fully-formed into scipy" my
>> code would simply gather dust and never be included. I don't have the
>> time and energy to maintain a scikit.
>
> That's what I don't understand: there is almost no difference between
> maintaining a scikit and a scipy submodule. In both cases you have to
> write some setup.py + the module itself. To get the sources, it is
> scikit vs scipy svn. Both Damian and you made this case, so I would
> like to understand what's so different from your POV, because I just
> don't get it ATM. Maybe there is some confusion about how a scikit can
> be made and distributed (the documentation could certainly be
> improved).

A scipy submodule will be distributed to users. It has a bug tracker,
and other people who read the bug tracker, and can possibly fix minor
bugs. Users can find it. It gets built and tested on all relevant
architectures. The unit tests get rerun whenever some piece of scipy
changes. I don't have to do *any* of that. If something goes wrong,
okay, I can go in and try to fix it, but other than that I can leave
the working code as it is.

If it were a scikit I would have to scrounge build machines, I would
have to rerun the unit tests every time some part of scipy I depend on
changes, I would have to set up a bug tracker, and I would have to
publicize it. Moreover, it makes things difficult for other packages: how are
dependencies between scikits handled? Is there a way to automatically
download and install all scikits a package depends on?

And: how many scikits have ever been incorporated in scipy?

> Having a scikit also means that if you are willing to do it, you can
> easily build binary installers and source distributions *in one
> command*. You can't do that with scipy, which won't change for the
> foreseeable future. And you don't need to care about breaking scipy.

I can't build binary distributions at all; I don't have access to (for
example) Windows machines. And scipy.preview is intended for code
stable enough that breaking scipy is not a concern (i.e., not more of
a problem than for the rest of scipy).

>> The question is really, how do we take tested, apparently
>> production-ready code, and get it out there so users can get at it?
>> The current approach is "put it in scipy and live with the API".
>
> Not exactly: the "live with the API" case has been made for features
> which have been in scipy for years, that many people depend on.
>
> Also, I can't help noticing that in both Damian's and your case, what
> happened is not what scipy.preview is about, but putting code directly
> in scipy. And I also think the process of scipy.preview does not scale
> much. It worked in your case, but will it work if many people want to
> put code in scipy?

If scipy.preview had existed, I would have put my code there. In fact,
if someone creates it, my code will probably be the first to be moved
there. I think my code is useful as is, and I don't think I'll need to
change the API of what's there, but when we start seeing users it may
be a different story. (Specifically, I think there will be a demand
for annotated kdtrees and some way to efficiently implement custom
tree traversal.)

Anne