[SciPy-dev] SciPy Foundation

Joe Harrington jh at physics.ucf.edu
Sun Aug 16 10:40:51 EDT 2009


I've finally had time to look at all the replies to this thread.
There were dozens, so rather than quoting and responding to everyone
individually, I'll summarize.  The short version is that due to an
early misunderstanding, we spent a lot of bandwidth generating
agreement that masqueraded as dissent!  In the end, I think we have
general agreement (and no specific dissent) to the idea of an
organization dedicated to development of scientific tools in Python
and gathering and disbursing funds to that end.  We even agree on our
major priorities.  So, I propose that we move forward with planning.
There's a BoF proposal at the end.


Here's the longer version:

1. Objection: The mission statement stuffs too much into one package.
The scipy package doesn't need a GUI!  (Long post by Gael 2009-08-01
22:52:16, shorter one by Robert 2009-08-04 19:37:01, many others.)

My apologies to these fine gentlemen and others who took part in this
threadlet, but this was a bit of a bandwidth waster, since I started my
proposed mission statement with "(The toolstack)", not "SciPy" or
"scipy".  Of course nobody would go to such lengths just for one
package, nor propose stuffing so much into it that exists elsewhere
and is in wide use already (GUIs, interactive shells, etc.).  We're
talking broadly about scientific use of Python.

Robert proposed a name change to avoid such ambiguity.  SciPython?
SciPyStack? Py4Sci?  Scientific Python is taken.  I really prefer
SciPy, as it has branding already, but perhaps SciPyStack is ok
informally.  I think we're stuck with SciPy for formal docs, web site,
etc, just like JPL (which has not studied the propulsion of jets or
rocket engines for decades).  What this means is:

a. POSTERS BE CLEAR: specify package or toolstack when you talk about
scipy.  Use "SciPy" for the toolstack and "scipy" for the package, but
don't rely on that alone.  (note: I did this!)

b. RESPONDENTS BE CAREFUL: double-check what the poster wrote before
replying if it's about "scipy" or "SciPy".


2. It's important for the package structure to be light.

Yes!  I am not proposing to change the package structure at all.
People need to be able to pick and choose, and it needs to be light
for many reasons, such as OLPC.

However, as a practical matter, I know of *nobody* who is a heavy user
and who does not install a significant number of packages.  We install
about 15 python-related packages now for our group.  It has become a
nightmare that takes my very experienced system manager, an Ubuntu
developer with a PhD in computer science, several days.  Basically, if
you want everything current (e.g., to get recent docs in numpy, or HDF
libraries that actually work), it is hard to do a consistent build
without doing a lot of patching.  Clearly, most potential users cannot
tolerate that, or even do it.

So, I would like to see packaging *coordination* such that a
monolithic install is as trivial for the user as it is to install one
package.  From my discussions with hundreds of users who are sitting on
the sidelines in my discipline alone, this and docs are essentially
what they are waiting for.  Done right, I think most of the relevant
package authors would welcome the opportunity to coordinate (but I
don't speak for them).  Exactly what and how is a matter to discuss,
but let's get the overall project structure settled first.


3. This is going to be a lot of work, particularly IDEs and GUIs!  I
don't want to burn out or hurt my career.

People should not burn out or hurt their careers on service projects!
It's the first rule of academia.  There are tons of workers who would
happily contribute small bits if the work were served in nice-sized
chunks and integrated by someone when finished.  There are lots more willing
to work for pay, or even partial pay.  This proposal is a way of
moving to that model, which might also be called "many hands make
light work".  I think the doc project proves the viability of the
paid-coordinator model.  For IDEs and GUIs, there are good starts
already.  With enough momentum, we can directly fund development to
provide something better.


4. Why not use Sage/EPD/etc.?

Those solve the monolithic packaging problem, usually inelegantly, but
that's the only way to do it today.  There is plenty broken in our own
house before we even get to the monolithic packaging problem, like
missing documentation, code cleanups, API
stabilization/rationalization, and getting packages to build together
for all platforms.

Once that's done, Sage's and our goals might well merge.  Still, Sage
has its own focus, and it is not scientific modeling and data
analysis.  EPD focuses on Windows.  My ideal would be that our
much-improved packaging makes rolling a monolithic distro for a
particular purpose much easier, in some cases as easy as publishing a
meta-package that pulls in what you want as dependencies.  Then STScI
can release an astronomy distro, someone else can release a
neuroscience distro, and Sage can release their thing for math, all
benefitting from a toolstack that builds cleanly together.
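
As a rough illustration of the meta-package idea in item 4 (a sketch
only, not an actual packaging tool): a meta-package ships no code of
its own and simply names its dependencies, which may themselves be
meta-packages.  The registry, the distro names, and the package lists
below are all hypothetical, chosen only to show how a field-specific
distro could reuse a shared core.

```python
# Hypothetical sketch: a "meta-package" is just a name plus a list of
# dependencies.  Every name and package list here is made up.
META_PACKAGES = {
    "scistack-core": ["numpy", "scipy", "matplotlib", "ipython"],
    "astro-distro": ["scistack-core", "pyfits"],
    "neuro-distro": ["scistack-core", "nipy"],
}


def resolve(name, registry=META_PACKAGES, _seen=None):
    """Expand a meta-package into a flat, ordered list of real packages."""
    seen = set() if _seen is None else _seen
    if name in seen:          # already handled (also guards against cycles)
        return []
    seen.add(name)
    if name not in registry:  # a real package, not a meta-package
        return [name]
    result = []
    for dep in registry[name]:
        result.extend(resolve(dep, registry, seen))
    return result


print(resolve("astro-distro"))
```

The point of the design is that the astronomy and neuroscience entries
differ by a single line each; everything hard lives in the shared core
building cleanly.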


5. Packaging is hard.  What we need is (long description of packaging
needs)...

What we need is fully-automated builds that populate PPAs on all
platforms for every version and a nightly snapshot of every package,
and tests run nightly that show they still work together.  There is a
tool that does this.  It was funded by the US National Science
Foundation and is required for applicants to many of their grant
programs.  It is called Metronome, formerly the NMI Build and Test Suite:

http://nmi.cs.wisc.edu/

At last count, they build on 46 platforms.  Getting there will be
hard.  That is what money is for.

Story: When I was a freshman in 1984, there was a free student
computing system at MIT called Multics that was run by a student
group.  Your account had a certain amount of "money", which it charged
for CPU usage and printing.  When you ran out, you asked for more
"money" and got it for free.  There was a sign on the door to the
group's office that said, "If you need more money, use the
request-extension command."  But, it was done with funky colors and
words going every which way, and to me it initially read, "If you need
more, use money, the request-extension command."  I've looked at money
in a different light ever since...
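
To make the nightly "do they still work together" test in item 5 a
little more concrete, here is a minimal sketch of an import smoke
test.  It is far simpler than Metronome: it assumes the packages are
already installed on one machine and only checks that each one
imports, recording its version.  The function name check_stack and the
package list are illustrative assumptions, not part of any existing
tool.

```python
import importlib


def check_stack(module_names):
    """Try to import each module; return {name: version or error string}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # a broken build can fail in many ways
            report[name] = "FAILED: %s: %s" % (type(exc).__name__, exc)
        else:
            # Not every module defines __version__, so fall back gracefully.
            report[name] = getattr(module, "__version__", "unknown version")
    return report


if __name__ == "__main__":
    # Illustrative list; a real nightly job would read it from configuration.
    stack = ["numpy", "scipy", "matplotlib"]
    for name, status in sorted(check_stack(stack).items()):
        print("%-12s %s" % (name, status))
```

A real nightly run would go further and execute each package's own
test suite against the freshly built versions of the others, on every
supported platform.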


Gael Varoquaux 2009-08-01 22:52:16 GMT writes:

> Specifically, I would love to see an official umbrella project for
> BSD-licensed tools for building scientific projects with Python. As the
> "scipy" name is well branded (through the website, and the conference),
> we could call this the 'scipy project'. I would personally like to limit
> wheel reinvention and have preferred solutions for the various bricks (I
> am thinking of the unfortunate Chaco versus Matplotlib situation, where I
> have to depend on both libraries that complement each other). 

This is exactly what I am proposing.  Pretty much everything else in
the message was based on a misunderstanding of my intent about
package vs. toolstack.  I would not limit it to BSD-licensed tools,
but would want that to continue to be a requirement for the core
stuff, and likely for grants we would write.  In other words, if a
benefactor came along wanting to give some cash to a field-specific
project that was under GPL, fine, I'd be glad to funnel their money to
the developers.

> first, as Robert points out,
> telling somebody what to do will not achieve anything. I am already way
> too busy scratching my own itches. 

It works well if you are paying them.  What is amazing (witness the
doc project) is that if just one person is paid to organize an area,
lots of people flock to the project and pitch in doing small tasks.
Not everyone.  Not even most people.  But enough.  That is what the
funding is for.  It's the request-extension command!  Specifically,
extension of effort on the part of someone who would otherwise find
other uses for their time.

> Second, who will find the time to take care of this?

I've been doing it since Spring 2008 for the doc project.  Hopefully a
few others will join me so we can write some grants, start a funding
organization, and launch something more permanent and far-reaching.

I've proposed a BoF on this topic.  I immodestly think it could be the
most important of the meeting.  I propose Thursday at 8:30 (I think
that 2.5 hours for dinner is too much and that we should start the
BoFs much earlier, like at 7:30, so we can do an early and a late set
of BoFs.  I'm not sure what the reception is.  Is it dinner?  Or just
a delay in the start of dinner?).  Alternatively, we can do it Friday
over lunch, though that depends on getting some box lunches.  Proposed
format:

Organization, Funding, and Future Direction of SciPy
(coordinator: Joe Harrington, sergeant-at-arms: David Goldsmith)

I'd like to spend a strict 10 minutes on each of these, cutting off
discussion and moving on after each item. After all 6 items, we can
continue discussion on any item:

    * What are our long-term goals?
    * What are our current strengths and weaknesses?
    * How is our current Steering Committee/grass-roots model working?
    * What would we do with funding? Would it require a change in how
        the community operates? 
    * How can we get funding?
    * In the large, how should we proceed?

--jh--


