[SciPy-dev] Generic polynomials class (was Re: Volunteer for Scipy Project)

Sun Oct 18 01:18:22 EDT 2009

2009/10/17 josef.pktd <josef.pktd at gmail.com>:
> Just to point out that OO design and "framework independence" are
> not necessarily exclusive:
>
> In statsmodels, we still use classes and subclasses to benefit from
> inheritance and OO, but the interface is in plain numpy arrays.
> This way, any data framework or data structure (nitime time series,
> scikits.timeseries, pandas, tabular, ...) or formula framework
> can use the classes, but has to write the conversion code from
> a data frame object to the ndarray design matrix and back again
> themselves.

[...]

I agree with you, and in fact in nitime we obviously do use 'objects'
in the sense that the procedural interface operates on python base
objects like lists, numbers and strings, and numpy arrays.  What I
meant to say, and perhaps wasn't precise enough, was that we wanted
this base layer to not depend on new, custom objects introduced by us.
In this manner, this base functional layer could be used to build new
class hierarchies independent from ours while reusing the base
algorithms.

In terms of dependencies, we wanted our chain to be (I suck at ascii art):

0. python objects, nd arrays

   1.a algorithms
   1.b data containers

       2. Analyzers that need 1a, 1b.
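
To make the layering above a bit more concrete, here is a minimal
sketch.  All names in it (autocorr, UniformSeries, AutocorrAnalyzer)
are made up for illustration and are not nitime's actual API; the
only real dependency is numpy.

```python
import numpy as np


# Layer 1.a: an algorithm that only knows about plain ndarrays.
# (Hypothetical function, for illustration only.)
def autocorr(x, maxlag):
    """Normalized autocorrelation of a 1-d signal for lags 0..maxlag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:x.size - k], x[k:]) / denom
                     for k in range(maxlag + 1)])


# Layer 1.b: a data container; it knows nothing about the algorithms.
# (Hypothetical class, for illustration only.)
class UniformSeries:
    def __init__(self, data, sampling_rate):
        self.data = np.asarray(data, dtype=float)
        self.sampling_rate = float(sampling_rate)


# Layer 2: an analyzer that needs both 1.a and 1.b.
# (Hypothetical class, for illustration only.)
class AutocorrAnalyzer:
    def __init__(self, series, maxlag):
        self.series = series
        self.maxlag = maxlag

    @property
    def lags(self):
        # Lags in time units, using metadata held by the container.
        return np.arange(self.maxlag + 1) / self.series.sampling_rate

    @property
    def values(self):
        # Unwrap the container and delegate to the array-level algorithm.
        return autocorr(self.series.data, self.maxlag)
```

Someone with their own container class can skip layers 1.b and 2
entirely and call autocorr() on whatever ndarray they can extract
from their object.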

> In the nitime case, whether ``algorithms`` uses classes wouldn't
> really matter for the easy usage from outside of neuroimaging as
> long as it doesn't force the user of the algorithms to use the
> nitime timeseries class as the Analyzers in timeseries do.

The point was to have algorithms rely only on objects that would be
familiar and available to anyone already using numpy.  Part of it is a
code dependency issue, part of it is cognitive load: if you need to
learn a lot of new object APIs before you can start using a library,
the cost for you to start using it goes up.

We've found that so far, we can keep the algorithmic library depending
only on numpy/scipy numerical tools.  Avoiding more objects at that
layer at all costs isn't set in stone, but we do want to keep it
simple to use and understand.

We are gradually making the other objects richer, but that is driven
by careful discussion of real use cases: this week we started working
with a Berkeley group that has a lot of code for the analysis of
single-cell recording data, and we're going to grow the object design
to accommodate such uses.  We're very happy about this, because it
means our objects will grow in functionality, with that growth driven
by usage-proven needs.

Design isn't a black-and-white world of rights and wrongs, but rather
a collection of compromises that, when put together successfully,
provide something useful.  We've taken one approach to attempt that,
based on prior experience (especially prior mistakes we want to
avoid), but I'm not trying to say it's the *only* way to do this.
Just providing feedback that may be useful to others facing similar
design questions.

Cheers,

f