Request for feedback on API design

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Dec 9 18:44:35 EST 2010


I am soliciting feedback regarding the API of my statistics module:

http://code.google.com/p/pycalcstats/


Specifically, I would like opinions on the following two issues:

(1) Multivariate statistics such as covariance have two obvious APIs:

    A. Pass the X and Y values as two separate iterable arguments, e.g.:
      cov([1, 2, 3], [4, 5, 6])

    B. Pass the X and Y values as a single iterable of tuples, e.g.:
      cov([(1, 4), (2, 5), (3, 6)])

I currently support both APIs. Do people prefer one, or the other, or 
both? If there is a clear preference for one over the other, I may drop 
support for the other.
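
To make the two call styles concrete, here is a minimal sketch of how a single
function could accept both. It is an illustration only, not the code in
pycalcstats: the parameter names xdata and ydata, the use of the sample
(n - 1) denominator, and Python 3 division are all assumptions.

```python
def cov(xdata, ydata=None):
    """Sample covariance (hypothetical sketch, not the pycalcstats code)."""
    if ydata is None:
        # API B: a single iterable of (x, y) tuples.
        pairs = list(xdata)
    else:
        # API A: two separate iterables of equal length.
        xs, ys = list(xdata), list(ydata)
        if len(xs) != len(ys):
            raise ValueError("x and y must have the same length")
        pairs = list(zip(xs, ys))
    n = len(pairs)
    if n < 2:
        raise ValueError("covariance requires at least two data points")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Assumption: divide by n - 1 (sample covariance).
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)


print(cov([1, 2, 3], [4, 5, 6]))         # API A
print(cov([(1, 4), (2, 5), (3, 6)]))     # API B
```

One cost of supporting both is that the meaning of the first argument depends
on whether a second one is given, which is part of why dropping one style may
be attractive.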


(2) Statistics textbooks often give formulae in terms of sums and 
differences such as

Sxx = n*Σ(x**2) - (Σx)**2

There are quite a few of these: I count at least six common ones, all 
closely related and confusingly named:

Sxx, Syy, Sxy, SSx, SSy, SPxy

(the x and y should all be subscript).
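
For illustration, here is a rough sketch of what three of these might look
like as code. Only the Sxx formula is taken from above; the Syy and Sxy
definitions are assumed by analogy, textbooks differ on the exact
conventions, and the function name textbook_sums is hypothetical.

```python
def textbook_sums(xs, ys):
    """Return (Sxx, Syy, Sxy) under one assumed textbook convention."""
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Sxx = n*Σ(x**2) - (Σx)**2, as given in the text.
    Sxx = n * sum(x**2 for x in xs) - sum_x**2
    # Assumed by analogy with Sxx.
    Syy = n * sum(y**2 for y in ys) - sum_y**2
    # Assumed: Sxy = n*Σ(x*y) - (Σx)*(Σy).
    Sxy = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    return Sxx, Syy, Sxy


print(textbook_sums([1, 2, 3], [4, 5, 6]))
```

Under this convention, Sxx/n is the sum of squared deviations of x from its
mean, which some books call SSx; that identification is itself an assumption
about naming.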

Are they useful, or would they just add unnecessary complexity? Would 
people like to see these included in the package?



Thank you for your feedback.


-- 
Steven


