basic statistics in python

Sat Mar 16 05:19:15 EST 2002

Tim Churches wrote:

> Note that stats.py also returns the 2-tailed p-value as well (which can
> also easily be obtained from R via RPy).
> 
> Tim C

As a side note (not related to the above problem): the R language also
has some peculiarities.

For example:

> data <- c(0.23,1.0023,1.223,1.235,5.6,9.0,23.3456,34.458,34.56,78.9)

> summary(data)

delivers:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.230   1.226   7.300  18.960  31.680  78.900

Everything is what I expected, except the 1st and 3rd quartiles.

At first I could not believe it, so I fired up XLispStat:

(setf data (make-array 10 :initial-contents
  '(0.23 1.0023 1.223 1.235 5.6 9.0 23.3456 34.458 34.56 78.9)))

(quantile data 0.25) and (quantile data 0.75) deliver 1.229 and 28.9018,
respectively.

R gives the same values on Windows and on Unix, so this is not a
platform problem.

Most likely the two programs simply use different methods for
calculating the quantiles.
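
Both pairs of numbers can be reproduced in Python with two different
quantile conventions, which suggests a difference in definition rather
than a miscalculation. The sketch below is only an illustration: the
function names are made up for this example, and the midpoint rule is an
assumption that happens to match the XLispStat output, not a statement
about how XLispStat is implemented. The linear rule matches what R
reports here (R's documentation today calls it type 7).

```python
import math

data = [0.23, 1.0023, 1.223, 1.235, 5.6, 9.0, 23.3456, 34.458, 34.56, 78.9]


def quantile_linear(values, p):
    """Interpolate linearly between the two neighbouring order statistics.

    Reproduces the values printed by R's summary() for this data set.
    """
    xs = sorted(values)
    h = (len(xs) - 1) * p              # fractional 0-based position
    lo, hi = math.floor(h), math.ceil(h)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def quantile_midpoint(values, p):
    """Average the two neighbouring order statistics.

    Assumed rule; it reproduces the XLispStat values for this data set.
    """
    xs = sorted(values)
    h = (len(xs) - 1) * p
    lo, hi = math.floor(h), math.ceil(h)
    return (xs[lo] + xs[hi]) / 2


for p in (0.25, 0.75):
    print(p, round(quantile_linear(data, p), 4),
          round(quantile_midpoint(data, p), 4))
# 0.25 1.226 1.229
# 0.75 31.6799 28.9018
```

With 10 values the 25% point falls at position 2.25 (counting from 0),
between 1.223 and 1.235. The linear rule takes a quarter of the gap, the
midpoint rule takes half of it.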

Personally, I cannot cope with the R language. It is rich in built-in
functions, but most of the time I fail to find what I am looking for.

The graphics are good, but I would always prefer Dislin as long as I do
not need any specialized statistical graphics.

In R one can even read binary files (and even swap the byte order). But
plotting a large array is a pain in the neck. In Dislin it is very fast,
and one can overlay maps (e.g. coastlines) without any problems (even
the x- and y-axis annotation comes out right: -180E....+180W,...).

S. Gonzi