[SciPy-User] calculating the mean for each factor (like tapply in R)

Andreas Hilboll lists at hilboll.de
Wed Aug 1 09:13:04 EDT 2012


> Hi there,
>
> I've just moved from R to IPython and wondered if there is a good way of
> finding the mean and/or variance of values in a data frame for each level
> of a factor.
>
> e.g.:
> if df =
> x		experiment
> 10		1
> 13		1
> 12		1
> 3		2
> 4		2
> 6		2
> 33		3
> 44		3
> 55		3
>
> In R you would use tapply:
>
> tapply(df$x, list(df$experiment), mean)
> tapply(df$x, list(df$experiment), var)
>
> I guess I can always loop through the array for each experiment type, but
> I thought this is the kind of functionality that would be included in a
> core library.

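pandas (http://pandas.pydata.org/) seems to be what you're looking for. Its
DataFrame class has a groupby method that splits the rows by the values of a
column, so that an aggregate such as mean or var can be computed per group.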
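
As a rough sketch of how the two tapply calls might translate, assuming
pandas is installed and the data from the question is held in a DataFrame
named df (the variable names below are only illustrative):

```python
import pandas as pd

# The example data from the question: one value column, one factor column.
df = pd.DataFrame({
    "x": [10, 13, 12, 3, 4, 6, 33, 44, 55],
    "experiment": [1, 1, 1, 2, 2, 2, 3, 3, 3],
})

# Group the x values by experiment, then aggregate each group.
grouped = df.groupby("experiment")["x"]

means = grouped.mean()      # roughly tapply(df$x, list(df$experiment), mean)
variances = grouped.var()   # roughly tapply(df$x, list(df$experiment), var)

print(means)
print(variances)
```

By default var() computes the sample variance (ddof=1), which is also what
R's var does. Both statistics can be obtained in one table with
grouped.agg(["mean", "var"]).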

Cheers, Andreas.



