[SciPy-User] calculating the mean for each factor (like tapply in R)
Andreas Hilboll
lists at hilboll.de
Wed Aug 1 09:13:04 EDT 2012
> Hi there,
>
> I've just moved from R to IPython and wondered if there was a good way of
> finding the mean and/or variance of values in a dataframe for each level
> of a factor.
>
> e.g.:
> if df =
> x experiment
> 10 1
> 13 1
> 12 1
> 3 2
> 4 2
> 6 2
> 33 3
> 44 3
> 55 3
>
> in tapply you would do:
>
> tapply(df$x, list(df$experiment), mean)
> tapply(df$x, list(df$experiment), var)
>
> I guess I can always loop through the array for each experiment type, but
> thought that this is the kind of functionality that would be included in a
> core library.
Pandas (http://pandas.pydata.org/) seems to be what you're looking for. Its
DataFrame class has a groupby method that splits the rows by the values of a
column, so you can compute a statistic per group.
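
As a rough sketch of how the two tapply calls from the question might look
in pandas (assuming pandas is installed; the variable names grouped, means
and variances are only illustrative):

```python
import pandas as pd

# The example data from the question.
df = pd.DataFrame({
    "x": [10, 13, 12, 3, 4, 6, 33, 44, 55],
    "experiment": [1, 1, 1, 2, 2, 2, 3, 3, 3],
})

# Group the x values by experiment, roughly what
# tapply(df$x, list(df$experiment), ...) does in R.
grouped = df.groupby("experiment")["x"]

means = grouped.mean()
# Series.var() defaults to ddof=1 (sample variance), matching R's var().
variances = grouped.var()

print(means)
print(variances)
```

Both results are Series indexed by the experiment value, so means.loc[1]
gives the mean for experiment 1. If you want both statistics in one table,
grouped.agg(["mean", "var"]) should also work.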
Cheers, Andreas.