Questions about mathematical and statistical functionality in Python
Tim Churches
tchur at optushome.com.au
Thu Jun 14 17:54:09 EDT 2007
Michael Hoffman wrote:
> Talbot Katz wrote:
>
>> I hope you'll indulge an ignorant outsider. I work at a financial
>> software firm, and the tool I currently use for my research is R, a
>> software environment for statistical computing and graphics. R is
>> designed with matrix manipulation in mind, and it's very easy to do
>> regression and time series modeling, and to plot the results and test
>> hypotheses. The kinds of functionality we rely on the most are standard
>> and robust versions of regression and principal component / factor
>> analysis, bayesian methods such as Gibbs sampling and shrinkage, and
>> optimization by linear, quadratic, newtonian / nonlinear, and genetic
>> programming; frequently used graphics include QQ plots and histograms.
>> In R, these procedures are all available as functions (some of them are
>> in auxiliary libraries that don't come with the standard distribution,
>> but are easily downloaded from a central repository).
>
> I use both R and Python for my work. I think R is probably better for
> most of the stuff you are mentioning. I do any sort of heavy
> lifting--database queries/tabulation/aggregation in Python and load the
> resulting data frames into R for analysis and graphics.
I would second that. It is not either/or. Use Python, including NumPy
and matplotlib and packages from SciPy, for some things, and R for
others. You can even embed R in Python using RPy - see
http://rpy.sourceforge.net/
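
As a rough illustration of the division of labour described above, here
is a minimal sketch that does the tabulation in Python and hands a small
summary table to R as CSV. It uses only the standard library; the table
name, column names and the tabulate_to_csv helper are made up for the
example, and this is not how NetEpi Analysis or RPy moves data around.

```python
import csv
import io
import sqlite3


def tabulate_to_csv(rows):
    """Aggregate (group, value) rows in SQLite and return the summary as CSV text.

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (grp TEXT, value REAL)")
    conn.executemany("INSERT INTO obs VALUES (?, ?)", rows)

    # The heavy lifting (grouping and aggregation) happens on the Python side.
    cur = conn.execute(
        "SELECT grp, COUNT(*) AS n, AVG(value) AS mean "
        "FROM obs GROUP BY grp ORDER BY grp"
    )

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["grp", "n", "mean"])  # header row becomes R column names
    writer.writerows(cur)
    conn.close()
    return out.getvalue()


csv_text = tabulate_to_csv([("a", 1.0), ("a", 3.0), ("b", 10.0)])
print(csv_text)
```

If the text is written to a file, say summary.csv, it can be loaded on
the R side with read.csv("summary.csv") for the analysis and graphics.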
We use the combination of Python, NumPy (actually the older Numeric
Python package, soon to be converted to NumPy), RPy and R in our
NetEpi Analysis project, which does exploratory epidemiological analysis
of large data sets - see http://sourceforge.net/projects/netepi. It is a
good combination: Python for the Web interface, data manipulation and
data heavy-lifting, and for some of the more elementary statistics, and
R for more involved statistical analysis and graphics (with the option
of using matplotlib or other Python-based graphics packages for some
tasks if we wish). The main thing to remember, though, is that indexing
is zero-based in Python and 1-based in R...
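
To make the indexing point concrete, here is a small sketch of
translating R-style positions into Python ones. The two helper functions
are hypothetical names invented for this example. Note that an R range
such as x[2:4] includes both ends, whereas a Python slice excludes its
upper bound.

```python
def r_index_to_python(i):
    """Convert a 1-based R position to a 0-based Python index."""
    if i < 1:
        raise ValueError("R positions start at 1")
    return i - 1


def r_range_to_slice(start, end):
    """Convert R's inclusive, 1-based start:end to a Python slice.

    Python slices are 0-based and exclude the upper bound, so only the
    start needs shifting down by one.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return slice(start - 1, end)


values = [10, 20, 30, 40, 50]

first = values[r_index_to_python(1)]      # R: values[1]
middle = values[r_range_to_slice(2, 4)]   # R: values[2:4]

print(first)
print(middle)
```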
Tim C