[Rpy] [Fwd: Re: [Numpy-discussion] Possible example application of the array interface]

Warnes, Gregory R gregory.r.warnes at pfizer.com
Wed Apr 6 14:02:05 EDT 2005


Hi All,

It is possible to establish conversion functions so that R data frame, list,
and vector objects are better translated into python equivalents.  I've made
several aborted stabs at this, but my time has been extremely limited.

The basic task is to create a functionally equivalent python class.  [The
tricky bit here is that R list and vector objects have both order and names.
It is possible to emulate this in python by creating a base object that
maintains a dictionary of names alongside the vector/matrix data.]
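
As a rough illustration of the "order plus names" idea, here is a minimal
sketch in plain python.  The class name NamedVector and its methods are
hypothetical; nothing here is part of RPy.  It assumes names are strings and
that positions are 0-based as in python (R counts from 1).

```python
class NamedVector:
    """Hypothetical stand-in for an R vector: ordered values plus optional names."""

    def __init__(self, values, names=None):
        self.values = list(values)
        if names is None:
            names = [None] * len(self.values)
        names = list(names)
        if len(names) != len(self.values):
            raise ValueError("names and values must have the same length")
        self.names = names
        # Dictionary of names kept alongside the data; if a name is
        # duplicated, the first position wins.
        self._index = {}
        for pos, name in enumerate(self.names):
            if name is not None and name not in self._index:
                self._index[name] = pos

    def __len__(self):
        return len(self.values)

    def __iter__(self):
        # Iteration follows the stored order, not the name dictionary.
        return iter(self.values)

    def __getitem__(self, key):
        # Strings look up by name; anything else is treated as a position.
        if isinstance(key, str):
            return self.values[self._index[key]]
        return self.values[key]


v = NamedVector([1.5, 2.5, 3.5], names=["a", "b", "c"])
print(v["b"], v[0], list(v))
```

The values stay in a list so that order is preserved, and the dictionary is
only a lookup aid from name to position.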

See the example in the RPy documentation at
http://rpy.sourceforge.net/rpy/doc/manual_html/DataFrame-class.html#DataFrame%20class
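
Independently of the manual's example (which is not reproduced here), a
data-frame-like container could be sketched along the same lines: named
columns of equal length, kept in a fixed order.  The class DataFrameSketch
and its methods are hypothetical names chosen for illustration, and the
sample data is the col1/col2 example from the quoted message below.

```python
class DataFrameSketch:
    """Hypothetical data-frame-like container: ordered, named, equal-length columns."""

    def __init__(self, columns):
        # `columns` is a sequence of (name, values) pairs, so the column
        # order is stated explicitly by the caller.
        self.colnames = []
        self.data = {}
        nrow = None
        for name, values in columns:
            values = list(values)
            if nrow is None:
                nrow = len(values)
            elif len(values) != nrow:
                raise ValueError("all columns must have the same length")
            if name in self.data:
                raise ValueError("duplicate column name: %r" % (name,))
            self.colnames.append(name)
            self.data[name] = values
        self.nrow = nrow or 0

    def column(self, key):
        # Accept either a column name or a 0-based column position.
        if isinstance(key, int):
            key = self.colnames[key]
        return self.data[key]

    def row(self, i):
        # One row as a list, in column order.
        return [self.data[name][i] for name in self.colnames]


df = DataFrameSketch([("col1", [1, 2, 3, 4]),
                      ("col2", ["one", "two", "three", "four"])])
print(df.colnames, df.row(1))
```

Taking (name, values) pairs rather than a dictionary is a deliberate choice
in this sketch: it keeps the column order independent of how the dictionary
happens to order its keys.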

This shouldn't be very hard if someone can dedicate a bit of time to it.

-Greg
(Current RPy maintainer)



> -----Original Message-----
> From: rpy-list-admin at lists.sourceforge.net
> [mailto:rpy-list-admin at lists.sourceforge.net]On Behalf Of Tim Churches
> Sent: Wednesday, April 06, 2005 4:22 PM
> To: rpy-list at lists.sourceforge.net
> Subject: [Rpy] [Fwd: Re: [Numpy-discussion] Possible example 
> application
> of the array interface]
> 
> 
> The following discussion occurred on the Numeric Python mailing list.
> Others may wish to join the conversation.
> 
> Tim C
> 
> -------- Original Message --------
> Subject: Re: [Numpy-discussion] Possible example application of the
> array interface
> Date: Thu, 7 Apr 2005 03:10:08 +1000 (EST)
> From: Michael Sorich <mike_lists at yahoo.com.au>
> To: numpy-discussion at lists.sourceforge.net
> 
> I think that this is a great idea! While I have a
> strong preference for python, I generally use R for
> statistical analyses due to the large number of mature
> libraries available. There are also some aspects of
> the R data types (eg data-frames and column/row names
> for 2D arrays) that are really nice for spreadsheet
> like data. I hope that scipy.base record arrays will
> be as easily manipulated as data-frames are.
> 
> While RPy works well for small simple problems, there
> are data conversion limitations between R and Python.
> If one could efficiently convert between the major R
> data types and python scipy.base data types without
> loss of data, it would become possible to do most of
> the data manipulation in python and freely mix in R
> functions when required. This may encourage the use of
> python for the development of statistical routines.
> 
> From my meager understanding of RPy:
> 
> R vectors are converted to python lists. It may make
> more sense to convert them to an array (either stdlib
> or scipy.base version) - without copying data if
> possible.
> 
> R arrays and matrices are converted to Numeric arrays.
> Eg
> 
> In [8]: r.array([1,2,3,4,5,6],dim=[2,3])
> Out[8]:
> array([[1, 3, 5],
>        [2, 4, 6]])
> 
> However, column and row names (or dimnames for arrays
> with >2 dimensions) are lost in R->Py conversion. I do
> not know whether these conversions require copying of
> the data.
> 
> R data-frames are currently converted to python
> dictionaries and I don’t think that there is any
> simple way to convert a python object to an R data
> frame. This is the biggest limitation of rpy in my
> opinion.
> 
> In [16]:
> r.data_frame(col1=[1,2,3,4],col2=['one','two','three','four'])
> Out[16]: {'col2': ['one', 'two', 'three', 'four'],
> 'col1': [1, 2, 3, 4]}
> 
> If it were possible to convert between an R data-frame
> and a scipy.base record array without copying or
> losing data, RPy would become more useful.
> 
> I wish I understood C, scipy.base and R well enough to
> give this a go. However, this is way over my head!
> 
> Mike
> 
> --- Magnus Lie Hetland <magnus at hetland.org> wrote:
> > I was just thinking about some experimental designs,
> > and whether I
> > could, perhaps, do the statistics in Python. I
> > remembered having used
> > RPy [1] briefly at some time (there may be other
> > similar bindings out
> > there -- I don't remember) and started thinking
> > about whether I could,
> > perhaps, combine it with numpy in some way. My first
> > thought was to
> > reimplement the relevant statistical functions; then
> > I thought about
> > how to convert data back and forth -- but then it
> > occurred to me that
> > R also uses arrays extensively, and that it could,
> > perhaps, be
> > possible to expose those (through something like
> > RPy) through the
> > array interface/protocol!
> > 
> > This would be (IMO) a good example of the benefits
> > of the array
> > protocol; it's not a matter of "getting yet another
> > array module". RPy
> > is an external library/language with *lots* of
> > features that might be
> > useful to numpy users, many of which aren't likely
> > to be implemented
> > in Python for quite a while, I'd guess (unless,
> > perhaps, someone
> > writes a translator from R, which I'm sure is
> > doable).
> > 
> > I don't know enough (at least yet ;) about the
> > implementation of RPy
> > and the R library to say for sure whether this would
> > even be possible,
> > but it does seem like it could be really useful...
> > 
> > [1] rpy.sf.net
> > 
> > -- 
> > Magnus Lie Hetland        Fall seven times, stand up eight
> > http://hetland.org        [Japanese proverb]
> 
> 






