[AstroPy] AstroPy Digest, Vol 81, Issue 12

Aldcroft, Thomas aldcroft at head.cfa.harvard.edu
Thu Jun 20 23:22:05 EDT 2013


On Thu, Jun 20, 2013 at 5:42 PM, Thøger Emil Rivera-Thorsen <
thoger.emil at gmail.com> wrote:

>  [*Snip*]
>
>
>
>
> Group-by and related functionality is at the top of my list of priorities for
> astropy.table (in fact I see it every day on my Google Keep app...).  Join
> and merging are in master now.  In my tests the astropy table join is
> within a factor of 2 to 3 in speed relative to pandas, so in most use cases
> it should be good enough.
>
> Would join and merge follow the pandas behaviour? Because that is one of
> its major assets, I believe - its elegant handling of missing data and
> misaligned indices etc.
> Speed is a minor issue in my world, we're not often working with the data
> set sizes the quants are.
>

I was looking at the pandas docs when writing join, so the interface is
inspired by, but not the same as, pandas.  I believe that the left, right,
inner, and outer joins are all correct and handle missing data correctly,
but independent testing and feedback would be welcome!  astropy.table.Table
does not have a concept analogous to the pandas index labels, but any column
or columns can be designated as the join key.  See:

http://astropy.readthedocs.org/en/latest/table/operations.html


>
>
>   It's probably worth pointing out to the community that it was not a
> lightly-taken decision to reject pandas for use as the base data storage
> container.  For the case of tables there is one show-stopper which is that
> pandas DataFrame does not support arbitrary multi-dimensional columns, i.e.
> a column where each element is itself an N-d array.  These occur often enough in
> astronomy and are supported by FITS and VO standards, so the astropy Table
> must be able to represent that.  The lack of support for table and column
> metadata is a smaller but still important issue.
>
>    That is very interesting to hear. I had, in fact, been wondering what
> was the rationale, but it certainly makes sense.
> I was under the impression, though, that pandas pretty much supported
> arbitrary objects as entries in its data structures. But I was apparently
> mistaken?
>

Yes, pandas supports an object dtype, but that is not really useful for
representing a multi-dimensional array, since you basically lose all the
numpy-ness of the array.  A 1000000 x 2 column would be stored as 1000000
separate numpy array objects, which is bad news for memory use and speed.
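
To make the difference concrete, here is a rough numpy-only sketch (no pandas
or astropy involved).  The variable names are arbitrary, and n is reduced from
the 1000000 mentioned above only so that the loop runs quickly.

```python
import numpy as np

n = 100_000  # smaller than the 1000000 in the text, just to keep this fast

# Native layout: a single contiguous (n, 2) float64 block.
native = np.zeros((n, 2))

# Object layout: a 1-D array whose n elements are separate 2-element arrays.
obj = np.empty(n, dtype=object)
for i in range(n):
    obj[i] = np.zeros(2)

# Taking the first component of every row is a plain slice in the native case.
first_native = native[:, 0]

# The object array is only 1-D as far as numpy is concerned, so the same
# slice is not available.
try:
    obj[:, 0]
    column_slice_ok = True
except IndexError:
    column_slice_ok = False

# Getting back to a real 2-D array means copying every element.
recovered = np.stack(obj)
```

The object array still holds the same numbers, but numpy sees only n opaque
Python objects, so shape, dtype and slicing along the second axis are gone
until the data is copied back into a regular array.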


>
>
>   Having said that, there is no question pandas has a ton of
> highly-efficient and useful machinery and we are working on ways to improve
> inter-operability.  This includes being able to convert between Table and
> DataFrame easily.  Suggestions and (especially) pull requests welcome.
>
>
> An easy and near-seamless conversion between a DataFrame and an astropy
> table (as long as data type support allows for it, of course) would
> definitely be a great thing.
>
>
>
>
>>
>>  I won't speculate about whether that's enough of an asset to warrant a
>> dependency in astropy. I do agree that lots of other pandas features don't
>> translate as well into astronomy use.
>>
>>
>>
>> On Thu, Jun 20, 2013 at 12:34 PM, Erik Tollerud <erik.tollerud at gmail.com> wrote:
>>
>>>  I'm of mixed minds about traits UI because once you know it you can
>>> make great GUIs with it, but I've spent a lot of time troubleshooting
>>> people's python installations to get traits to work.  That is, in general
>>> it can be tricky to get installed because of all the dependencies.  Maybe
>>> this has improved recently with Enthought's Canopy (or other new python
>>> distros), but that's been my past experience.
>>>
>>>  More generally, the view in the astropy core package is that we don't
>>> want to put GUIs in the core because GUIs always carry lots of
>>> dependencies, which we don't want to be forced to deal with.  But part of
>>> the whole reason for affiliated packages was to get around this, so we're
>>> happy to see GUI-based affiliated packages.
>>>
>>>
>>>  As for Pandas, to be totally honest, I don't see a huge amount to be
>>> gained from adding a Pandas dependency to Astropy.  It's honestly not clear
>>> what it gives the astronomy community that numpy does not already have.
>>>  The following quote from the Pandas web site has guided me to that
>>> conclusion: "*pandas* helps fill this gap, enabling you to carry out
>>> your entire data analysis workflow in Python without having to switch to a
>>> more domain specific language like R."
>>>
>>>  I have been carrying out my entire data analysis workflow for some
>>> time now in python without using Pandas.  It looks to me like Pandas is a
>>> tool that was written by and for statisticians who use R.  While we can
>>> take lessons from this, it's not clear we get much out of it in an
>>> astronomy context. For example, how would it make astropy's NDData,
>>> Quantity, or Table better to use a Pandas DataFrame vs. a numpy array? Most
>>> of what we are doing is building astronomy-convenient interfaces, and I'm
>>> not sure what Pandas adds there, at the cost of a pretty heavy-weight
>>> dependency.
>>>
>>>  It could just be that I don't know enough about Pandas, though.  So if
>>> someone who knows Pandas better can speak to this, I'm all ears.
>>>
>>>
>>>
>>>
>>> On Tue, Jun 18, 2013 at 3:35 PM, Thøger Rivera-Thorsen <
>>> trive at astro.su.se> wrote:
>>>
>>>>  Pandas is a part of the newly-defined SciPy stack, after all, so that
>>>> would be part of any science-oriented distribution worth its salt. In fact,
>>>> I think it could be a good idea for astropy in general to use it under the
>>>> hood, but again, that could clash with the philosophy of the project and
>>>> possibly also with maintainability.
>>>>
>>>> As for offering my code or just my experience, I'll have to square it
>>>> with my supervisor first, and I also think it depends on what direction the
>>>> project in question will take. I'm positive about the idea (which is why I
>>>> wrote in the first place), but my supervisor might think it is a better idea
>>>> to actually get my paper in the project wrapped up before sending the code
>>>> out there. Will get back about that one!
>>>>
>>>> /Emil
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 2013-06-18 20:53, Slavin, Jonathan wrote:
>>>>
>>>>  Hi Emil,
>>>>
>>>>  That looks very nice!  I don't see Pandas as a big issue in terms of
>>>> dependencies.  I don't know that much about traits, etc.  My thought about
>>>> the gui was just based on my experience with matplotlib, and the fact that
>>>> it is widely used -- though I would agree that too many dependencies can be
>>>> a deterrent to people using something.  Are you offering your code as a
>>>> starting point for the project?  It strikes me that many have gotten some
>>>> sort of fitting package to a point of personal usability but no one has the
>>>> time/interest/motivation to make a more generally usable package.
>>>>
>>>>  Jon
>>>>
>>>>  On Tue, Jun 18, 2013 at 2:34 PM, <astropy-request at scipy.org> wrote:
>>>>
>>>>> Date: Tue, 18 Jun 2013 20:39:55 +0200
>>>>> From: Thøger Rivera-Thorsen <thoger.emil at gmail.com>
>>>>> Subject: Re: [AstroPy] ESA Summer of Code in Space 2013
>>>>> To: astropy at scipy.org
>>>>> Message-ID: <51C0A97B.8090703 at gmail.com>
>>>>> Content-Type: text/plain; charset="iso-8859-1"
>>>>>
>>>>> I have been working on a fitting GUI for a while, although it is made
>>>>> with a specific task in mind.
>>>>> However, it is not based on Matplotlib but on Traits/Traitsui/Chaco and
>>>>> Pandas. It is made for a specific project I'm working on and as such not
>>>>> yet usable for more general cases, but it could be a starting point, if
>>>>> the dependencies don't conflict with astropy politics.
>>>>>
>>>>> In particular, I am happy with the choice of Pandas for managing a quite
>>>>> complex data structure (the fitted and/or guessed values of an arbitrary
>>>>> number of transitions for an arbitrary number of rows or collapsed rows
>>>>> of a spatially resolved spectrum), but also with the Traits-based
>>>>> interactive interface to build complex line profiles from single
>>>>> Gaussians, good for fitting-by-eye and giving good initial guesses for
>>>>> fitting of complex line profiles. It hooks directly up to a wrapper I've
>>>>> made for lmfit, but given the modularity, it should be relatively easy
>>>>> to change to other backends.
>>>>>
>>>>> It's still a work-in-progress, but there are some screenshots here:
>>>>> http://flic.kr/s/aHsjGaEMGg .
>>>>> I know the choice and number of dependencies may be prohibitive, but it
>>>>> saved a lot of work on the GUI, and Pandas means the difference between
>>>>> sanity and madness when it comes to keeping track of so many parameters.
>>>>>
>>>>> Cheers,
>>>>> Emil
>>>>>
>>>>
>>>>
>>>>
>>>>  ________________________________________________________
>>>> Jonathan D. Slavin                 Harvard-Smithsonian CfA
>>>> jslavin at cfa.harvard.edu       60 Garden Street, MS 83
>>>> phone: (617) 496-7981              Cambridge, MA 02138-1516
>>>> fax: (617) 496-7577                USA
>>>> ________________________________________________________
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> AstroPy mailing list
>>>> AstroPy at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/astropy
>>>>
>>>>
>>>
>>>
>>>   --
>>> Erik
>>>
>>
>>
>>  --
>>  ************************************
>> Chris Beaumont
>> Graduate Student
>> Institute for Astronomy
>> University of Hawaii at Manoa
>> 2680 Woodlawn Drive
>> Honolulu, HI 96822
>> www.ifa.hawaii.edu/~beaumont
>> ************************************
>>
>
>