[SciPy-User] Proposal for a new data analysis toolbox

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Wed Nov 24 16:50:18 EST 2010


On 11/24/2010 07:09 PM, Matthew Brett wrote:
> Hi,
>
> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn
> <dagss at student.matnat.uio.no>  wrote:
>    
>> On 11/24/2010 06:11 PM, Keith Goodman wrote:
>>      
>>> On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn
>>> <dagss at student.matnat.uio.no>    wrote:
>>>
>>>        
>>>> On 11/23/2010 10:17 PM, Keith Goodman wrote:
>>>>
>>>>          
>>>        
>>>> This feels like the kind of functionality that, once it is there, people
>>>> might start to take for granted. In those cases I think finding a boring
>>>> name is proper :-)
>>>>
>>>> So how about something boring under the scikits namespace.
>>>> scikits.datautils, scikits.arraystats, ...
>>>>
>>>> If one wants to be cute, perhaps "scikits.missing", for functions that
>>>> deal well with missing data (unless I misunderstand, I don't use NaN
>>>> much myself).
>>>>
>>>> I guess "Missing" by itself would be rather un-Googlable :-)
>>>>
>>>>          
>>> Cython is great. It is amazing how quickly (coding time and cpu time)
>>> someone with no experience (me) can get code running. Thanks for all
>>> your work.
>>>
>>>        
>> :-)
>>
>> Well, there's a couple of obvious warts, like the lack of templates, and
>> the difficulty of doing programming in N arbitrary dimensions. If I had
>> a lot of time and/or money...
>>      
> Is there anything we can do to find you time and / or money?  Seriously.
>    

Well, getting into this, I'd like to start by pointing out that there's
now NSF money for a Cython+numerics+SciPy workshop sometime during the
next three years, see item 6 in the proposal:

http://modular.math.washington.edu/grants/compmath09/

It's been granted, which is very good news. What you are doing here,
speeding up "elementary" functions using Cython, fits very well with some
of the ideas in that proposal. This means that once interest picks up
enough and there's some experience with using Cython for this kind of
thing, and with where it is lacking, we can hold a workshop to figure out
the best way forward.

I like the foundation idea that Fernando Perez, William Stein and Jarrod
Millman talked about this summer (see the (Euro)SciPy 2010 conference
talks). The key point is to have a little pool of money available to use
in the critical spots, when people are between jobs, on student summer
breaks, etc.; then the money can go much further. See, e.g., Google Summer
of Code; sometimes a relatively tiny amount of money can go a long way.

As for me working on Cython... realistically, to really push new
complicated features in Cython I'd need to have it as my day job (the
TODO list ahead of Cython is just getting too long). My current plan is
to go for a PhD starting this spring, in which case I may be able to 
take month-long breaks here and there to work on Cython if funding is 
available. (I wouldn't gain much for my research by improving Cython, 
although I do hope to have more time for quick bug-fixes etc. In my 
research I mostly need linear algebra and/or spherical harmonic 
transforms on a cluster.)

BTW, if you don't know already, I'm currently working for a couple of 
months for Enthought on SciPy + fwrap + .NET (search scipy-dev for fwrap 
and my name). Even if the .NET port is the primary objective, I believe 
it will have some nice side-effects and do both fwrap and hopefully 
SciPy-on-CPython some good in the end.

I'll see if I can post a little bit on my use of templates tomorrow morning.
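
To make the template idea from the quoted text below a bit more concrete,
here is a minimal sketch of generating per-dtype Cython source from a
single template. It is only an illustration, not the code used in the
fwrap refactor: it uses Python's standard string.Template as a stand-in
for Tempita so that it runs without extra dependencies, and the names
nansum_*, DTYPES and render_pyx are all hypothetical.

```python
from string import Template

# Hypothetical template for one typed Cython function. $name and $ctype
# are placeholders filled in once per dtype. (string.Template stands in
# for Tempita here; a real pyx.in file would use Tempita syntax.)
PYX_TEMPLATE = Template('''\
def nansum_$name(np.ndarray[np.${name}_t, ndim=1] a):
    cdef Py_ssize_t i
    cdef $ctype total = 0
    for i in range(a.shape[0]):
        if a[i] == a[i]:  # NaN is the only value not equal to itself
            total += a[i]
    return total
''')

# Assumed (NumPy dtype name, C type) pairs to generate code for.
DTYPES = [("float32", "float"), ("float64", "double")]


def render_pyx(dtypes=DTYPES):
    """Return Cython source text with one function per dtype."""
    header = "import numpy as np\ncimport numpy as np\n\n"
    bodies = [PYX_TEMPLATE.substitute(name=name, ctype=ctype)
              for name, ctype in dtypes]
    return header + "\n".join(bodies)


if __name__ == "__main__":
    # Print the generated source; a build step would write it to a .pyx file.
    print(render_pyx())
```

The generated text would then be written to a .pyx file and compiled with
Cython as usual. This is the manual workaround for the lack of templates
in Cython itself: one function body is written once and stamped out per
type.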


Dag Sverre


>    
>> For the time being, for something like this I'd definitely go with a
>> template language to generate Cython code if you are not already. Myself
>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in
>> extension and it works pretty well. Using Bento one can probably chain
>> Tempita so that this gets built automatically (but I haven't tried that
>> yet).
>>      
> Thanks for the update - it's excellent news that you are working on
> this.  If you ever have spare time, would you consider writing up your
> experiences in a blog post or similar?  I'm sure it would be very
> useful for the rest of us who have idly thought we'd like to do this,
> and then started waiting for someone with more expertise to do it...
>
> See you,
>
> Matthew
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>    



