[SciPy-user] scipy data mining ?

David Cournapeau david at ar.media.kyoto-u.ac.jp
Wed Jan 24 03:26:20 EST 2007


Ivan Vilata i Balaguer wrote:
> Karl Young (on 2007-01-22 at 14:16:56 -0800) said::
>
>> I'm currently using a nice Java-based data mining package called Weka
>> (essentially as a black box, as I don't have time to learn Java), but
>> was looking for something more Python/SciPy-friendly to switch to, as
>> I'd prefer more interactive use. I found a Python package on the web
>> that looks potentially pretty nice (Orange - http://www.ailab.si/orange),
>> but given that it uses the GPL (and also given the recent discussion on
>> license issues) and doesn't appear to have made any effort to be
>> numpy/scipy friendly, I was wondering if anyone was aware of a more
>> scipy-friendly effort. Should someone (maybe even me...) be talked into
>> contacting the Orange developers to see if they'd be interested in a
>> switch to BSD and a gradual evolution towards integration with numpy?
>
> You may also want to try PyTables_, which is already being used by
> some people to perform data mining.  It is not similar to Orange or Weka
> in the sense that PyTables is a lower-level, non-GUI Python library.
> However, it uses NumPy at its core, so integration with SciPy should be
> no problem, and it is designed to be comfortable in interactive usage
> (on a Python console).  The standard version is free/libre software
> under a BSD license.
>
> On the GUI part, you could use ViTables_ for textual browsing of big
> files, or HDFView_ if you need plotting or image visualisation
> capabilities.
>
> .. _PyTables: http://www.pytables.org/
> .. _ViTables: http://www.carabos.com/products/vitables
> .. _HDFView: http://www.hdfgroup.org/hdf-java-html/hdfview/

That would cover the I/O and interface parts of Orange, but not the core
machine learning part. I think this is one area where numpy/scipy is
still lacking, at least integration-wise, compared to MATLAB, which has
major toolboxes such as Netlab for this kind of thing.

David


