[SciPy-dev] [GSoC 2008] Machine learning package in SciPy

Stéfan van der Walt stefan at sun.ac.za
Tue Mar 11 23:01:44 EDT 2008


On Tue, Mar 11, 2008 at 11:50 AM, Anton Slesarev
 <slesarev.anton at gmail.com> wrote:
 > Hi all,
 >
 > It might be a good idea to have a machine learning (ML) package in SciPy. As
 > I understand it, there is some ML code in SciKits, but it is in a raw state?
 >
 > There are a lot of machine learning projects, each with its own data format,
 > set of classifiers, feature selection algorithms and benchmarks. But if
 > you want to compare your own algorithm with others, you have to convert
 > your data to the input format of every tool you want to use, and after
 > training you have to convert the output of each tool to a single
 > format in order to compare results (for example, to see which features
 > were selected in common by different tools).
 >
 > I am currently analyzing different ML approaches for the special case of the
 > text classification problem. I couldn't find an ML framework appropriate for
 > my task. I have two simple requirements for this framework: it should support
 > a sparse data format and have at least an SVM classifier. For example, Orange
 > [1] is a very good data mining project but has poor sparse format support.
 > PyML [2] has all the needed features, but there are problems with
 > installation on different platforms and the code design is not perfect.
 >
 > I believe that creating a framework into which scientists can conveniently
 > integrate their algorithms is a very useful challenge. Scientists often
 > talk about standard machine learning software [3], and maybe SciPy would
 > be an appropriate platform for developing such software.
 >
 > I can write a detailed proposal, but first I want to see whether this is
 > interesting to anyone. Any wishes or recommendations?
 >
 > 1. Orange http://magix.fri.uni-lj.si/orange/
 > 2. PyML http://pyml.sourceforge.net/
 > 3. The Need for Open Source Software in Machine Learning
 >    http://www.jmlr.org/papers/volume8/sonnenburg07a/sonnenburg07a.pdf

 I also recently learned of Elefant,

  http://elefant.developer.nicta.com.au/

 but haven't had a chance to investigate it in more detail.

 Regards
 Stéfan


