[SciPy-dev] [GSoC 2008]Machine learning package in SciPy
Stéfan van der Walt
stefan at sun.ac.za
Tue Mar 11 23:01:44 EDT 2008
On Tue, Mar 11, 2008 at 11:50 AM, Anton Slesarev
<slesarev.anton at gmail.com> wrote:
> Hi all,
>
> It might be a good idea to have a machine learning (ML) package in SciPy. As
> I understand it, there is some ML code in SciKits, but is it still in a raw state?
>
> There are a lot of machine learning projects, each with its own data format,
> set of classifiers, feature selection algorithms, and benchmarks. But if
> you want to compare your own algorithm with others, you have to convert
> your data to the input format of every tool you want to use and, after
> training, convert the output of each tool to a single format so that the
> results can be compared (for example, to see which features were selected
> in common by different tools).
>
> I am currently analyzing different ML approaches for the special case of the
> text classification problem. I couldn't find an ML framework appropriate for
> my task. I have two simple requirements for such a framework: it should
> support a sparse data format and have at least an SVM classifier. For
> example, Orange [1] is a very good data mining project but has poor sparse
> format support. PyML [2] has all the needed features, but there are problems
> installing it on different platforms, and its code design is not perfect.
>
> I believe that creating a framework into which scientists can conveniently
> integrate their algorithms is a very useful challenge. Scientists often talk
> about standard machine learning software [3], and maybe SciPy will be an
> appropriate platform for developing such software.
>
> I can write a detailed proposal, but first I want to see whether this is
> interesting to anyone. Any wishes or recommendations?
>
> 1. Orange http://magix.fri.uni-lj.si/orange/
> 2. PyML http://pyml.sourceforge.net/
> 3. The Need for Open Source Software in Machine Learning
> http://www.jmlr.org/papers/volume8/sonnenburg07a/sonnenburg07a.pdf
I also recently learned of Elefant,
http://elefant.developer.nicta.com.au/
but haven't had a chance to investigate in more detail.
Regards
Stéfan