[SciPy-dev] Summer of Code proposal: Support Vector Machines

Albert Strasheim fullung at gmail.com
Mon May 8 09:51:44 EDT 2006


Hello all

> -----Original Message-----
> From: scipy-dev-bounces at scipy.net [mailto:scipy-dev-bounces at scipy.net] On
> Behalf Of Gennan Chen
> Sent: 08 May 2006 15:32
> To: SciPy Developers List
> Subject: Re: [SciPy-dev] Summer of Code proposal: Support Vector Machines
> 
> Albert,
> 
> When are you going to add libsvm to SciPy? Some of my stuff needs that.
> BTW, ctypes vs DIY. Any speed issue there?? I

I'm hoping to add everything as part of this SoC project.

If you're eager to get going you can try my code here:

http://students.ee.sun.ac.za/~albert/pylibsvm

If you're doing sparse problems, it should just work like the old libsvm
Python wrappers, although I haven't tested this extensively. You want to
look at SparseProblem in problem.py and the tests at the end.

If you have dense data that isn't too big, i.e. you can afford to have a
copy of it in memory in the libsvm svm_node format, look at ArrayProblem.

If you can't afford a copy, you can put your data in an array with
dtype=svm_node_dtype and use SvmNodeArrayProblem, or put all your data in
multiple arrays or lists and use SvmNodeListProblem. If you need to do this,
let me know, and I'll write up some docs on how this works exactly (or look
at the tests). The reason this hasn't been extensively documented yet is
that I only finished this code yesterday. (The code checks that your
matrices are in the right format, so that should help you if you want to
explore this option.)
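
For readers who haven't met the svm_node format mentioned above, here is a
rough sketch of it using only ctypes. This is not the pylibsvm code: the names
SvmNode and dense_row_to_nodes are made up for illustration. The struct layout
and the index -1 terminator follow libsvm's svm.h.

```python
import ctypes


class SvmNode(ctypes.Structure):
    # Mirrors libsvm's `struct svm_node { int index; double value; };`
    _fields_ = [("index", ctypes.c_int), ("value", ctypes.c_double)]


def dense_row_to_nodes(row):
    """Copy one dense row into a terminated C array of SvmNode (hypothetical helper)."""
    nodes = (SvmNode * (len(row) + 1))()  # one extra slot for the terminator
    for i, x in enumerate(row):
        nodes[i].index = i + 1  # libsvm feature indices conventionally start at 1
        nodes[i].value = float(x)
    nodes[len(row)].index = -1  # index -1 marks the end of the vector
    return nodes


nodes = dense_row_to_nodes([0.5, 2.0])
print([(n.index, n.value) for n in nodes])
```

Every value carries its own index, so a dense copy in this format takes more
memory than the plain array of doubles it came from. ctypes.sizeof(SvmNode)
gives the per-entry cost on a given platform.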

As for ctypes vs DIY: before trying ctypes, I spent a week trying to wrap
libsvm with SWIG. After that experience, I wouldn't recommend that approach
to anyone wrapping C code that has to work nicely with NumPy arrays.

The beauty of wrapping with ctypes is that there is no C code involved (if
you do things right). Sure, you can still make things segfault, but most of
the time it just works brilliantly. To give you an idea, from reading the
ctypes tutorial to having a completely wrapped libsvm (about 10 functions,
lots of pointers, a few intricate structs) took me 4 hours, and it took me
another few hours to figure out all the goodies that you'll see on the
ctypes page on the SciPy wiki:

http://www.scipy.org/Cookbook/Ctypes
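
To give a flavour of what "no C code involved" looks like, here is a minimal
sketch that wraps the C library's qsort with ctypes. It is not taken from the
libsvm wrapper. It assumes a POSIX system where the C library can be loaded
this way, and sort_ints and _compare are hypothetical names chosen for the
example.

```python
import ctypes
import ctypes.util

# Load the C library (assumption: POSIX; if find_library returns None,
# CDLL(None) falls back to the symbols already loaded into the process).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C signature of the comparison callback: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

# Declare qsort's argument and return types so ctypes can check each call.
libc.qsort.argtypes = [
    ctypes.POINTER(ctypes.c_int),  # base of the array
    ctypes.c_size_t,               # number of elements
    ctypes.c_size_t,               # size of one element in bytes
    CMPFUNC,                       # comparison callback
]
libc.qsort.restype = None


def _compare(a, b):
    # a and b are pointers to c_int; a[0] dereferences them.
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_ints(values):
    """Sort a list of Python ints in a C array via qsort (hypothetical helper)."""
    arr = (ctypes.c_int * len(values))(*values)
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), CMPFUNC(_compare))
    return list(arr)


print(sort_ints([5, 1, 7, 33, 99]))
```

Declaring argtypes and restype up front is what keeps most mistakes from
turning into segfaults: ctypes rejects a call with the wrong argument types
before it ever reaches the C function.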

If anybody wants to mentor this project, please let me know.

Cheers,

Albert



