[SciPy-dev] Google Summer of Code and scipy.learn (another trying)

Yaroslav Halchenko lists at onerussian.com
Mon Mar 24 13:43:33 EDT 2008


>    Indeed, the framework is great and is easy to use.
that is the hope ;-) It is easy for us to use, but we will see how well
neuroscience people are able to adapt it to their needs ;-)

>    didn't show an example of a classification training and then a
>    single test. For instance, train an SVM and then test one image (for
>    SVMs, it
If I got it right, you just want to train on one dataset (call it
training) and then transfer (predict) to another (call it validation),
which might contain just a single sample/image.

As always, there are multiple ways to accomplish this. An explicit
(and "low-level") one would be:

clf = SomeClassifier()
clf.train(training)                    # learn from the training dataset
predictions = clf.predict(validation)  # predict labels for validation samples
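To make that interface concrete, here is a minimal self-contained sketch; note this is not PyMVPA code, and `NearestMeanClassifier` plus the plain-list datasets are made up purely for illustration:

```python
# Toy stand-in for SomeClassifier: assigns each sample the label of the
# closest per-class mean seen during training. Illustrative only.
class NearestMeanClassifier:
    def train(self, samples, labels):
        # accumulate per-class sums of the feature vectors
        sums, counts = {}, {}
        for x, y in zip(samples, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] = counts.get(y, 0) + 1
        # per-class mean vectors
        self.means = {y: [v / counts[y] for v in acc]
                      for y, acc in sums.items()}

    def predict(self, samples):
        def dist2(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        # pick the class whose mean is closest to each sample
        return [min(self.means, key=lambda y, x=x: dist2(x, self.means[y]))
                for x in samples]

clf = NearestMeanClassifier()
clf.train([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]],
          [0, 0, 1, 1])
predictions = clf.predict([[9.5, 9.5]])  # a single "validation" sample
```

The point is only the calling convention: train once on the training dataset, then predict on however few (even one) validation samples you like.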

Or, if the main goal is also to get an error measure out of the box, then
just use:

terr = TransferError(clf, enabled_states=['confusion'])
error = terr(validation, training)
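Under the hood, a transfer error of this kind boils down to the fraction of misclassified validation samples; a hedged sketch in plain Python (not the PyMVPA implementation):

```python
def transfer_error(predicted, targets):
    """Fraction of validation samples whose predicted label differs
    from the target label (0.0 means perfect transfer)."""
    mismatches = sum(p != t for p, t in zip(predicted, targets))
    return mismatches / float(len(targets))

error = transfer_error([0, 1, 1, 0], [0, 1, 0, 0])  # one of four wrong
```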

What is good about terr is that you can use it as an error measure for
feature selection (e.g. RFE). I also enabled the 'confusion' state
variable, so it stores the confusion matrix, which also contains the
target and predicted labels of all 'validated' samples; in your case it
will be just one such pair. You can easily access target and predicted
labels through:

terr.confusion.sets[0]

NB: the [0] is there solely because you had just one training/validation pair.
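For illustration, the (target, predicted) pairs behind such a confusion matrix can be tallied like this (a generic sketch, not PyMVPA's confusion-matrix class):

```python
def confusion_matrix(targets, predicted, labels):
    """Tally (target, predicted) label pairs.
    Rows index the target label, columns the predicted label."""
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(targets, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix

# with a single validation sample there is just one (target, predicted) pair
cm = confusion_matrix(['face'], ['house'], ['face', 'house'])
```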

>    is now needed to be able to get the underlying cost function, more and
>    more publications use it to show interesting results, and it is what
>    could be thought as a state in your framework ?).
Ah, so you want to see the actual value that was used to decide on the
label, e.g. for a linear SVM it would be the w*x+b value, right?
For those we actually already have the state variable "values" (as
opposed to labels), but I believe we have yet to "connect" it to the real
computed value, since we haven't used it so far (at the moment it is
linked to the prediction probability as implemented within libsvm)...

I guess we should add a cleaner interface to both the cost function and
the prediction-probability results produced by the classifier for any
given sample. But if you are interested in w itself, we call it
LinearSVMSensitivity ;-)
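The decision value in question is just w*x+b, whose sign gives the label; a hedged sketch for a binary linear classifier (the weights and bias below are hypothetical, not tied to any libsvm internals):

```python
def linear_decision(w, b, x):
    """Return (label, value) where value = w . x + b and the label is
    +1/-1 by the sign of the value, as a binary linear SVM decides."""
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    label = 1 if value >= 0 else -1
    return label, value

# value = 0.5*2.0 - 0.25*4.0 + 0.1 = 0.1, so the label is +1
label, value = linear_decision([0.5, -0.25], 0.1, [2.0, 4.0])
```

Exposing this raw value alongside the predicted label is exactly what the "values" state variable is meant for.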

Also, please have a look at doc/examples/pylab_2d.py, which plots
probability maps for a few classifiers ;-)

> This is expected
>    when testing one individual versus two groups (it is done in anatomical
>    brain imaging). Could you add an example with this?
Shouldn't you simply use a binary classifier to discriminate between the
two groups?

>    I also know that Windows is not very tested for the moment, I hope the
>    mix with scikits.learn will help promoting your tool and fix the
>    different bugs that will be found (or not if there are no bugs :D) with
>    other platforms.
Yeah -- eventually we should try it on some proprietary OS like Windows,
but we aren't brave enough yet :-)

>    Matthieu
-- 
                                  .-.
=------------------------------   /v\  ----------------------------=
Keep in touch                    // \\     (yoh@|www.)onerussian.com
Yaroslav Halchenko              /(   )\               ICQ#: 60653192
                   Linux User    ^^-^^    [175555]
