[scikit-learn] anti-correlated predictions by SVR

Thomas Evangelidis tevang3 at gmail.com
Tue Sep 26 12:48:56 EDT 2017


I have very small training sets (10-50 observations). Currently, I am
working with 16 observations for training and 25 for validation (an
external test set). Also, I am doing regression, not classification (hence
SVR instead of SVC).
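
For what it's worth, the distribution-shift hypothesis quoted below has a
regression analogue. The sketch below is only an illustration under stated
assumptions: the 16 target values are made up, and the "model" is a plain
training-mean predictor rather than an SVR. It shows that under
leave-one-out, a model that has learned nothing beyond the training mean
produces predictions that are perfectly anti-correlated with the true
values, because removing a high value lowers the training mean and
removing a low value raises it.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def loo_mean_predictions(y):
    """Predict each left-out value as the mean of the remaining values."""
    total, n = sum(y), len(y)
    return [(total - y_i) / (n - 1) for y_i in y]


# Hypothetical target values (16 observations, invented for illustration).
y_true = [5.2, 6.1, 4.8, 7.3, 5.9, 6.6, 4.1, 7.9,
          5.5, 6.0, 6.8, 4.9, 7.1, 5.3, 6.4, 5.7]

y_pred = loo_mean_predictions(y_true)
r = pearson_r(y_true, y_pred)
print(f"leave-one-out R of the mean predictor: {r:.6f}")
```

Each prediction is a decreasing linear function of the left-out value, so
R comes out as -1 up to rounding. This is not the same situation as a
fixed external test set, but it shows how a weak model on a tiny dataset
can look anti-correlated without anything being wrong with the code.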


On 26 September 2017 at 18:21, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:

> Hypothesis: you have a very small dataset, and when you leave out data
> you create a distribution shift between the train and the test sets. A
> simplified example: 20 samples, 10 of class a and 10 of class b. A
> leave-one-out cross-validation creates a training set with 10 samples of
> one class and 9 of the other, and the test sample always belongs to the
> class that is in the minority in the training set.
>
> G
>
> On Tue, Sep 26, 2017 at 06:10:39PM +0200, Thomas Evangelidis wrote:
> > Greetings,
>
> > I don't know if anyone has encountered this before, but sometimes I get
> > anti-correlated predictions from the SVR that I am training. Namely,
> > Pearson's R and Kendall's tau are negative when I compare the
> > predictions on the external test set with the true values. However, the
> > SVR predictions on the training set correlate positively with the
> > experimental values, so I can't think of a way to know in advance
> > whether the trained SVR will produce anti-correlated predictions, in
> > order to change their sign and avoid the disaster. Here is an example
> > of what I mean:
>
> > Training set predictions: R=0.452422, tau=0.333333
> > External test set predictions: R=-0.537420, tau=-0.300000
>
> > Obviously, in a real-case scenario where I wouldn't have the external
> > test set, I would have used the worst observations instead of the best
> > ones. Does anybody have any idea how I could prevent this?
>
> > thanks in advance
> > Thomas
> --
>     Gael Varoquaux
>     Researcher, INRIA Parietal
>     NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>     Phone:  ++ 33-1-69-08-79-68
>     http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
>
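
Gael's simplified example above can be reproduced in a few lines. This is
a minimal sketch, assuming a classifier that has learned nothing except
which class is the majority in its training set. The labels and the
helper name loo_majority_predictions are made up for illustration.

```python
from collections import Counter

# 20 samples: 10 of class "a" and 10 of class "b", as in the example.
labels = ["a"] * 10 + ["b"] * 10


def loo_majority_predictions(labels):
    """For each left-out sample, predict the majority class of the rest."""
    predictions = []
    for i in range(len(labels)):
        train = labels[:i] + labels[i + 1:]
        # The training fold always has 10 of one class and 9 of the other,
        # so the majority class is never tied.
        majority_class, _ = Counter(train).most_common(1)[0]
        predictions.append(majority_class)
    return predictions


predictions = loo_majority_predictions(labels)
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Every left-out sample belongs to the class that is in the minority in its
training fold, so this predictor is wrong on all 20 folds and the
leave-one-out accuracy is 0, well below the 50% chance level.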



-- 

======================================================================

Dr Thomas Evangelidis

Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049,
62500 Brno, Czech Republic

email: tevang at pharm.uoa.gr

          tevang3 at gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/