[scikit-learn] cross validation scores seem off for PLSRegression

Fabian Böhnlein fabian.boehnlein at gmail.com
Tue Feb 14 06:08:11 EST 2017


Hi Paul,

I'm not sure what the @ syntax does in IPython (presumably matrix
multiplication), but it seems you're setting y to x times the model's
coefficients instead of y_hat = pls.predict(x), which also centers x and
adds back the mean of y.

Also, see the documentation of score for why R^2 can be negative:
http://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html#sklearn.cross_decomposition.PLSRegression.score
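
As a small illustration of a negative R^2 (toy numbers of my own, not from
your data): R^2 is 1 - SS_res / SS_tot, so it is 0 for a model that always
predicts the mean of y and drops below 0 for anything worse than that.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Always predicting the mean gives SS_res == SS_tot, hence R^2 == 0.
mean_pred = np.full_like(y_true, y_true.mean())
r2_mean = r2_score(y_true, mean_pred)

# Predictions worse than the mean: SS_res = 20, SS_tot = 5, so R^2 = 1 - 20/5.
bad_pred = np.array([4.0, 3.0, 2.0, 1.0])
r2_bad = r2_score(y_true, bad_pred)

print(r2_mean, r2_bad)  # 0.0 -3.0
```

There is no lower bound, so with pure noise and very few samples, large
negative cross-validation scores are not surprising.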

Best,
Fabian

On Tue, 14 Feb 2017 at 11:57 Paul Anton Letnes <pa at letnes.com> wrote:

> Hi!
>
> Versions:
> sklearn 0.18.1
> numpy 1.11.3
> Anaconda python 3.5 on ubuntu 16.04
>
> What range is the cross_val_score supposed to be in? I was under the
> impression from the documentation, although I cannot find it stated
> explicitly anywhere, that it should be a number in the range [0, 1].
> However, it appears that one can get large negative values; see the ipython
> session below.
>
> Cheers
> Paul
>
> In [2]: import numpy as np
>
> In [3]: y = np.random.random((10, 3))
>
> In [4]: x = np.random.random((10, 17))
>
> In [5]: from sklearn.cross_decomposition import PLSRegression
>
> In [6]: pls = PLSRegression(n_components=3)
>
> In [7]: from sklearn.cross_validation import cross_val_score
>
> In [8]: from sklearn.model_selection import cross_val_score
>
> In [9]: cross_val_score(pls, x, y)
> Out[9]: array([-32.52217837,  -4.17228083,  -5.88632365])
>
>
> PS:
> This happens even if I cheat by setting y to the predicted value, and
> cross validate on that.
>
> In [29]: y = x @ pls.coef_
>
> In [30]: cross_val_score(pls, x, y)
> /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293:
> UserWarning: Y residual constant at iteration 5
>   warnings.warn('Y residual constant at iteration %s' % k)
> /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293:
> UserWarning: Y residual constant at iteration 6
>   warnings.warn('Y residual constant at iteration %s' % k)
> /home/paul/anaconda3/envs/wp3-paper/lib/python3.5/site-packages/sklearn/cross_decomposition/pls_.py:293:
> UserWarning: Y residual constant at iteration 6
>   warnings.warn('Y residual constant at iteration %s' % k)
> Out[30]: array([-35.01267353,  -4.94806383,  -5.9619526 ])
>
> In [34]: np.max(np.abs(y - x @ pls.coef_))
> Out[34]: 0.0