[scikit-learn] no positive predictions by neural_network.MLPClassifier

Piotr Bialecki piotr.bialecki at hotmail.de
Thu Dec 8 03:04:40 EST 2016


Hi Thomas,

besides the information from Sebastian, your dataset seems to be quite imbalanced (48 positive and 1230 negative observations).
You could try rebalancing your data using
https://github.com/scikit-learn-contrib/imbalanced-learn

This package offers some methods for resampling your data (under-sampling the majority class, over-sampling the minority class, etc.)
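To illustrate the idea without the package dependency, here is a minimal NumPy sketch of random over-sampling of the minority class (the class counts below are made up, standing in for your 48/1230 split; imbalanced-learn wraps this and more sophisticated methods such as SMOTE):

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy imbalanced data: 4 positives vs 20 negatives (made-up stand-ins
# for the 48/1230 split mentioned above).
X = rng.randn(24, 3)
y = np.array([1] * 4 + [0] * 20)

# Random over-sampling: draw minority rows with replacement until both
# classes have equal counts.
minority = np.flatnonzero(y == 1)
majority = np.flatnonzero(y == 0)
extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
keep = np.concatenate([majority, minority, extra])

X_res, y_res = X[keep], y[keep]
print(np.bincount(y_res))  # both classes now have 20 samples each
```

Note that resampling should be done on the training split only, otherwise duplicated minority samples can leak into your test set.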


Greets,
Piotr

On 08.12.2016 01:19, Sebastian Raschka wrote:
Hi, Thomas,
we had a related thread on the email list some time ago, let me post it for reference further below. Regarding your question, I think you may want to make sure that you standardized the features (which generally makes learning less sensitive to the learning rate and the random weight initialization). However, even then, I would try at least 1-3 different random seeds and look at the cost vs time — what can happen is that you land in different minima depending on the weight initialization, as demonstrated in the example below (in MLPs you have the problem of a complex cost surface).
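The two suggestions above can be sketched as follows — standardize the features, then compare the final training loss across a few seeds (the dataset here is a hypothetical stand-in generated with make_classification; the hidden layer size and max_iter are arbitrary choices for the sketch):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical imbalanced data standing in for the original set.
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)

# Standardize so learning is less sensitive to the learning rate and
# to the random weight initialization.
X_std = StandardScaler().fit_transform(X)

# Different seeds give different weight initializations, which can end
# up in different minima: compare the final training losses.
losses = {}
for seed in (0, 1, 2):
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                        random_state=seed)
    clf.fit(X_std, y)
    losses[seed] = clf.loss_
print(losses)
```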

Best,
Sebastian

The default is 100 units in the hidden layer, but theoretically, it should work with 2 hidden logistic units (I think that’s the typical textbook/class example). I think what happens is that it gets stuck in local minima depending on the random weight initialization. E.g., the following works just fine:

from sklearn.neural_network import MLPClassifier
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
clf = MLPClassifier(solver='lbfgs',
                    activation='logistic',
                    alpha=0.0,
                    hidden_layer_sizes=(2,),
                    learning_rate_init=0.1,
                    max_iter=1000,
                    random_state=20)
clf.fit(X, y)
res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
print(res)
print(clf.loss_)


but changing the random seed to 1 leads to:

[0 1 1 1]
0.34660921283
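One way to see how much the outcome depends on the initialization is to scan a range of seeds on the same XOR example and record which ones actually solve it (a diagnostic sketch of the local-minima point above, not a model-selection recipe — which seeds succeed may vary across scikit-learn versions):

```python
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# Fit the same tiny network under several weight initializations and
# record which ones learn XOR (i.e., predict all four points correctly).
good_seeds = []
for seed in range(10):
    clf = MLPClassifier(solver='lbfgs', activation='logistic', alpha=0.0,
                        hidden_layer_sizes=(2,), max_iter=1000,
                        random_state=seed)
    clf.fit(X, y)
    if list(clf.predict(X)) == y:
        good_seeds.append(seed)
print(good_seeds)  # the remaining seeds stayed stuck in a local minimum
```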

For comparison, I used a more vanilla MLP (1 hidden layer with 2 units and logistic activation as well; https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb), essentially resulting in the same problem:
[two plots were attached as PNG images; scrubbed from the archive]

On Dec 7, 2016, at 6:45 PM, Thomas Evangelidis <tevang3 at gmail.com> wrote:

I tried the sklearn.neural_network.MLPClassifier with the default parameters using the input data I quoted in my previous post about Nu-Support Vector Classifier. The predictions are great, but the problem is that sometimes when I rerun the MLPClassifier it predicts no positive observations (class 1). I have noticed that this can be controlled by the random_state parameter, e.g. MLPClassifier(random_state=0) always gives no positive predictions. My question is: how can I choose the right random_state value in a real blind test case?

thanks in advance
Thomas


--
======================================================================
Thomas Evangelidis
Research Specialist
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/1S081,
62500 Brno, Czech Republic

email: tevang at pharm.uoa.gr
          tevang3 at gmail.com

website: https://sites.google.com/site/thomasevangelidishomepage/


_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn





