[scikit-learn] Problem using boxplots to compare significance of model performance

Suranga Kasthurirathne surangakas at gmail.com
Sun Oct 30 15:24:12 EDT 2016


Hi folks!

I'm using scikit-learn to build two neural networks, each evaluated on a
10% holdout set, and comparing their performance using precision. To judge
whether the difference in precision is significant, I'm using boxplots.

My problem is twofold:

1) The standard deviation in the precision of the two models (obtained
using precision.std()) is always 0.0. I'm assuming that's a problem.
2) My boxplot is meant to display a box for each of the two models, but
always displays only the first model (nn01).
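
For reference, here is a minimal sketch of the kind of comparison described
above. It is not the code from the pastebin link: the synthetic dataset, the
network sizes, the number of repeated splits, and the use of matplotlib for
plotting are all assumptions made for illustration. It shows two things that
may be relevant here: a single holdout split yields one precision value per
model (and the standard deviation of one value is 0.0), and a boxplot call
draws one box per sequence when it is given a list of sequences.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neural_network import MLPClassifier

# Synthetic binary data standing in for the real dataset (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Two hypothetical networks; the layer sizes are placeholders, not the original models.
models = {
    "nn01": MLPClassifier(hidden_layer_sizes=(5,), max_iter=500, random_state=0),
    "nn02": MLPClassifier(hidden_layer_sizes=(20,), max_iter=500, random_state=0),
}

# Ten random 90/10 splits, so each model gets ten precision values instead of one.
splitter = StratifiedShuffleSplit(n_splits=10, test_size=0.1, random_state=0)
scores = {name: [] for name in models}
for train_idx, test_idx in splitter.split(X, y):
    for name, model in models.items():
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        scores[name].append(precision_score(y[test_idx], pred))

# A single score has nothing to vary against, so its standard deviation is 0.0.
single_std = np.std([scores["nn01"][0]])

# Passing a list of two sequences draws two boxes, one per model.
fig, ax = plt.subplots()
boxes = ax.boxplot([scores["nn01"], scores["nn02"]])
ax.set_xticklabels(["nn01", "nn02"])
ax.set_ylabel("precision")
```

In an interactive session, plt.show() or fig.savefig(...) would display or
save the figure. A boxplot only gives a visual impression of the spread; it
is not by itself a significance test.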

The outcome for this dataset is binary (0 or 1). Since the precision metric
assumes average='binary' by default, is that a problem?
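
To make the question about the averaging mode concrete, here is a small
sketch using sklearn.metrics.precision_score on made-up labels (the label
vectors are invented for illustration and are not from the dataset above).
With the default average='binary', only the positive class (pos_label=1) is
scored; the other settings are shown for comparison.

```python
from sklearn.metrics import precision_score

# Made-up binary labels, purely for illustration.
y_true = [0, 1, 1, 0, 1, 0, 0, 0]
y_pred = [0, 1, 0, 1, 1, 0, 0, 0]

# Default average='binary': precision of the positive class only (label 1).
p_binary = precision_score(y_true, y_pred)

# average=None: one precision per class, in sorted label order (0, then 1).
p_per_class = precision_score(y_true, y_pred, average=None)

# average='macro': unweighted mean of the per-class precisions.
p_macro = precision_score(y_true, y_pred, average="macro")
```

Here label 1 is predicted three times and is correct twice, so the binary
precision is 2/3, while class 0 has precision 4/5.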

For those who'd like to look, my source code can be seen at
http://pastebin.com/yvE2T1Sw

The code produces the following plot - which is of course only ONE of the
bars that I need :(


-- 
Best Regards,
Suranga
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2016-10-30 at 12.17.22 PM.png
Type: image/png
Size: 45270 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scikit-learn/attachments/20161030/257140bf/attachment-0001.png>

