[scikit-learn] Problem using boxplots to compare significance of model performance

Sun Oct 30 17:38:18 EDT 2016

Hi, Suranga

> So, I may have to go over 2 models, so McNamara's may not be an option :(

Sure, but there are many other hypothesis tests; it was just a suggestion, since I thought you only wanted to compare 2 models :)

> plt.boxplot(results)
> So what does "results" look like? 
> 
> [0.85433808345719897, 0.8976733724549345]

You can’t do a boxplot based on a single value per model; each box needs several results from the same model.

> These are the two precision values calculated for each neural network. Exactly what should 1Darray_of_model1_results look like? is it one value per model or....

This should work:

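import matplotlib.pyplot as plt

# one list per model, one entry per repeated experiment
model_1 = [0.85,   # experiment 1
           0.84]   # experiment 2

model_2 = [0.84,   # experiment 1
           0.83]   # experiment 2

plt.boxplot([model_1, model_2])  # draws one box per model
plt.show()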

However, a boxplot based on only 2 values doesn’t make sense imho; you could just plot the range.
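
To make the range idea concrete, here is a minimal sketch. It reuses the made-up scores from the example above, and score_range / plot_ranges are just hypothetical helper names, not part of matplotlib or scikit-learn:

```python
import matplotlib.pyplot as plt

# Made-up precision scores, one value per repeated experiment.
scores = {
    "model_1": [0.85, 0.84],
    "model_2": [0.84, 0.83],
}


def score_range(values):
    """Return (lowest, highest) score of one model."""
    return min(values), max(values)


def plot_ranges(scores_by_model):
    """Draw one min-to-max bar per model and return the Axes."""
    fig, ax = plt.subplots()
    for x, (name, values) in enumerate(scores_by_model.items(), start=1):
        low, high = score_range(values)
        ax.vlines(x, low, high, linewidth=3)      # the range itself
        ax.plot([x] * len(values), values, "o")   # the individual runs
    ax.set_xticks(range(1, len(scores_by_model) + 1))
    ax.set_xticklabels(list(scores_by_model))
    ax.set_ylabel("precision")
    return ax
```

Calling plot_ranges(scores) followed by plt.show() should display the figure. With more repetitions per model, the same lists could be passed to plt.boxplot instead.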

Best,
Sebastian

> On Oct 30, 2016, at 4:43 PM, Suranga Kasthurirathne <surangakas at gmail.com> wrote:
> 
> 
> Hi Sebastian!
> 
> Thank you, you might be onto something here ;)
> 
> So, I may have to go over 2 models, so McNamara's may not be an option :(
> 
> In regard to your second comment, in building my boxplots, this is how I input results. 
> 
> plt.boxplot(results)
> So what does "results" look like? 
> 
> [0.85433808345719897, 0.8976733724549345]
> 
> These are the two precision values calculated for each neural network. Exactly what should 1Darray_of_model1_results look like? is it one value per model or....
> 
> 
> -- 
> Best Regards,
> Suranga