[scikit-learn] PyCM: Multiclass confusion matrix library in Python

Mon Jun 4 11:40:51 EDT 2018

On 5/31/18 1:26 PM, Stuart Reynolds wrote:
> Hi Sepand,
>
> Thanks for this -- looks useful. I had to write something similar (for
> the binary case) and wish scikit had something like this.
Which part of it? I'm not entirely sure I understand what the core 
functionality is.
>
> I wonder if there's something similar for the binary class case where,
> the prediction is a real value (activation) and from this we can also
> derive
>   - CMs for all prediction cutoff (or set of cutoffs?)
>   - scores over all cutoffs (AUC, AP, ...)
AUC and AP are by definition computed over all cut-offs. And CMs for all
cut-offs don't seem like a good idea, because there will be up to n_samples
of them in the general case. If you want to specify a set of cut-offs, that
would be pretty easy to do.
How do you find these cut-offs, though?
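
For a user-specified set of cut-offs, a minimal sketch could look like the
following. The helper name confusion_matrices_at_cutoffs, the toy data and
the "score >= cut-off means positive" convention are assumptions made for
illustration only; just confusion_matrix is part of sklearn.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def confusion_matrices_at_cutoffs(y_true, y_score, cutoffs):
    """Hypothetical helper: one 2x2 confusion matrix per cut-off.

    A sample is predicted positive when its score is >= the cut-off
    (an assumed convention, not something sklearn prescribes).
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    return {
        float(c): confusion_matrix(
            y_true, (y_score >= c).astype(int), labels=[0, 1]
        )
        for c in cutoffs
    }


# Toy data, made up for the example.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]

cms = confusion_matrices_at_cutoffs(y_true, y_score, cutoffs=[0.3, 0.5, 0.7])
for cutoff, cm in cms.items():
    # Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
    print(cutoff, cm.tolist())
```

This only covers the mechanics; it says nothing about which cut-offs are
worth looking at.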
>
> For me, in analyzing (binary class) performance, reporting scores for
> a single cutoff is less useful than seeing how the many scores (tpr,
> ppv, mcc, relative risk, chi^2, ...) vary at various false positive
> rates, or prediction quantiles.
You can already do that with sklearn: roc_curve and precision_recall_curve
return the rates at every distinct threshold. Granted, it's not as
convenient as it could be, but we're working on it.
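
As a rough sketch of what that can look like today, roc_curve gives the
false and true positive rates at every threshold, and other scores such as
PPV can be derived from them and the class counts. The toy data and
variable names are made up; drop_intermediate=False is passed so that no
thresholds are skipped.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy data, made up for the example.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.6, 0.9])

# One (fpr, tpr) pair per threshold; keep every threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score, drop_intermediate=False)

n_pos = y_true.sum()
n_neg = len(y_true) - n_pos

# Recover counts from the rates, then derive PPV (precision).
tp = tpr * n_pos
fp = fpr * n_neg
denom = tp + fp
# PPV is undefined where nothing is predicted positive; leave NaN there.
ppv = np.divide(tp, denom, out=np.full_like(tp, np.nan), where=denom > 0)

for t, f, r, p in zip(thresholds, fpr, tpr, ppv):
    print(f"threshold={t:.2f}  fpr={f:.2f}  tpr={r:.2f}  ppv={p:.2f}")
```

Other scores (mcc, relative risk, ...) can be computed from tp, fp and the
class counts in the same way.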

The crucial point for me is really how to pick the cut-offs.

Cheers,

Andy