[scikit-learn] Micro average in classification report
Andreas Mueller
t3kcit at gmail.com
Tue Oct 9 11:42:19 EDT 2018
On 10/05/2018 12:00 PM, Kevin Markham wrote:
> Hello all,
>
> Congratulations on the release of 0.20! My questions are about the
> updated classification_report:
> http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
>
> Here is the simple example shown in the documentation (apologies for
> the formatting):
>
> >>> from sklearn.metrics import classification_report
> >>> y_true = [0, 1, 2, 2, 2]
> >>> y_pred = [0, 0, 2, 2, 1]
> >>> target_names = ['class 0', 'class 1', 'class 2']
> >>> print(classification_report(y_true, y_pred,
> ...                             target_names=target_names))
>               precision    recall  f1-score   support
>
>      class 0       0.50      1.00      0.67         1
>      class 1       0.00      0.00      0.00         1
>      class 2       1.00      0.67      0.80         3
>
>    micro avg       0.60      0.60      0.60         5
>    macro avg       0.50      0.56      0.49         5
> weighted avg       0.70      0.60      0.61         5
>
> I understand how macro average and weighted average are calculated. My
> questions are in regard to micro average:
>
> 1. From this and other examples, it appears to me that "micro average"
> is identical to classification accuracy. Is that correct?
>
> 2. Is there a reason that micro average is listed three times (under
> the precision, recall, and f1-score columns)? From my understanding,
> that 0.60 number is being calculated once but is being displayed three
> times. The display implies (at least in my mind) that 0.60 is being
> calculated from the three precision numbers, and separately calculated
> from the three recall numbers, and separately calculated from the
> three f1-score numbers, which seems misleading.
>
> 3. The documentation explains micro average as "averaging the total
> true positives, false negatives and false positives". If my
> understanding is correct that micro average is the same as accuracy,
> then why are true negatives any less relevant to the calculation?
> (Also, I don't mean to be picky, but "true positives" etc. are whole
> number counts rather than rates, and so it seems odd to say that you
> are arriving at a rate by averaging counts.)
>
> These may be dumb questions arising from my ignorance... my apologies
> if so!
I had exactly the same comments, and I find the current behavior
confusing; see https://github.com/scikit-learn/scikit-learn/issues/12334
PR welcome!
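
For what it's worth, questions 1 and 3 can be checked by hand. The
following is a minimal sketch, not the library's implementation: it pools
the per-class true positives, false positives and false negatives for the
documentation example and compares the result with accuracy_score and
precision_recall_fscore_support(average="micro"). The variable names (tp,
fp, fn, micro_precision, ...) are illustrative only, and the equality with
accuracy is assumed to hold only for single-label multiclass input where
every label is included in the average.

```python
import math

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Example data from the classification_report documentation.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = [0, 1, 2]

# Pool the counts over all classes (one-vs-rest view of each class).
# True negatives are never needed for precision, recall or F1.
tp = fp = fn = 0
for c in labels:
    for t, p in zip(y_true, y_pred):
        tp += int(t == c and p == c)  # predicted c, and it was c
        fp += int(t != c and p == c)  # predicted c, but it was another class
        fn += int(t == c and p != c)  # it was c, but another class was predicted

# Micro-averaged scores are ratios of the pooled counts.
micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

# Reference values computed by scikit-learn.
accuracy = accuracy_score(y_true, y_pred)
sk_p, sk_r, sk_f, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro"
)

print("pooled counts: tp=%d fp=%d fn=%d" % (tp, fp, fn))
print("micro precision/recall/f1:", micro_precision, micro_recall, micro_f1)
print("accuracy:", accuracy)
print("all equal:", all(
    math.isclose(v, accuracy)
    for v in (micro_precision, micro_recall, micro_f1, sk_p, sk_r, sk_f)
))
```

In this setting every misclassified sample counts once as a false positive
(for the predicted class) and once as a false negative (for the true
class), so the pooled FP and FN totals are equal and all three
micro-averaged columns collapse to accuracy. That need not hold for
multilabel input or when only a subset of labels is averaged.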