[scikit-learn] precision_recall_curve giving incorrect results on very small example
David R
dabruro at gmail.com
Tue Apr 28 13:41:14 EDT 2020
Here is a very small example using precision_recall_curve():
from sklearn.metrics import precision_recall_curve, precision_score, recall_score
y_true = [0, 1]
y_predict_proba = [0.25, 0.75]
precision, recall, thresholds = precision_recall_curve(y_true, y_predict_proba)
precision, recall
which results in:
(array([1., 1.]), array([1., 0.]))
Now let's calculate manually to see whether that's correct. There are
three possible predicted-class vectors, depending on the threshold: [0, 0],
[0, 1], and [1, 1]. We have to discard [0, 0] because it gives an undefined
precision (division by zero). So, applying precision_score() and
recall_score() to the other two:
y_predict_class=[0,1]
precision_score(y_true, y_predict_class), recall_score(y_true, y_predict_class)
which gives:
(1.0, 1.0)
and
y_predict_class=[1,1]
precision_score(y_true, y_predict_class), recall_score(y_true, y_predict_class)
which gives
(0.5, 1.0)
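As a sanity check on the two hand calculations above, here is a small sketch that computes precision and recall directly from confusion counts, without sklearn (the helper name manual_precision_recall is just for illustration):

```python
# Compute precision = TP / (TP + FP) and recall = TP / (TP + FN)
# from raw counts, for both candidate class vectors above.
def manual_precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Note: precision is undefined (ZeroDivisionError here) when nothing
    # is predicted positive, which is why [0, 0] was discarded above.
    return tp / (tp + fp), tp / (tp + fn)

y_true = [0, 1]
print(manual_precision_recall(y_true, [0, 1]))  # (1.0, 1.0)
print(manual_precision_recall(y_true, [1, 1]))  # (0.5, 1.0)
```

Both results agree with the precision_score()/recall_score() values quoted above.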
This does not seem to match the output of precision_recall_curve(), which,
for example, never produced a precision value of 0.5.
Am I missing something?