[scikit-learn] Does NMF optimise over observed values

Raphael C drraph at gmail.com
Sun Aug 28 13:16:14 EDT 2016


On Sunday, August 28, 2016, Andy <t3kcit at gmail.com> wrote:

>
>
> On 08/28/2016 12:29 PM, Raphael C wrote:
>
> To give a little context from the web, see e.g.
> http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/
> where it explains:
>
> "
> A question might have come to your mind by now: if we find two matrices
> P and Q such that P × Q approximates R, isn’t that our predictions
> of all the unseen ratings will all be zeros? In fact, we are not really
> trying to come up with P and Q such that we can reproduce R exactly.
> Instead, we will only try to minimise the errors of the observed
> user-item pairs.
> "
>
> Yes, the sklearn interface is not meant for matrix completion but
> matrix-factorization.
> There was a PR for some matrix completion for missing value imputation at
> some point.
>
> In general, scikit-learn doesn't really implement anything for
> recommendation algorithms as that requires a different interface.
>

Thanks Andy. I just looked up that PR.

I was thinking that simply producing a different factorisation, optimised
only over the observed values, wouldn't need a new interface. That in
itself would be hugely useful.
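
To make the idea concrete, here is a minimal sketch of a non-negative
factorisation fitted only to the observed entries. It is not scikit-learn's
NMF and not a proposed API: the function names, the use of None to mark
missing entries, the projected SGD updates and all hyperparameter values
are assumptions chosen for illustration, and R is a toy ratings matrix.

```python
import random


def factorise_observed(R, rank=2, steps=5000, lr=0.01, reg=0.02, seed=0):
    """Fit R ~ W * H using only the entries of R that are not None.

    Hypothetical helper: projected stochastic gradient descent on the
    squared error over observed entries, with a small L2 penalty.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(R), len(R[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.random() for _ in range(n_cols)] for _ in range(rank)]

    # Only these (row, column, value) triples ever enter the loss.
    observed = [(i, j, R[i][j])
                for i in range(n_rows)
                for j in range(n_cols)
                if R[i][j] is not None]

    for _ in range(steps):
        for i, j, r in observed:
            pred = sum(W[i][k] * H[k][j] for k in range(rank))
            err = r - pred
            for k in range(rank):
                w, h = W[i][k], H[k][j]
                # Gradient step, then clip at zero to keep factors non-negative.
                W[i][k] = max(0.0, w + lr * (err * h - reg * w))
                H[k][j] = max(0.0, h + lr * (err * w - reg * h))
    return W, H


def observed_rmse(R, W, H):
    """Root mean squared error over the observed entries only."""
    sq_errors = []
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            if r is not None:
                pred = sum(W[i][k] * H[k][j] for k in range(len(H)))
                sq_errors.append((r - pred) ** 2)
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


# Toy ratings matrix; None marks an unobserved user-item pair.
R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]

W, H = factorise_observed(R)
print("RMSE on observed entries:", observed_rmse(R, W, H))
```

Entries marked None never contribute to the loss, so the product of W and
H at those positions is a prediction rather than an attempt to fit zero.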

I can see that providing a full drop-in recommender system would involve
more work.

Raphael