[scikit-learn] How to not recalculate transformer in a Pipeline?

Gael Varoquaux gael.varoquaux at normalesup.org
Mon Nov 28 17:17:22 EST 2016


Actually, thinking a bit more about this, the drawback of the pattern
that I lay out below is that it adds an extra level of indirection in
the parameter setting. One way to avoid this would be to have a subclass
of the pipeline that includes memoization: its fit would call a memoized
version of the fit of each transformer.
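
To make the idea concrete, here is a rough sketch of what such a subclass
could look like. It is only an illustration under assumptions: the names
MemoizedPipeline, _cached_fit_transform and real_fits are hypothetical and
not part of scikit-learn, the cache lives in a temporary directory, and
only fit is handled (no fit_transform, no fit parameters).

```python
import tempfile

import numpy as np
from joblib import Memory
from sklearn.base import clone
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Assumption: a throwaway cache directory; any writable path would do.
pipeline_memory = Memory(tempfile.mkdtemp(), verbose=0)

# Bookkeeping for this demo only: one entry per fit that really ran.
real_fits = []


@pipeline_memory.cache
def _cached_fit_transform(transformer, X, y):
    """Fit one transformer; joblib skips the body on a cache hit."""
    real_fits.append(transformer.get_params()["n_components"])
    Xt = transformer.fit_transform(X, y)
    return transformer, Xt


class MemoizedPipeline(Pipeline):
    """Hypothetical Pipeline subclass with memoized transformer fits."""

    def fit(self, X, y=None):
        self.steps = list(self.steps)
        Xt = X
        for idx, (name, transformer) in enumerate(self.steps[:-1]):
            # Clone so that the hash depends on parameters, not fitted state.
            fitted, Xt = _cached_fit_transform(clone(transformer), Xt, y)
            self.steps[idx] = (name, fitted)
        # The final estimator is always refit.
        self.steps[-1][1].fit(Xt, y)
        return self


rng = np.random.RandomState(0)
X = rng.rand(60, 5)
y = (X[:, 0] > 0.5).astype(int)

pipe = MemoizedPipeline([("pca", PCA(n_components=2)),
                         ("clf", LogisticRegression())])
pipe.fit(X, y)                         # PCA is fitted for real
pipe.set_params(clf__C=10.0)           # usual parameter syntax, no indirection
pipe.fit(X, y)                         # PCA comes from the cache
pipe.set_params(pca__n_components=3)
pipe.fit(X, y)                         # PCA parameters changed: refitted
```

Parameters are set with the standard step__param syntax, so the extra
indirection disappears; the price is that the caching logic lives inside
the pipeline class.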

I think that it would be quite handy :).

Should I open an issue on that?

G

On Mon, Nov 28, 2016 at 07:51:21PM +0100, Gael Varoquaux wrote:
> On Mon, Nov 28, 2016 at 01:46:08PM -0500, Andreas Mueller wrote:
> > I guess so. You'd handle parameters using an estimator_params dict in init
> > and pass that to the caching function?

> I'd try to set them on the estimator before passing it to the function,
> as we do in standard scikit-learn; joblib is clever enough to take that
> into account when given the estimator as an argument of the function
> that is memoized.

> G
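
The quoted suggestion (set the parameters on the estimator, then hand the
estimator to the memoized function) can be sketched roughly as follows.
This is a minimal illustration, not code from scikit-learn: the function
name fit_transform_cached and the executed list are hypothetical, and the
cache directory is a temporary one chosen for the example. joblib hashes
the arguments of the cached function, so an estimator with different
parameters gives a different cache entry.

```python
import tempfile

import numpy as np
from joblib import Memory
from sklearn.base import clone
from sklearn.decomposition import PCA

# Assumption: a throwaway cache directory; any writable path would do.
memory = Memory(tempfile.mkdtemp(), verbose=0)

# Bookkeeping for this demo only: one entry per call that really ran.
executed = []


@memory.cache
def fit_transform_cached(transformer, X):
    """Fit the transformer and return it along with the transformed data."""
    executed.append(transformer.get_params()["n_components"])
    Xt = transformer.fit_transform(X)
    return transformer, Xt


rng = np.random.RandomState(0)
X = rng.rand(50, 5)

pca = PCA(n_components=2)
fit_transform_cached(clone(pca), X)     # computed
fit_transform_cached(clone(pca), X)     # same parameters and data: cache hit

pca.set_params(n_components=3)          # set on the estimator, as usual
fitted, Xt = fit_transform_cached(clone(pca), X)   # new hash: recomputed
```

The estimator is cloned before each call so that the hash reflects only
its parameters and not any state left over from a previous fit.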

-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

