[scikit-learn] [ANN] Scikit-learn 0.20.0

Andreas Mueller t3kcit at gmail.com
Fri Sep 28 15:42:48 EDT 2018



On 09/28/2018 03:20 PM, Javier López wrote:
> I understand the difficulty of the situation, but an approximate 
> solution to that is saving the predictions from a large enough 
> validation set. If the predictions from the newly created model are 
> "close enough" to the old ones, we deem the deserialized model to be 
> the same and move forward, if there are serious discrepancies, then we 
> dive deep to see what's going on, and if needed refit the offending 
> submodels with the newer version.
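
For concreteness, the check described above might look something like
the sketch below; the function name, tolerances, and file name are
illustrative assumptions, not anything scikit-learn provides:

    import numpy as np

    # Hypothetical helper for the approach quoted above: compare the
    # reloaded model's predictions on a held-out validation set against
    # predictions saved before the scikit-learn upgrade.
    def predictions_match(model, X_val, reference_preds,
                          rtol=1e-5, atol=1e-8):
        """True if the reloaded model reproduces the saved predictions."""
        return np.allclose(model.predict(X_val), reference_preds,
                           rtol=rtol, atol=atol)

    # At save time:  np.save("reference_preds.npy", model.predict(X_val))
    # After loading the pickle under a newer scikit-learn:
    #   if not predictions_match(model, X_val,
    #                            np.load("reference_preds.npy")):
    #       ...  # dive deep, refit the offending submodels if needed
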

Basically what you're saying is that you're fine with versioning the 
models and having the model break loudly if anything changes.
That's not actually what most people want. They want to be able to make 
predictions with a given model forever into the future.

Your use-case is similar, but if retraining the model is not an issue, 
why don't you want to retrain every time scikit-learn releases a new 
version?
We're now storing the version of scikit-learn that was used in the 
pickle, and we warn if you're trying to load it with a different version.
That's basically a stricter test than the one you wanted. Yes, there are 
false positives, but given that this release took a year,
that doesn't seem like that big an issue?
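
If you would rather break loudly than proceed on a warning, you can
promote it to an error yourself; a rough sketch (escalating the warning
to an error is my suggestion here, not scikit-learn's default):

    import pickle
    import warnings

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    blob = pickle.dumps(LogisticRegression().fit(X, y))
    # The pickle now carries the scikit-learn version it was built with.

    # Unpickling under a *different* scikit-learn version emits a
    # UserWarning; turn it into an error so the load fails loudly
    # instead of silently handing back a possibly inconsistent model.
    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        model = pickle.loads(blob)
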

