[scikit-learn] Using a new random number generator in libsvm and liblinear

Ruchika Nayyar ruchika.work at gmail.com
Thu Jan 2 10:45:25 EST 2020


OK
On Thu, Jan 2, 2020, 10:42 AM Adrin <adrin.jalali at gmail.com> wrote:

> Hi,
>
> liblinear and libsvm use the C `rand()` function, which only returns
> numbers up to 32767 on the Windows platform. This PR
> <https://github.com/scikit-learn/scikit-learn/pull/13511> proposes the
> following fix:
>
> *Fixed a convergence issue in ``libsvm`` and ``liblinear`` on Windows
> platforms impacting all related classifiers and regressors. The random
> number generator used to randomly select coordinates in the coordinate
> descent algorithm was C ``rand()``, that is only able to generate numbers
> up to ``32767`` on the Windows platform. It was replaced with C++11
> ``mt19937``, a Mersenne Twister that correctly generates 31-bit/63-bit
> random numbers on all platforms. In addition, the crude "modulo"
> postprocessor used to get a random number in a bounded interval was
> replaced by the tweaked Lemire method as suggested by `this blog post
> <http://www.pcg-random.org/posts/bounded-rands.html>`*
>
> In order to keep the models consistent across platforms, we'd like to use
> the same (new) RNG on all platforms, which means that after this change
> the generated models may be slightly different from what they are now.
> We'd like to hear any concerns on the matter from the community, here or
> on the PR, before merging the fix.
>
> Best,
> Adrin.
>
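
For readers following the thread, the bounded-integer step described in the quoted changelog entry could look roughly like the sketch below. It is an illustration only and not the code from the PR: the helper name `bounded_rand` is hypothetical, and it assumes the 32-bit form of the debiased multiply-and-shift (Lemire) method from the linked blog post, driven by the standard `std::mt19937`.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper, not scikit-learn's actual implementation.
// Returns an integer uniformly distributed in [0, range), assuming range > 0.
// It uses a 32x32 -> 64 bit multiply and rejects the small biased region
// instead of computing `rng() % range`.
inline std::uint32_t bounded_rand(std::mt19937 &rng, std::uint32_t range) {
    // std::mt19937 yields values in [0, 2^32 - 1], so the cast is lossless.
    std::uint32_t x = static_cast<std::uint32_t>(rng());
    std::uint64_t m = static_cast<std::uint64_t>(x) * range;
    std::uint32_t low = static_cast<std::uint32_t>(m);  // low 32 bits of m
    if (low < range) {
        // threshold = 2^32 mod range, computed without 64-bit division
        std::uint32_t threshold = 0u - range;  // wraps to 2^32 - range
        if (threshold >= range) {
            threshold -= range;
            if (threshold >= range) {
                threshold %= range;
            }
        }
        // Redraw while the sample falls in the biased low region.
        while (low < threshold) {
            x = static_cast<std::uint32_t>(rng());
            m = static_cast<std::uint64_t>(x) * range;
            low = static_cast<std::uint32_t>(m);
        }
    }
    return static_cast<std::uint32_t>(m >> 32);  // high 32 bits are the result
}
```

With `rand()` capped at 32767, an expression such as `rand() % n` can never select an index of 32768 or above when `n` is larger than that, whereas `bounded_rand(rng, n)` can reach every index in `[0, n)`.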