[SciPy-Dev] General discussion on parallelisation

Gael Varoquaux gael.varoquaux at normalesup.org
Sat Sep 8 18:39:03 EDT 2018


I meant: used heavily.

Gaël

Sent from my phone. Please forgive typos and briefness.

On Sep 9, 2018, at 00:04, Phillip Feldman <phillip.m.feldman at gmail.com> wrote:
>I don't understand what is meant by the phrase "used in anger".
>
>Phillip
>
>On Mon, Sep 3, 2018 at 2:22 AM Gael Varoquaux <gael.varoquaux at normalesup.org> wrote:
>
>> On Sun, Sep 02, 2018 at 08:58:12PM -0700, Ralf Gommers wrote:
>> >     joblib has a custom backend framework that can be used for such
>> >     purposes (if I understand you well):
>> >     https://pythonhosted.org/joblib/parallel.html#custom-backend-api-experimental
>>
>>
>> > Updated link (status is still experimental):
>> > https://joblib.readthedocs.io/en/latest/parallel.html#custom-backend-api-experimental
>>
>> Still experimental, but less and less :). We are using this in anger
>> these days.
>>
>> > There's also this JoblibPool that can be taken over:
>> > https://github.com/adrn/schwimmbad/blob/master/schwimmbad/jl.py#L14
>> > Seems simpler than a backend still tagged experimental.
>>
>> Well, we have a fairly stringent definition of experimental. This
>> feature is no longer very experimental.
>>
>> >     This is evolving. However, the reason behind this is that Pool
>> >     gets corrupted and leads to deadlock. Olivier Grisel and Thomas
>> >     Moreau are working on fixing this in the Python standard library
>> >     (first PR merged recently)!
>>
>> > Anyone know the status of this? And can this issue be avoided by the
>> > new loky backend to joblib?
>>
>> This was merged and released a while ago. Loky is now used in anger. So
>> far, I think that people are happy with it. In particular, it is more
>> robust than multiprocessing.Pool (specifically, robust to segfaults),
>> and the improvements have been contributed upstream to the process pool
>> executor in Python's concurrent.futures (available in Python 3.7).
>>
>> Joblib has lately been getting a lot of improvements to make it more
>> robust and scalable [1]. It will still have some overhead, due to
>> pickling.
>> Pickling speed should be solved by coordinating upstream changes in
>> Python with implementations in numpy. Olivier Grisel has been
>> coordinating with Python for this. I believe that PEP 574 [2] is related
>> to these efforts. The specific challenges are to enable fast code paths
>> in cloudpickle, which is necessary to pickle arbitrary objects and code.
>>
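
To illustrate the pickling overhead mentioned above, here is a minimal sketch of the out-of-band buffer mechanism that PEP 574 describes. It assumes Python 3.8 or later, where pickle protocol 5 is part of the standard library. The `payload` bytearray is a hypothetical stand-in for a large numerical array; this is not how joblib or numpy use the feature internally.

```python
import pickle

# Hypothetical stand-in for a large numerical array: 1 MB of zeros.
payload = bytearray(1_000_000)

# In-band pickling: the whole payload is copied into the pickle stream.
in_band = pickle.dumps(payload, protocol=5)

# Out-of-band pickling: wrapping the data in PickleBuffer and passing a
# buffer_callback lets protocol 5 hand the buffer over separately, so the
# pickle stream itself only holds a small placeholder.
buffers = []
meta = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=buffers.append)

# The receiving side must supply the same buffers, in the same order.
restored = pickle.loads(meta, buffers=buffers)

print(len(in_band), len(meta), len(buffers))
```

The out-of-band stream stays tiny because the data travels through `buffers` instead of being copied, which is the kind of fast path the paragraph above alludes to.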
>> While simpler multiprocessing-based code will sometimes give less
>> overhead compared to joblib, it will probably be brittle.
>>
>> I think that the best way to move forward from here would be to do some
>> prototyping and experimentation.
>>
>> Gaël
>>
>> [1] Joblib changelog:
>> https://joblib.readthedocs.io/en/latest/developing.html#latest-changes
>>
>> [2] Pickling improvement PEP:
>> https://www.python.org/dev/peps/pep-0574/
>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>