[scikit-learn] sklearn.model_selection.GridSearchCV - unable to use n_jobs>1 on MacOS Sierra python 2.7

Sumeet Sandhu sumeet.k.sandhu at gmail.com
Wed Jan 10 14:41:15 EST 2018

And just now, the first case stopped working too: the 15MB training data
now also causes python to die abruptly.
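
For reference on the question quoted below about what the jobs are: GridSearchCV runs one fit per (parameter combination, CV fold) pair, and those fits are the tasks that n_jobs spreads across processes. A minimal sketch of the count follows. The param_grid here is hypothetical, since the real one is not shown in this thread; cv=10 is taken from the call quoted below, and count_fits is just an illustrative helper name.

```python
from itertools import product

# Hypothetical grid for illustration only; the actual param_grid
# used in this thread is not shown.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0], "penalty": ["l1", "l2"]}
cv = 10  # number of folds, as in the quoted GridSearchCV call


def count_fits(param_grid, cv):
    """Count the (parameter combination, CV fold) fits of a grid search."""
    keys = sorted(param_grid)
    # Cartesian product of all candidate values, one tuple per combination
    combos = list(product(*(param_grid[k] for k in keys)))
    return len(combos) * cv


n_fits = count_fits(param_grid, cv)
print(n_fits)
```

With refit=True there is one additional fit on the full training set after the search has finished.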

On Mon, Jan 8, 2018 at 9:22 PM, Sumeet Sandhu <sumeet.k.sandhu at gmail.com>
wrote:

> There are two cases: n_jobs > 1 works when the data is smaller, i.e. when
> the training docs numpy array is 15MB. It does not work when the training
> matrix is 100MB. My Mac has 16GB RAM.
> In the second case, the jobs die out within seconds, and the main python
> process seems to die too (minimal CPU usage). A popup message says
> 'python processes appear to have died'. This is when I run python from the
> bash command line.
> When I run it in the Python GUI IDLE, a message pops up: 'your program is
> still running, sure you want to close window'.
> What are these jobs anyway? Are they the various parameter combinations in
> param_grid, or lower-level jobs from the compiler etc.?
> Does each job replicate the training data in RAM?
> regards
> On Sun, Jan 7, 2018 at 11:35 AM, Sumeet Sandhu <sumeet.k.sandhu at gmail.com>
> wrote:
>> Hi,
>> I was able to run this with n_jobs=-1, and the activity monitor does show
>> all 8 CPUs engaged, but the jobs start to die out one by one. I tried with
>> n_jobs=2, same story.
>> The only option that works is n_jobs=1.
>> I played around with 'pre_dispatch' a bit - unclear what that does.
>> GRID = GridSearchCV(LogisticRegression(), param_grid, scoring=None,
>>                     fit_params=None, n_jobs=1, iid=True, refit=True,
>>                     cv=10, verbose=0, error_score=0,
>>                     return_train_score=False)
>> GRID.fit(trainDocumentV, trainLabelV)
>> How can I sustain at least 3-4 parallel jobs?
>> thanks,
>> Sumeet
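
On pre_dispatch, mentioned above: it is the GridSearchCV parameter that limits how many tasks are dispatched ahead of the workers, with a default of '2*n_jobs'. The sketch below is back-of-the-envelope arithmetic only. It assumes, pessimistically, that every dispatched task holds its own full copy of the training data, which may overstate real usage because joblib can share large numpy arrays between workers through memory mapping in some configurations. The helper names dispatched_tasks and worst_case_mb are illustrative and not part of scikit-learn.

```python
def dispatched_tasks(n_jobs, pre_dispatch="2*n_jobs"):
    """Number of tasks in flight for an int or 'k*n_jobs' style setting."""
    if isinstance(pre_dispatch, int):
        return pre_dispatch
    expr = pre_dispatch.replace(" ", "")
    if expr == "n_jobs":
        return n_jobs
    # Only the simple 'k*n_jobs' form is handled in this sketch
    factor, _, name = expr.partition("*")
    if name != "n_jobs":
        raise ValueError("unsupported pre_dispatch expression: %r" % pre_dispatch)
    return int(factor) * n_jobs


def worst_case_mb(data_mb, n_jobs, pre_dispatch="2*n_jobs"):
    """Pessimistic estimate: one full data copy per dispatched task."""
    return data_mb * dispatched_tasks(n_jobs, pre_dispatch)


# 100MB training matrix, 8 workers, default pre_dispatch
print(worst_case_mb(100, 8))
# Same data, 2 workers, pre_dispatch lowered to 'n_jobs'
print(worst_case_mb(100, 2, "n_jobs"))
```

Under that assumption, lowering n_jobs and pre_dispatch together shrinks the estimate from 1600MB to 200MB, so it may be worth trying something like n_jobs=3 with pre_dispatch='n_jobs'. Whether memory is actually what kills the processes here is not established in this thread.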