[scikit-learn] Scikit Learn in a Cray computer

Mauricio Reis reismc at ime.eb.br
Fri Jun 28 16:04:07 EDT 2019


Sorry, but I have just now reread your answer more closely.

It seems that the "n_jobs" parameter of the DBSCAN routine brings no 
performance benefit here. If I want to improve the performance of the 
DBSCAN routine, I will have to redesign the solution to use MPI 
resources.

Is that correct?
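
(To be concrete, the usage in question is just the ordinary
scikit-learn API; below is a minimal sketch, with the data and the
"eps"/"min_samples" values chosen only for illustration:)

from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Synthetic data, for illustration only.
X, _ = make_blobs(n_samples=10000, centers=5, random_state=0)

# "n_jobs" parallelizes the neighbor searches on a single machine
# (via joblib); it does not distribute work across cluster nodes.
db = DBSCAN(eps=0.5, min_samples=5, n_jobs=4).fit(X)
print(db.labels_[:10])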

---
Regards,
Mauricio Reis

On 28/06/2019 16:47, Mauricio Reis wrote:
> My laptop has an Intel i7 processor with 4 cores. When I run the program
> on Windows 10, the "joblib.cpu_count()" routine returns "4". In this
> case, the same test I had run on the Cray computer showed a 10% increase
> in the processing time of the DBSCAN routine when I used the "n_jobs =
> 4" parameter, compared to the processing time of that routine without
> this parameter. Do you know what causes the longer processing time
> when I use "n_jobs = 4" on my laptop?
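> 
> (A rough sketch of the kind of comparison I mean; this is not my
> actual program, and the data and parameters are placeholders:)
> 
> import time
> from sklearn.cluster import DBSCAN
> from sklearn.datasets import make_blobs
> 
> X, _ = make_blobs(n_samples=20000, centers=10, random_state=0)
> 
> # Time the same fit with and without worker processes/threads.
> for n_jobs in (None, 4):
>     start = time.perf_counter()
>     DBSCAN(eps=0.5, min_samples=5, n_jobs=n_jobs).fit(X)
>     print("n_jobs =", n_jobs, ":", time.perf_counter() - start, "s")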
> 
> ---
> Regards,
> Mauricio Reis
> 
On 28/06/2019 06:29, Brown J.B. via scikit-learn wrote:
>>> where you can see "ncpus = 1" (I still do not know why 4 lines were
>>> printed -
>>> 
>>> (total of 40 nodes) and each node has 1 CPU and 1 GPU!
>> 
>>> #PBS -l select=1:ncpus=8:mpiprocs=8
>>> aprun -n 4 p.sh ./ncpus.py
>> 
>> You can request 8 CPUs from a job scheduler, but if each node the
>> script runs on contains only one virtual/physical core, then
>> cpu_count() will return 1.
>> If that CPU supports multi-threading, you would typically get 2.
>> 
>> For example, on my workstation:
>> `--> egrep "processor|model name|core id" /proc/cpuinfo
>> processor : 0
>> model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
>> core id : 0
>> processor : 1
>> model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
>> core id : 1
>> processor : 2
>> model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
>> core id : 0
>> processor : 3
>> model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
>> core id : 1
>> `--> python3 -c "from sklearn.externals import joblib; print(joblib.cpu_count())"
>> 4
>> 
>> It seems that in this situation, if you want to parallelize
>> *independent* sklearn calculations (e.g., changing the dataset or
>> random seed), you'll request the MPI processes from PBS as you already
>> have, but you'll need to place the sklearn computation in a function
>> and then take care of distributing that function call across the MPI
>> processes, as in the sketch below.
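>> 
>> A minimal sketch of that idea, assuming mpi4py is available on the
>> compute nodes (the seeds and DBSCAN parameters are placeholders):
>> 
>> from mpi4py import MPI
>> from sklearn.cluster import DBSCAN
>> from sklearn.datasets import make_blobs
>> 
>> comm = MPI.COMM_WORLD
>> rank = comm.Get_rank()
>> size = comm.Get_size()
>> 
>> def run_one(seed):
>>     # One independent sklearn computation; returns the seed and the
>>     # number of distinct labels DBSCAN found (noise counts as one).
>>     X, _ = make_blobs(n_samples=10000, centers=5, random_state=seed)
>>     labels = DBSCAN(eps=0.5, min_samples=5).fit(X).labels_
>>     return (seed, len(set(labels)))
>> 
>> # Round-robin the seeds over the MPI ranks, then gather on rank 0.
>> results = [run_one(s) for s in range(8) if s % size == rank]
>> all_results = comm.gather(results, root=0)
>> if rank == 0:
>>     print(all_results)
>> 
>> (Launched with the same style of command as in your job script,
>> e.g. something like "aprun -n 4 python script.py".)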
>> 
>> Then again, if the runs are independent, it's a lot easier to write a
>> loop in a script that changes the dataset/seed and submits each run to
>> the job scheduler, letting the scheduler take care of the parallel
>> distribution (a sketch follows below).
>> (I do this when performing 10+ independent runs of sklearn modeling,
>> where models use multiple threads during calculations; in my case,
>> SLURM then takes care of finding the available nodes to distribute the
>> work to.)
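>> 
>> Sketched in Python rather than shell here (the "qsub" arguments and
>> the job script name "run_dbscan.pbs" are placeholders for whatever
>> your site uses):
>> 
>> import subprocess
>> 
>> for seed in range(10):
>>     # Submit one independent scheduler job per run; the job script
>>     # can read SEED from its environment (PBS "-v" passes it along).
>>     subprocess.run(["qsub", "-v", f"SEED={seed}", "run_dbscan.pbs"],
>>                    check=True)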
>> 
>> Hope this helps.
>> J.B.
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn

