More CPUs doesn't equal more speed

Christian Gollwitzer auriocus at gmx.de
Fri May 24 03:02:33 EDT 2019


On 23.05.19 at 23:44, Paul Rubin wrote:
> Bob van der Poel <bob at mellowood.ca> writes:
>> for i in range(0, len(filelist), CPU_COUNT):
>>      for z in range(i, i+CPU_COUNT):
>>          doit( filelist[z])
> 
> Write your program to just process one file, then use GNU Parallel
> to run the program on your 1200 files, 6 at a time.
> 

This is a very sensible suggestion. Running GNU parallel on a list of
files is relatively easy. For instance, I use it to resize many images
in parallel like this:

	parallel convert {} -resize 1600 small_{} ::: *.JPG

The {} is replaced for each file in turn.
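
For comparison, here is a rough sketch of what the same "run one command
per file, 6 at a time" pattern could look like if done in Python after
all. The names build_command and run_all are made up for illustration,
and the sketch assumes ImageMagick's "convert" is on the PATH and the
images sit in the current directory:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(path):
    # Mirrors the shell line: convert {} -resize 1600 small_{}
    path = Path(path)
    return ["convert", str(path), "-resize", "1600", f"small_{path.name}"]


def run_all(paths, jobs=6, make_cmd=build_command):
    # Threads are enough here: each worker only waits for an external
    # process, so the real work runs outside the Python interpreter.
    def run_one(path):
        return subprocess.run(make_cmd(path)).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() keeps at most `jobs` commands running at any time and
        # returns the exit codes in input order.
        return list(pool.map(run_one, paths))


if __name__ == "__main__":
    codes = run_all(sorted(Path(".").glob("*.JPG")))
    failed = sum(code != 0 for code in codes)
    print(f"{failed} of {len(codes)} jobs failed")
```

This is more code than the one-liner above and does less, which is
rather the point of the suggestion.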

Another way with an external tool is a Makefile. GNU make can run in 
parallel, by setting the flag "-j", so "make -j6" will run 6 processes in 
parallel. It is more work to set up the Makefile, but it might pay off 
if you have a dependency graph or if the process is interrupted.
"make" can figure out which files need to be processed and therefore 
continue a stopped job.
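
The rule "make" uses to decide this is based on timestamps: a target is
rebuilt if it is missing or older than the file it depends on. A minimal
Python sketch of that check, with hypothetical helper names needs_rebuild
and pending (real "make" also follows whole dependency chains, which this
does not):

```python
from pathlib import Path


def needs_rebuild(source, target):
    # Rebuild when the target does not exist yet, or when the source
    # was modified after the target was last written.
    source, target = Path(source), Path(target)
    if not target.exists():
        return True
    return target.stat().st_mtime < source.stat().st_mtime


def pending(sources, target_for):
    # target_for maps a source path to the output path it produces.
    return [s for s in sources if needs_rebuild(s, target_for(s))]
```

After an interrupted run, pending() would return only the inputs whose
outputs are missing or stale, so the job can continue where it stopped.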

Maybe rewriting all of this from scratch in Python is not worth it.

	Christian