multiprocessing vs thread performance

Mon Dec 29 15:22:44 EST 2008

Roy Smith <roy at panix.com> writes:

> In article <mailman.6337.1230563873.3487.python-list at python.org>,
>  Christian Heimes <lists at cheimes.de> wrote:
>
>> You have missed an important point. A well designed application does
>> neither create so many threads nor processes. The creation of a thread
>> or forking of a process is an expensive operation. You should use a pool
>> of threads or processes.
>
> It's worth noting that forking a new process is usually a much more 
> expensive operation than creating a thread.

If by "forking" you mean an actual fork() call, as opposed to invoking
a different executable, the difference is not necessarily that great.
Modern Unix systems tend to implement a 1:1 mapping between threads
and kernel processes, so creating a thread and forking a process
require a similar amount of work.

On my system, as measured by timeit, spawning and joining a thread
takes 111 usecs, while forking and waiting for a process takes 260
usecs.  Slower, but not catastrophically so.
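
For anyone who wants to try this on their own machine, a measurement
along these lines should do.  This is only a rough sketch, not the
exact script behind the numbers above; the helper names
(spawn_thread, fork_process, usecs_per_call) are made up for
illustration, and it assumes a Unix system where os.fork() exists.

```python
import os
import threading
import timeit


def spawn_thread():
    # Create a thread that does nothing, then wait for it to finish.
    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()


def fork_process():
    # fork() a child that exits immediately, then reap it in the parent.
    pid = os.fork()
    if pid == 0:
        # Child: leave right away, skipping cleanup handlers.
        os._exit(0)
    os.waitpid(pid, 0)


def usecs_per_call(func, number=1000):
    # Average wall-clock time per call, in microseconds.
    return timeit.timeit(func, number=number) / number * 1e6


if __name__ == "__main__":
    print("thread: %.0f usecs" % usecs_per_call(spawn_thread))
    print("fork:   %.0f usecs" % usecs_per_call(fork_process))
```

The absolute figures will vary with the OS, the hardware and the size
of the parent process, so only the ratio is worth comparing.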

> Not that I would want to create 100,000 of either!

Agreed.

> Not everybody realizes it, but threads eat up a fair chunk of memory
> (you get one stack per thread, which means you need to allocate a
> hunk of memory for each stack).  I did a quick look around; 256k
> seems like a common default stack size.  1 meg wouldn't be unheard
> of.

Note that this memory is virtual memory, so it doesn't consume
physical RAM until it is actually used.  I've seen systems running legacy
Java applications that create thousands of threads where *virtual*
memory was the bottleneck.
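
If the per-thread stack reservation does become a problem in Python,
threading.stack_size() can request a smaller stack for threads
created afterwards.  A minimal sketch follows, assuming the platform
allows the size to be changed (the call raises RuntimeError where it
does not, and ValueError for a size it considers invalid).  The
helper name run_with_stack_size is hypothetical, and 256k is simply
the figure quoted above, not a recommendation.

```python
import threading


def run_with_stack_size(size_bytes, func):
    # Remember the current setting (0 means "platform default").
    previous = threading.stack_size()
    # Applies only to threads created after this call.
    threading.stack_size(size_bytes)
    try:
        t = threading.Thread(target=func)
        t.start()
        t.join()
    finally:
        # Restore the earlier setting for the rest of the program.
        threading.stack_size(previous)


if __name__ == "__main__":
    results = []
    run_with_stack_size(256 * 1024, lambda: results.append("done"))
    print(results)
```

Deeply recursive code may need more stack than this, so the value has
to be picked with the actual workload in mind.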