Threads vs. processes: what to consider in choosing?

Philip Semanchuk philip at semanchuk.com
Tue Feb 17 12:08:29 EST 2009


On Feb 17, 2009, at 10:18 AM, Barak, Ron wrote:

> I have a wxPython application that builds an internal database from
> a list of files and then displays various aspects of that data
> in response to the user's requests.
>
> I want to add a module that finds events in a set of log files  
> (LogManager).
> These log files are potentially huge, and the initial processing is  
> lengthy (several minutes).
> Thus, when the user chooses LogManager, it would be unacceptable
> to block the other parts of the program, so the initial
> LogManager processing would need to be done separately from the
> normal run of the program.
> Once the initial processing is done, the main program would be  
> notified and could display the results of LogManager processing.
>
> I was thinking of either using threads, or using separate processes,  
> for the main programs and LogManager.
>
> What would you suggest I consider in choosing between the two
> options?
> Are there other options besides threads and multiprocessing?

Hi Ron,
As a general rule, it is a lot easier to share data between threads
than between processes, because threads live in the same address
space while processes have to pass data back and forth explicitly.
The multiprocessing library makes the latter easier, but it has only
been part of the standard library since Python 2.6. The design of
your application matters a lot. For instance, will the processing
code write its results to a database, ping the GUI code and then
exit, allowing the GUI to read the database? That sounds like an
excellent setup for processes.
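
To make that last design concrete, here is a minimal sketch, not a
definitive implementation. Everything specific in it is my assumption:
the worker is launched with subprocess, the database is SQLite, an
"event" is any log line containing ERROR, and the names WORKER_SOURCE,
start_worker, read_events and run_demo are hypothetical. The "ping" is
simplified to the worker's exit, which the GUI side detects with
poll(). multiprocessing.Process or a pipe would serve just as well.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile
import time

# Source of the hypothetical worker program. Normally this would live in
# its own file; it is inlined here to keep the sketch self-contained.
WORKER_SOURCE = r'''
import sqlite3
import sys

db_path, log_paths = sys.argv[1], sys.argv[2:]
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS events (line TEXT)")
for path in log_paths:
    with open(path) as log:
        # Assumption: an "event" is any line containing the word ERROR.
        rows = [(line.strip(),) for line in log if "ERROR" in line]
    conn.executemany("INSERT INTO events VALUES (?)", rows)
conn.commit()
conn.close()
'''


def start_worker(db_path, log_paths):
    """Launch the log processing in a separate process; return at once."""
    command = [sys.executable, "-c", WORKER_SOURCE, db_path] + list(log_paths)
    return subprocess.Popen(command)


def read_events(db_path):
    """GUI side: read the finished results back out of the database."""
    conn = sqlite3.connect(db_path)
    query = "SELECT line FROM events ORDER BY rowid"
    rows = [row[0] for row in conn.execute(query)]
    conn.close()
    return rows


def run_demo():
    """Stand-in for the GUI: start the worker, wait for it, show results."""
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "app.log")
    with open(log_path, "w") as log:
        log.write("INFO start\nERROR disk full\nINFO stop\nERROR timeout\n")
    db_path = os.path.join(workdir, "events.db")

    worker = start_worker(db_path, [log_path])
    # A real GUI would call worker.poll() from a timer event so that it
    # stays responsive; this loop only stands in for that timer.
    while worker.poll() is None:
        time.sleep(0.05)
    if worker.returncode != 0:
        raise RuntimeError("log processing failed")
    return read_events(db_path)


if __name__ == "__main__":
    print(run_demo())
```

The GUI never touches the worker's memory here. The two sides share
only the database file and the exit status, which is why this layout
sidesteps the data-sharing difficulty mentioned above.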

In addition, there's the GIL (Global Interpreter Lock) to consider.
Multi-process applications aren't affected by it while multi-threaded
applications may be: in CPython only one thread at a time can execute
Python bytecode, so CPU-bound threads don't run in parallel on
multiple cores. Now that multi-processor/multi-core machines are more
common, this fact is ever more important. Torrents of words have been
written about the GIL on this list and elsewhere and I have nothing
useful to add to the torrents. I encourage you to read some of those
conversations.

FWIW, when I was faced with a similar setup, I went with multiple  
processes rather than threads.

Last but not least, since you asked about alternatives to threads and  
multiprocessing, I'll point you to some low level libraries I wrote  
for doing interprocess communication:
http://semanchuk.com/philip/posix_ipc/
http://semanchuk.com/philip/sysv_ipc/

Good luck
Philip






