Parallel processing on shared data structures

psaffrey at googlemail.com
Thu Mar 19 13:46:34 EDT 2009


I'm filing 160 million data points into a set of bins based on their
position. At the moment, this takes just over an hour using interval
trees. I would like to parallelise this to take advantage of my
quad-core machine. I have some experience with Parallel Python (PP), but
PP only seems to work well for problems where each worker does one
discrete piece of processing and the results are recombined at the end.
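
For illustration, the "process discrete pieces, then recombine" pattern
could be applied to binning roughly as follows. This is only a sketch
under assumptions that are not in the original post: it uses the standard
library's multiprocessing module rather than Parallel Python, it assumes
the bins are non-overlapping and described by a sorted list of edges
(the hypothetical EDGES), and it replaces the interval tree with a
bisect lookup. All function names are made up for the example.

```python
from bisect import bisect_right
from collections import defaultdict
from multiprocessing import Pool

# Hypothetical bin layout: sorted, non-overlapping edges, so that
# bin i covers the half-open range [EDGES[i], EDGES[i + 1]).
EDGES = [0, 10, 20, 30, 40]


def bin_chunk(positions):
    """File one chunk of positions into private, per-worker bins."""
    bins = defaultdict(list)
    for pos in positions:
        i = bisect_right(EDGES, pos) - 1
        if 0 <= i < len(EDGES) - 1:  # skip points outside every bin
            bins[i].append(pos)
    return dict(bins)


def merge_bins(partials):
    """Recombine the per-chunk bins into one dict of bin index -> points."""
    merged = defaultdict(list)
    for partial in partials:
        for i, items in partial.items():
            merged[i].extend(items)
    return dict(merged)


def chunked(seq, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def bin_in_parallel(positions, workers=4, chunk_size=100000):
    """Bin each chunk in a separate process, then merge the results."""
    with Pool(workers) as pool:
        partials = pool.map(bin_chunk, chunked(positions, chunk_size))
    return merge_bins(partials)
```

Because each worker fills its own bins, no locking is needed; the only
shared step is the final merge. Calling bin_in_parallel(data) should be
done under an `if __name__ == "__main__":` guard on platforms that start
worker processes by re-importing the main module. Whether this is faster
in practice depends on how much time goes into copying the points to and
from the worker processes.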

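I guess I could thread my code and use mutexes to protect the shared
lists that every thread is filing into. However, my understanding is
that Python threads all run inside a single process, and that the
global interpreter lock lets only one of them execute Python code at a
time, so this won't let me use more than one core.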
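
A minimal sketch of the threaded variant described above, with a mutex
guarding the shared lists, might look like this. The bin layout (EDGES)
and all names are assumptions made for the example, not taken from the
original code. In CPython this is safe, but because of the global
interpreter lock it is not expected to run faster on a multi-core
machine.

```python
import threading
from bisect import bisect_right

# Hypothetical bin layout: bin i covers [EDGES[i], EDGES[i + 1]).
EDGES = [0, 10, 20, 30, 40]

bins = [[] for _ in range(len(EDGES) - 1)]  # shared lists, one per bin
bins_lock = threading.Lock()                # mutex protecting `bins`


def file_points(positions):
    """File each position into the shared bins while holding the lock."""
    for pos in positions:
        i = bisect_right(EDGES, pos) - 1
        if 0 <= i < len(bins):  # skip points outside every bin
            with bins_lock:
                bins[i].append(pos)


def file_with_threads(positions, n_threads=4):
    """Give each thread a slice of the data and wait for all to finish."""
    chunks = [positions[k::n_threads] for k in range(n_threads)]
    threads = [threading.Thread(target=file_points, args=(chunk,))
               for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The order of points inside each bin depends on thread scheduling, so
anything that relies on insertion order would need to sort afterwards.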

Does anybody have any suggestions for this?

Peter


