[Chicago] Kickstarter Fund to get rid of the GIL

Tal Liron tal.liron at threecrickets.com
Sun Jul 24 20:58:54 CEST 2011


I would say that there's truth in both approaches. "Scalability" means 
different things at different levels of scale. A web example: the 
architecture of Twitter or Facebook is nothing like the architecture of 
even a large Django site. It's not even the same problem domain.


A good threading model can be extremely efficient at certain scales. For 
data structures that are mostly read, not written, synchronization is 
not a performance issue, and you get the best throughput possible in 
multicore situations. The best scalability of all would come from a 
combined approach: threading within a single node, message passing 
between nodes. Programming for that, though, is a nightmare (unless you 
have a programming language that makes both approaches transparent), so 
at large scale the message-passing approach is usually chosen on its 
own. One significant challenge in the combined approach is to make sure 
that operations that MIGHT use the same data structures are actually 
performed on the same node, so that threading can be put to use.
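
To make that combined shape concrete, here is a toy sketch (purely 
illustrative; the names TABLE, worker, and lookup_chunk are invented): 
message passing between worker processes standing in for nodes, and 
plain threads sharing a read-mostly table inside each process. Under 
today's CPython the GIL still serializes the pure-Python lookups, of 
course; the sketch only shows the decomposition.

# Toy sketch of the hybrid layout: message passing between processes
# ("nodes"), threads sharing a read-mostly table within each process.
# All names here are invented for illustration.
import multiprocessing as mp
import threading

# Read-mostly data: built once per process, then only read by threads,
# so the lookups themselves need no locking.
TABLE = {n: n * n for n in range(1000)}

def lookup_chunk(keys, results, slot):
    # Each thread reads the shared TABLE freely; it writes only its own slot.
    results[slot] = sum(TABLE[k] for k in keys)

def worker(inbox, outbox):
    # One "node": receives work by message passing, fans it out to threads.
    # (Under the current GIL these threads won't run Python code in
    # parallel; the point is only the decomposition.)
    while True:
        keys = inbox.get()
        if keys is None:              # sentinel: shut down
            break
        chunks = [keys[i::4] for i in range(4)]
        results = [0] * 4
        threads = [threading.Thread(target=lookup_chunk, args=(c, results, i))
                   for i, c in enumerate(chunks)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        outbox.put(sum(results))

if __name__ == "__main__":
    inbox, outbox = mp.Queue(), mp.Queue()
    nodes = [mp.Process(target=worker, args=(inbox, outbox)) for _ in range(2)]
    for p in nodes:
        p.start()
    for _ in nodes:
        inbox.put(list(range(1000)))  # one batch of keys per node
    print([outbox.get() for _ in nodes])
    for _ in nodes:
        inbox.put(None)
    for p in nodes:
        p.join()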


So, what Dave said applies very well to threading, too: "you still need 
to know what you're doing and how to decompose your application to use it."


Doing concurrency right is hard. Doing message passing right is hard. 
Functional (persistent data structure) languages are hard, too. Good 
thing we're all such awesome geniuses, bursting with experience and a 
desire to learn.
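
For anyone who hasn't seen the MPI style Dave describes below, the basic 
pattern with mpi4py looks roughly like this (assuming you have mpi4py 
and an MPI runtime installed; this snippet is mine, not from the 
thread). Each rank owns its own data, and the only communication is an 
explicit send/recv:

# Minimal mpi4py sketch of the message-passing style described below.
# Run with something like: mpiexec -n 4 python sum_demo.py
# (the filename is just an example)
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's "node" id
size = comm.Get_size()   # total number of processes in the job

# Each rank owns a private slice of the data; nothing is shared in memory.
local_data = [rank * 1000 + i for i in range(1000)]
local_sum = sum(local_data)

if rank == 0:
    # The root rank collects partial results with explicit receives.
    total = local_sum
    for source in range(1, size):
        total += comm.recv(source=source)
    print("total:", total)
else:
    # Every other rank sends its partial result to the root.
    comm.send(local_sum, dest=0)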


-Tal


On 07/23/2011 01:40 PM, David Beazley wrote:

>> "high performance just create multi processes that message": very rarely have
>> I heard IPC and high performance in the same sentence.
>>
>> Alex
>>
> Your youth and inexperience are the only reason you would make a statement that ignorant.  Go hang out with some people doing Python and supercomputing for a while and report back---you will find that almost every significant application is based on message passing (e.g., MPI).  This is because message passing has proven itself to be about the only sane way of scaling applications up to run across thousands to tens of thousands of CPU cores.
>
> I speak from some experience as I was writing such software for large Crays, Connection Machines, and other systems when I first discovered Python back in 1996.   As early as 1995, our group had done performance experiments comparing threads vs. message passing on some multiprocessor SMP systems and found that threads just didn't scale or perform as well as message passing even on machines with as few as 4 CPUs.   This was all highly optimized C code for numerics (i.e., no Python or GIL).
>
> That said, in order to code with message passing, you still need to know what you're doing and how to decompose your application to use it.
>
> Cheers,
> Dave
>
> _______________________________________________
> Chicago mailing list
> Chicago at python.org
> http://mail.python.org/mailman/listinfo/chicago



More information about the Chicago mailing list