[Chicago] Kickstarter Fund to get rid of the GIL

Joshua Herman zitterbewegung at gmail.com
Sun Jul 24 22:07:56 CEST 2011


Should we create a wiki page? Something like Inglourious Basterds, but to
kill the GIL, or something.

On Sun, Jul 24, 2011 at 2:56 PM, Jason Rexilius <jason at hostedlabs.com> wrote:
> I also have to quote:
>
> "rather that, for problems for which shared-memory concurrency is
> appropriate (read: the valid cases to complain about the GIL), message
> passing will not be, because of the marshal/unmarshal overhead (plus data
> size/locality ones)."
>
>
> I have to say this is some of the best discussion we've had in quite a
> while. Dave's passionate response is great, as are the others. I think the
> rudeness, or lack of it, is kind of beside the point.
>
> There is a valid point to be made about marshal/unmarshal overhead in
> situations where data-manipulation concurrency AND _user expectations_ or
> environmental constraints apply. I think that's why people have some
> grounds to be unhappy with the GIL concept (for me it's a concept) in
> certain circumstances. Tal is dead on in that "scalability" means
> different things.
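>
> A rough sketch of that overhead, for the record (the names and sizes are
> made up; the point is just that a pipe or queue has to pickle whatever you
> send it, while a thread reads the structure in place):
>
>     import pickle, time
>
>     data = list(range(1000000))  # some largeish in-process structure
>
>     # A thread sharing memory reads `data` in place: no copy, no
>     # serialization. A worker process fed over a pipe/queue pays a
>     # marshal/unmarshal round trip first:
>     start = time.time()
>     payload = pickle.dumps(data)       # marshal on the sending side
>     restored = pickle.loads(payload)   # unmarshal on the receiving side
>     print("copied %d bytes in %.3f seconds"
>           % (len(payload), time.time() - start))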
>
> Oddly, I'm more engaged in this as an abstract comp sci question than a
> specific python question.  The problem set applies across languages.
>
> The question I would raise is whether, given that an engineer understands
> the problem he is facing, both tools are in the toolbox. Is there an
> alternative to the GIL for the use cases where it is not the ideal
> solution?
>
> BTW, I will stand up for IPC as one of the tools in the toolbox to deal with
> scale/volume/speed/concurrency problems.
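>
> To make that concrete, a bare-bones IPC sketch using nothing but the
> stdlib (the worker and the work are invented for the example):
>
>     from multiprocessing import Process, Pipe
>
>     def worker(conn):
>         # receive one request, do the work in a separate process,
>         # and send only the small result back over the pipe
>         n = conn.recv()
>         conn.send(sum(i * i for i in range(n)))
>         conn.close()
>
>     if __name__ == '__main__':
>         parent, child = Pipe()
>         p = Process(target=worker, args=(child,))
>         p.start()
>         parent.send(1000000)
>         print(parent.recv())
>         p.join()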
>
>
> On 7/24/11 1:58 PM, Tal Liron wrote:
>>
>> I would say that there's truth in both approaches. "Scalability" means
>> different things at different levels of scale. A web example: the
>> architecture of Twitter or Facebook is nothing like the architecture of
>> even a large Django site. It's not even the same problem field.
>>
>>
>> A good threading model can be extremely efficient at certain scales. For
>> data structures that are mostly read, not written, synchronization is
>> not a performance issue, and you get the best throughput possible in
>> multicore situations. The best scalability would come from a combined
>> approach: threading within a single node, message passing between nodes.
>> Programming for that, though, is a nightmare (unless you have a
>> programming language that makes both approaches transparent), and so at
>> large scale the latter approach is usually chosen. One significant
>> challenge is making sure that operations that MIGHT use the same data
>> structures are actually performed on the same node, so that threading
>> can be put to use.
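>>
>> A tiny illustration of the read-mostly case (setting aside, for a
>> moment, whether CPython's GIL lets the reads run in parallel; the point
>> is that every thread sees one structure in place, with nothing pickled
>> or copied):
>>
>>     from threading import Thread
>>
>>     # built once up front, then only read
>>     index = {w: len(w) for w in ("spam", "ham", "eggs")}
>>
>>     def lookup(word, out, i):
>>         # pure read of the shared dict: no copy, no message, no pickling
>>         out[i] = index.get(word, 0)
>>
>>     words = ["spam", "eggs", "spam", "ham"]
>>     results = [None] * len(words)
>>     threads = [Thread(target=lookup, args=(w, results, i))
>>                for i, w in enumerate(words)]
>>     for t in threads:
>>         t.start()
>>     for t in threads:
>>         t.join()
>>     print(results)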
>>
>>
>> So, what Dave said applies very well to threading, too: "you still need
>> to know what you're doing and how to decompose your application to use
>> it."
>>
>>
>> Doing concurrency right is hard. Doing message passing right is hard.
>> Functional (persistent data structure) languages are hard, too. Good
>> thing we're all such awesome geniuses, bursting with experience and a
>> desire to learn.
>>
>>
>> -Tal
>>
>>
>> On 07/23/2011 01:40 PM, David Beazley wrote:
>>
>>>> "high performance just create multi processes that message" very
>>>> rarely have
>>>> I heard IPC and high performance in the same sentence.
>>>>
>>>> Alex
>>>>
>>> Your youth and inexperience are the only reason you would make a
>>> statement that ignorant. Go hang out with some people doing Python and
>>> supercomputing for a while and report back: you will find that almost
>>> every significant application is based on message passing (e.g., MPI).
>>> This is because message passing has proven itself to be about the only
>>> sane way of scaling applications up to run across thousands to tens of
>>> thousands of CPU cores.
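>>>
>>> For anyone who hasn't seen it, the shape of such code with mpi4py (one
>>> common Python binding for MPI; the work here is a toy) looks roughly
>>> like this, launched with something like mpiexec -n 4:
>>>
>>>     from mpi4py import MPI
>>>
>>>     comm = MPI.COMM_WORLD
>>>     rank = comm.Get_rank()
>>>
>>>     if rank == 0:
>>>         # rank 0 hands out work and collects the small results
>>>         for dest in range(1, comm.Get_size()):
>>>             comm.send(dest * 100000, dest=dest, tag=1)
>>>         results = [comm.recv(source=src, tag=2)
>>>                    for src in range(1, comm.Get_size())]
>>>         print(results)
>>>     else:
>>>         n = comm.recv(source=0, tag=1)
>>>         comm.send(sum(range(n)), dest=0, tag=2)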
>>>
>>> I speak from some experience as I was writing such software for large
>>> Crays, Connection Machines, and other systems when I first discovered
>>> Python back in 1996. As early as 1995, our group had done performance
>>> experiments comparing threads vs. message passing on some
>>> multiprocessor SMP systems and found that threads just didn't scale or
>>> perform as well as message passing even on machines with as few as 4
>>> CPUs. This was all highly optimized C code for numerics (i.e., no
>>> Python or GIL).
>>>
>>> That said, in order to code with message passing, you still need to
>>> know what you're doing and how to decompose your application to use it.
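>>>
>>> To put a trivial face on "decompose": split the data so each worker
>>> owns its slice outright and only small partial results travel back
>>> (the chunking below is invented for the example):
>>>
>>>     from multiprocessing import Pool
>>>
>>>     def partial_sum(bounds):
>>>         # each worker owns an independent slice of the range: nothing
>>>         # shared, nothing to synchronize, only a number comes back
>>>         lo, hi = bounds
>>>         return sum(i * i for i in range(lo, hi))
>>>
>>>     if __name__ == '__main__':
>>>         N, workers = 10000000, 4
>>>         step = N // workers
>>>         chunks = [(i * step, (i + 1) * step) for i in range(workers)]
>>>         pool = Pool(workers)
>>>         print(sum(pool.map(partial_sum, chunks)))
>>>         pool.close()
>>>         pool.join()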
>>>
>>> Cheers,
>>> Dave
>>>
>>
>

