[Chicago] Kickstarter Fund to get rid of the GIL

Tal Liron tal.liron at threecrickets.com
Sun Jul 24 23:51:46 CEST 2011


Doesn't CPython work, too? And plain Ruby? And Perl? And PHP?


All of these work, pass the tests, and are deployed successfully in the 
real world. The question is what for, and how far you want to go.


Everyone here criticizing CPython and the GIL, don't forget how far, and 
how well, you have travelled with it.


On 07/24/2011 04:32 PM, Joshua Herman wrote:

> At least Erlang works for the use cases. I wasn't aware that Jython
> was that powerful; I will have to play with it.
>
> On Sun, Jul 24, 2011 at 3:46 PM, Tal Liron<tal.liron at threecrickets.com>  wrote:
>> There is an alternative: Jython, which is Python on the JVM, and has no GIL.
>> It's real, it works, and has a very open community. If you want to do
>> high-concurrency in Python, it's the way to go. (And it has other advantages
>> and disadvantages, of course.)
>>
>>
>> I am always a bit frightened by community attempts to create new virtual
>> machines for favorite languages in order to solve problem X. This shows a
>> huge under-estimation of what it means to create a robust, reliable,
>> performant generic platform. Consider how many really reliable versions of
>> the C standard library are out there -- and how many decades they took to
>> mature, even with thousands of expert eyes poring over the code and testing
>> it. And this is without duck typing (or ANY typing), data integrity, scoping
>> (+call/cc), tail recursion, or any of the other huge (and exciting)
>> challenges required to run a dynamic language like Python.
>>
>>
>> So, it's almost amusing to see projects like Rubinius or Parrot come to be.
>> Really? This is the best use of our time and effort? I'm equally impressed
>> by the ballsiness of Erlang to create a new virtual machine from scratch.
>>
>>
>> But those are rather unique histories. CPython has its own unique history.
>> Not many people realize this, but Python is about 6 years older than Java,
>> and the JVM would take another decade before reaching prominence. JavaScript
>> engines (running in web browsers only) at the time were terrible, and Perl
>> was entirely interpreted (no VM). So, in fact, CPython was written where
>> there was no really good platform for dynamic languages. It wasn't a matter
>> of hubris ("not invented here") to build a VM from scratch; there was simply
>> no choice.
>>
>>
>> Right now, though, there are many good choices. People like Rich Hickey
>> (Clojure) and Martin Odersky (Scala) have it right in targeting the JVM,
>> although both projects are also exploring .NET/Mono. If Python were invented
>> today, I imagine it also would start with "Jython," instead of trying to
>> reinvent the wheel (well, reinvent a whole damn car fleet, really, in terms
>> of the work required).
>>
>>
>> One caveat: I think there is room for "meta-VM" projects like PyPy and LLVM.
>> These signify real progress in architecture, whereas "yet another dynamic
>> VM" does not.
>>
>>
>> -Tal
>>
>>
>> On 07/24/2011 02:56 PM, Jason Rexilius wrote:
>>
>>> I also have to quote:
>>>
>>> "rather that, for problems for which shared-memory concurrency is
>>> appropriate (read: the valid cases to complain about the GIL), message
>>> passing will not be, because of the marshal/unmarshal overhead (plus data
>>> size/locality ones)."
>>>
>>>
>>> I have to say this is some of the best discussion in quite a while. Dave's
>>> passionate response is great, as are the others. I think the rudeness, or
>>> not, is kinda beside the point.
>>>
>>> There is a valid point to be made about marshal/unmarshal overhead in
>>> situations where data-manipulation-concurrency AND _user expectation_ or
>>> environmental constraints apply.  I think that's why people have some
>>> grounds to be unhappy with the GIL concept (for me it's a concept) in certain
>>> circumstances. Tal is dead on in that "scalability" means different things.
>>>
>>> Oddly, I'm more engaged in this as an abstract comp sci question than a
>>> specific python question.  The problem set applies across languages.
>>>
>>> The question I would raise is whether, given that an engineer understands the
>>> problem he is facing, both tools are in the toolbox. Is there an
>>> alternative to the GIL for the use-cases where it is not the ideal solution?
>>>
>>> BTW, I will stand up for IPC as one of the tools in the toolbox to deal
>>> with scale/volume/speed/concurrency problems.
>>>
>>>
>>> On 7/24/11 1:58 PM, Tal Liron wrote:
>>>> I would say that there's truth in both approaches. "Scalability" means
>>>> different things at different levels of scale. A web example: the
>>>> architecture of Twitter or Facebook is nothing like the architecture of
>>>> even a large Django site. It's not even the same problem field.
>>>>
>>>>
>>>> A good threading model can be extremely efficient at certain scales. For
>>>> data structures that are mostly read, not written, synchronization is
>>>> not a performance issue, and you get the best throughput possible in
>>>> multicore situations. The truly best scalability would be achieved by a
>>>> combined approach: threading on a single node, message passing between
>>>> nodes. Programming for that, though, is a nightmare (unless you had a
>>>> programming language that makes both approaches transparent) and so
>>>> usually at the large scale the latter approach is chosen. One
>>>> significant challenge is to make sure that operations that MIGHT use the
>>>> same data structures are actually performed on the same node, so that
>>>> threading would be put to use.
>>>>
>>>>
>>>> So, what Dave said applies very well to threading, too: "you still need
>>>> to know what you're doing and how to decompose your application to use
>>>> it."
>>>>
>>>>
>>>> Doing concurrency right is hard. Doing message passing right is hard.
>>>> Functional (persistent data structure) languages are hard, too. Good
>>>> thing we're all such awesome geniuses, bursting with experience and a
>>>> desire to learn.
>>>>
>>>>
>>>> -Tal
>>>>
>>>>
>>>> On 07/23/2011 01:40 PM, David Beazley wrote:
>>>>
>>>>>> "high performance just create multi processes that message" very
>>>>>> rarely have
>>>>>> I heard IPC and high performance in the same sentence.
>>>>>>
>>>>>> Alex
>>>>>>
>>>>> Your youth and inexperience are the only reason you would make a statement
>>>>> that ignorant. Go hang out with some people doing Python and
>>>>> supercomputing for a while and report back---you will find that almost
>>>>> every significant application is based on message passing (e.g., MPI). This
>>>>> is because message passing has proven itself to be about the only sane
>>>>> way of scaling applications up to run across thousands to tens of
>>>>> thousands of CPU cores.
>>>>>
>>>>> I speak from some experience as I was writing such software for large
>>>>> Crays, Connection Machines, and other systems when I first discovered
>>>>> Python back in 1996. As early as 1995, our group had done performance
>>>>> experiments comparing threads vs. message passing on some
>>>>> multiprocessor SMP systems and found that threads just didn't scale or
>>>>> perform as well as message passing even on machines with as few as 4
>>>>> CPUs. This was all highly optimized C code for numerics (i.e., no
>>>>> Python or GIL).
>>>>>
>>>>> That said, in order to code with message passing, you still need to
>>>>> know what you're doing and how to decompose your application to use it.
>>>>>
>>>>> Cheers,
>>>>> Dave
>>>>>
>>>>> _______________________________________________
>>>>> Chicago mailing list
>>>>> Chicago at python.org
>>>>> http://mail.python.org/mailman/listinfo/chicago
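
The combined approach described in the quoted discussion (threads sharing memory within a node, message passing between nodes, and routing so that operations on the same data land on the same node) could be sketched roughly as follows. This is only an illustration under stated assumptions, not anyone's actual design: the "nodes" are simulated inside one process, and the names Node, node_for, run, NUM_NODES and THREADS_PER_NODE are all hypothetical. pickle stands in for the marshal/unmarshal step that message passing pays at every node boundary.

```python
import pickle
import queue
import threading
import zlib

NUM_NODES = 2          # hypothetical cluster size
THREADS_PER_NODE = 2   # hypothetical worker threads per node


def node_for(key: str) -> int:
    """Route a key with a stable hash, so every update to the same key
    lands on the same node."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES


class Node:
    """Stand-in for one machine: an inbox of serialized messages plus
    worker threads that share one in-memory dict."""

    def __init__(self) -> None:
        self.inbox: queue.Queue = queue.Queue()
        self.counts: dict = {}
        self.lock = threading.Lock()
        self.workers = [
            threading.Thread(target=self._work) for _ in range(THREADS_PER_NODE)
        ]
        for worker in self.workers:
            worker.start()

    def _work(self) -> None:
        while True:
            raw = self.inbox.get()
            if raw is None:                  # shutdown sentinel
                return
            key, amount = pickle.loads(raw)  # unmarshal at the node boundary
            with self.lock:                  # shared memory inside the node
                self.counts[key] = self.counts.get(key, 0) + amount

    def stop(self) -> None:
        # One sentinel per worker, queued after all real messages.
        for _ in self.workers:
            self.inbox.put(None)
        for worker in self.workers:
            worker.join()


def run(updates):
    """Send each (key, amount) update to its node; return per-node totals."""
    nodes = [Node() for _ in range(NUM_NODES)]
    for key, amount in updates:
        # Marshal once per message: the cost that in-node threads avoid.
        nodes[node_for(key)].inbox.put(pickle.dumps((key, amount)))
    for node in nodes:
        node.stop()
    return [node.counts for node in nodes]


if __name__ == "__main__":
    for index, counts in enumerate(run([("a", 1), ("b", 2), ("a", 3), ("c", 4)])):
        print(index, counts)
```

Because routing depends only on the key, each key's total ends up on exactly one node, and the lock is only ever contended by threads on that node. On CPython the worker threads here still take turns under the GIL; the sketch shows the decomposition, not a speedup.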


