python simply not scaleable enough for google?

Thu Nov 12 11:35:23 EST 2009

On Nov 12, 10:07 am, mcherm <mch... at gmail.com> wrote:
> On Nov 11, 7:38 pm, Vincent Manis <vma... at telus.net> wrote:
>
> > 1. The statement `Python is slow' doesn't make any sense to me.
> > Python is a programming language; it is implementations that have
> > speed or lack thereof.
>    [...]
> > 2. A skilled programmer could build an implementation that compiled
> > Python code into Common Lisp or Scheme code, and then used a
> > high-performance Common Lisp compiler...
>
> I think you have a fundamental misunderstanding of the reasons why
> Python is
> slow. Most of the slowness does NOT come from poor implementations:
> the CPython
> implementation is extremely well-optimized; the Jython and Iron Python
> implementations use best-in-the-world JIT runtimes. Most of the speed
> issues
> come from fundamental features of the LANGUAGE itself, mostly ways in
> which
> it is highly dynamic.
>
> In Python, a piece of code like this:
>     len(x)
> needs to watch out for the following:
>     * Perhaps x is a list OR
>       * Perhaps x is a dict OR
>       * Perhaps x is a user-defined type that declares a __len__
> method OR
>       * Perhaps a superclass of x declares __len__ OR
>     * Perhaps we are running the built-in len() function OR
>       * Perhaps there is a global variable 'len' which shadows the
> built-in OR
>       * Perhaps there is a local variable 'len' which shadows the
> built-in OR
>       * Perhaps someone has modified __builtins__
>
> In Python it is possible for other code, outside your module to go in
> and
> modify or replace some methods from your module (a feature called
> "monkey-patching" which is SOMETIMES useful for certain kinds of
> testing).
> There are just so many things that can be dynamic (even if 99% of the
> time
> they are NOT dynamic) that there is very little that the compiler can
> assume.
>
> So whether you implement it in C, compile to CLR bytecode, or
> translate into
> Lisp, the computer is still going to have to to a whole bunch of
> lookups to
> make certain that there isn't some monkey business going on, rather
> than
> simply reading a single memory location that contains the length of
> the list.
> Brett Cannon's thesis is an example: he attempted desperate measures
> to
> perform some inferences that would allow performing these
> optimizations
> safely and, although a few of them could work in special cases, most
> of the
> hoped-for improvements were impossible because of the dynamic nature
> of the
> language.
>
> I have seen a number of attempts to address this, either by placing
> some
> restrictions on the dynamic nature of the code (but that would change
> the
> nature of the Python language) or by having some sort of a JIT
> optimize the
> common path where we don't monkey around. Unladen Swallow and PyPy are
> two
> such efforts that I find particularly promising.
>
> But it isn't NEARLY as simple as you make it out to be.
>
> -- Michael Chermside

obviously the GIL is a major reason it's so slow. That's one of the
_stated_ reasons why Google has decided to forgo CPython code. As far
as how sweeping the directive is, I don't know, since the situation
would sort of resolve itself if one committed to Jython application
building or just wait until unladen swallow is finished.