python simply not scaleable enough for google?

Sat Nov 14 09:13:47 EST 2009

On Nov 14, 8:55 am, Vincent Manis <vma... at telus.net> wrote:
> On 2009-11-13, at 23:20, Robert Brown wrote, quoting me:
> > Please look atCLPython. [...]
> Ah, that does explain it.

I bet you didn't even look at it. FWIW, I'm the author of CLPython.

> CLOS is most definitely the wrong vehicle for implementing
> Python method dispatch. CLOS is focused around generic functions that themselves
> do method dispatch, and do so in a way that is different from Python's. If I were
> building a Python implementation in CL, I would definitely NOT use CLOS, but
> do my own dispatch using funcall (the CL equivalent of the now-vanished Python
> function apply).

CLOS is way more than method dispatch, it's an infrastructure for
classes, metaclasses, slots, method combinations. And within method
dispatch there are lots of opportunities to customize the behaviour.
Ignoring all that functinality when implementing an object-oriented
language, and using "dispatch using funcall" whatever that means, just
sounds ridiculous. (And funcall != apply.)

> > Method lookup is just the tip if the iceburg.  How about comparison?  Here are
> > some comments fromCLPython'simplementation of compare.  There's a lot going
> > on.  It's complex and SLOW.

Right, although by special-casing the most common argument types you
can save most of that lookup.

> Re comparison. Python 3 has cleaned comparison up a fair bit. In particular, you
> can no longer compare objects of different types using default comparisons.
> However, it could well be that there are nasty little crannies of inefficiency
> there, they could be the subject of PEPs after the moratorium is over.

It might have gotten a bit better, but the central message still
stands: Python has made design choices that make efficient compilation
hard.

> OK, let me try this again. My assertion is that with some combination of JITting,
> reorganization of the Python runtime, and optional static declarations, Python
> can be made acceptably fast,

That does not contradict that, had other language design choices been
made, it could be much easier to get better performance. Python may in
general be about as dynamic as Common Lisp from a _user_ perspective,
but from an implementator's point of view Python is harder to make it
run efficiently.

Here are some examples of design choices in Common Lisp that help it
perform very well; while in Python there is more freedom at the cost
of performance:

 - Lisp hashtables, arrays, numbers, and strings are not subclassable.
The specialized operations on them, like function aref for array index
referencing, don't invoke arbitrary user-level code, and also not
lookup of magic methods.

 - Certain Lisp sequence objects, like lists and arrays, can easily be
allocated on the stack;

 - Lisp allows type declarations, like for variables, array types,
function arguments and return values.

 - Even if Python had type declarations, it is possible to define a
subclass that redefines semantics. E.g. it's possible to subclass
'int' and redefine what '+' means, making a declaration that "x is of
type int" not as valuable as in Lisp.

 - A recursive function defined at module level, can not assume that
its name refers to itself.

 - Every function can be called using keyword arguments or positional
arguments. E.g. with the definition "def f(x,y): ..", some possible
calls are: f(1), f(1,2), f(x=1, y=2), f(1,y=2) so every function must
be prepared to do keyword argument processing. (This can be considered
a lack of separation between internal details and external interface.)

 - Every function call could potentially be a call to locals() (e.g.
f=locals; f()), which means every function that contains a function
call must store the value of all locals, even of "dead" variables.

 - Built-in functions can be shadowed.

 - The potential fields of a Python object are often not defined, as
arbitrary attributes can be set. Accessors for fields generally
generally have to retrieve the value from a dict.

 - When limiting the potential fields of a class instance using
__slots__, subclasses may override __slots__ thus this is hardly
limiting.

 - Python attribute lookup and comparison (as shown in a previous
mail) are examples of hairy behaviour that often mean the lookup of
several (!) __magic__ methods and could invoke arbitrary user code.
(Lisp in particular offers "structures" whose definition is fixed,
that inline all accessors, for maximum efficiency.)

This is just to show how language design leads to efficieny
characteristics. That there is a need for projects like NumPy and
Cython, follows in my eyes from Python being too dynamic for its own
good, with no way to tame it. In Common Lisp there would be less need
to go to C for speed, because of user-supplied type declarations and
compiler-based type inferencing.

It has been said that CLPython is a very good counterargument for
"just write a Python to Lisp compiler to make things fast", and even I
as its developer agree. Lisp offers lots of readily available
optimization opportunities, but Python simply doesn't.

I remember reading some years ago about a Smalltalk compiler guru who
said he would come up with a ridiculously fast Python implementation
based on all the message sending optimizations he knew. It does not
surprise me that we've never heard from him yet.

- Willem