python simply not scaleable enough for google?

Tue Nov 17 21:32:06 EST 2009

On Tue, 17 Nov 2009 22:11:42 +0530, Rustom Mody wrote:

> "Language L is (in)efficient. No! Only implementations are
> (in)efficient"
> 
> I am reminded of a personal anecdote.  It happened about 20 years ago
> but is still fresh and this thread reminds me of it.
> 
> I was attending some workshop on theoretical computer science. I gave a
> talk on Haskell.
> 
> I showed off all the good-stuff -- pattern matching, lazy lists,
> infinite data structures, etc etc.
> Somebody asked me: Isnt all this very inefficient? Now at that time I
> was a strong adherent of the Dijkstra-religion and this viewpoint
> "efficiency has nothing to do with languages, only implementations"
> traces to him. So I quoted that.
> 
> Slowing the venerable P S Thiagarajan got up and asked me: Lets say that
> I have a language with a type 'Proposition' And I have an operation on
> proposition called sat [ sat(p) returns true if p is satisfiable]...

I assume you're referring to this:

http://en.wikipedia.org/wiki/Boolean_satisfiability_problem

which is N-P complete and O(2**N) (although many such problems can be 
solved rapidly in polynomial time).

> I wont complete the tale other than to say that Ive never had the wind
> in my sails taken out so completely!
> 
> So Vincent? I wonder what you would have said in my place?

I won't answer for Vincent, but I would have made five points:

(1) The existence of one inherently slow function in a language does not 
mean that the language itself is slow overall. It's not clear exactly 
what "overall" means in the context of a language, but one function out 
of potentially thousands obviously isn't it.

(2) Obviously the quality of implementation for the sat function will 
make a major difference as far as speed goes, so the speed of the 
function is dependent on the implementation.

(3) Since the language definition doesn't specify an implementation, no 
prediction of the time needed to execute the function can be made. At 
most we know how many algorithmic steps the function will take, given 
many assumptions, but we have no idea of the constant term. The language 
definition would be satisfied by having an omniscient, omnipotent deity 
perform the O(2**N) steps required by the algorithm infinitely fast, i.e. 
in constant (zero) time, which would make it pretty fast. The fact that 
we don't have access to such deities to do our calculations for us is an 
implementation issue, not a language issue.

(4) In order to justify the claim that the language is slow, you have to 
define what you are comparing it against and how you are measuring the 
speed. Hence different benchmarks give different relative ordering 
between language implementations. You must have a valid benchmark, and 
not stack the deck against one language: compared to (say) integer 
addition in C, yes the sat function is slow, but that's an invalid 
comparison, as invalid as comparing the sat function against factorizing 
a one million digit number. ("Time to solve sat(P) -- sixty milliseconds. 
Time to factorize N -- sixty million years.") You have to compare similar 
functionality, not two arbitrary operations.

Can you write a sat function in (say) C that does better than the one in 
your language? If you can't, then you have no justification for saying 
that C is faster than your language, for the amount of work your language 
does. If you can write a faster implementation of sat, then you can 
improve the implementation of your language by using that C function, 
thus demonstrating that speed depends on the implementation, not the 
language.

(5) There's no need for such hypothetical examples. Let's use a more 
realistic example... disk IO is expensive and slow. I believe that disk 
IO is three orders of magnitude slower than memory access, and heaven 
help you if you're reading from tape instead of a hard drive!

Would anyone like to argue that every language which supports disk IO 
(including C, Lisp, Fortran and, yes, Python) are therefore "slow"? Since 
the speed of the hard drive dominates the time taken, we might even be 
justified as saying that all languages are equally slow!

Obviously this conclusion is nonsense. Since the conclusion is nonsense, 
we have to question the premise, and the weakest premise is the idea that 
talking about the speed of a *language* is even meaningful (except as a 
short-hand for "state of the art implementations of that language").

-- 
Steven