Python Performance

Markus Stenberg mstenber at cc.Helsinki.FI
Tue Jul 27 00:13:30 EDT 1999


aaron_watters at my-deja.com writes:
> In article <al81zdxkktf.fsf at sirppi.helsinki.fi>,
>   Markus Stenberg <mstenber at cc.Helsinki.FI> wrote:
> > Even in websites' case, I would worry; Python doesn't use multiCPU
> > very well, and with single CPU, it is fairly easy to saturate one with
> > pure-Python code, especially if it is badly designed to boot.
> Bad design aside.  If you have many CPUs for a web site you
> might consider using plain old CGI scripts, since Python would then
> use all the CPUs.  In a single (multithreaded) process Python will
> only use one (thanks to the global interpreter lock).  Also, I
> strongly feel CGI is more stable and reliable than other methods.

True, but the overhead of starting a Python instance per request makes CGI
fairly prohibitive on a 2-CPU box; it takes 4+ CPUs before CGI gives even
_some_ gains compared to a constantly-running interpreter.
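
A rough way to put a number on that startup cost is to time how long a fresh
interpreter takes to start and exit. The sketch below is only illustrative:
the helper name startup_seconds is made up here, and it measures a bare
interpreter, not a real CGI script with its imports.

```python
import subprocess
import sys
import time


def startup_seconds(runs=5):
    """Return the best-of-N wall time to start and exit a bare interpreter."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Spawn a fresh interpreter that does nothing, as a CGI request would
        # before running any script code.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    print("interpreter startup: %.1f ms" % (1000.0 * startup_seconds()))
```

Whatever figure this prints is a floor on per-request latency under plain CGI,
before any application code has run.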

> > Using a Gadfly database, _with_ pre-compiled queries: (very simple
> > one-liners from table with one entry)
> > select: 21.4182/sec [3.64s] (46.689ms/call)
> > update: 14.3209/sec [3.98s] (69.828ms/call)
> Note please that gadfly queries have a certain amount of fixed overhead.
> You *might* find that queries over a table with one entry are not
> much faster than queries over tables with 1000 entries, or maybe even
> tables with 10000 entries that have appropriate indices.

True, but the same operations on databases not implemented in Python are,
even through Python interfaces, 100+ times as fast.

> However each gadfly query requires 10's of python function
> calls regardless of how simple the query is.  But more complex queries
> over more data don't necessarily require many more python function
> calls (this was one of the design goals for the engine).

Nod.. I didn't mean to particularly _blame_ Gadfly, just to point out that
implementing some things in Python seems to be prohibitively slow due to the
interpreter's 'speed'.

> Also, I hope you have added the kjbuckets builtin...

Yes.

> On second thought 21 queries per second ain't too bad...
> What are you comparing it to?  Updates are naturally slower
> since they involve writes to a log file for recovery purposes.

Single selects/updates on Solid or MySQL through a Python interface easily
do 1000/second on trivial tables (on most boxen).
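
For anyone wanting to reproduce this kind of comparison, here is a minimal
timing harness that prints results in the same "rate/sec [total] (ms/call)"
layout as the figures quoted in this thread. It is a sketch under
assumptions: the bench helper is hypothetical, and the standard library's
sqlite3 module stands in for the database only because it needs no setup. It
is not Gadfly, Solid or MySQL, so its numbers say nothing about those.

```python
import sqlite3
import time


def bench(label, func, calls=1000):
    """Call func repeatedly and report rate, total time and per-call time."""
    start = time.perf_counter()
    for _ in range(calls):
        func()
    elapsed = time.perf_counter() - start
    line = "%s: %.4f/sec [%.2fs] (%.3fms/call)" % (
        label, calls / elapsed, elapsed, 1000.0 * elapsed / calls)
    print(line)
    return line


# Stand-in database: one table with one entry, as in the quoted test.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a')")
conn.commit()


def do_select():
    return conn.execute("SELECT val FROM t WHERE id = 1").fetchone()


def do_update():
    conn.execute("UPDATE t SET val = 'b' WHERE id = 1")
    conn.commit()


bench("select", do_select)
bench("update", do_update)
```

Swapping do_select and do_update for calls into another database interface
gives directly comparable lines of output.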

> > .. and so on. Even more distressing results lie in the speed of say,
> > implementing protocols in Python (when we're talking _nearly_ realtime
> > action).
> Be careful here.  Protocol speed can be affected by a lot of stuff.
> Some of it can be python and some of it can be other system overhead.
> If a profiler or unix "top" says it's the python process and it's
> implemented well, then you've got a point.  But until I see the
> code I personally would withhold assignment of blame.

Well, I might share the code to get some insights, but according to 'top'
the Python process is hogging the CPU, _and_ that is primarily thanks to the
number of method calls in the OO-based protocol model with two layers (a
TLS-lookalike protocol with a 100 percent Python implementation).

I've sped it up by a factor of N by just removing the layers and merging
every function of 5 lines or fewer directly into its caller, but the result
is an ugly-looking (albeit faster) protocol with some ~100+ line
functions :P
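
The effect is easy to see in miniature. The sketch below is not the protocol
code discussed here; Record, Session and send_inlined are hypothetical names
for a toy two-layer framing scheme and its hand-merged equivalent, timed with
the standard timeit module.

```python
import timeit


class Record:
    """Hypothetical inner layer: prefixes a payload with a 2-byte length."""

    def wrap(self, payload):
        return self._header(payload) + payload

    def _header(self, payload):
        return len(payload).to_bytes(2, "big")


class Session:
    """Hypothetical outer layer: delegates framing to the record layer."""

    def __init__(self):
        self.record = Record()

    def send(self, payload):
        return self._frame(payload)

    def _frame(self, payload):
        return self.record.wrap(payload)


def send_inlined(payload):
    """Same result as Session.send, with all the small methods merged."""
    return len(payload).to_bytes(2, "big") + payload


PAYLOAD = b"x" * 64
session = Session()

# Both versions must produce identical bytes before timing means anything.
assert session.send(PAYLOAD) == send_inlined(PAYLOAD)

layered = timeit.timeit(lambda: session.send(PAYLOAD), number=200000)
inlined = timeit.timeit(lambda: send_inlined(PAYLOAD), number=200000)
print("layered: %.3fs  inlined: %.3fs" % (layered, inlined))
```

The layered version makes four method calls per message where the merged one
makes a single function call, so the gap between the two timings is mostly
call overhead.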

>    -- Aaron Watters

-Markus Stenberg

-- 
"...Deep Hack Mode--that mysterious and frightening state of
consciousness where Mortal Users fear to tread."
	- Matt Welsh



