[Python-Dev] Free threading

Tim Peters tim.one@home.com
Mon, 13 Aug 2001 00:26:05 -0400


[Tim]
> Well, I'm not a Tcl guy, but Tcl has historically had no threads at
> all ...

[Paul Prescod]
> No, Tcl has real threads now.

"Now", yes, not "historically".  When Guido said that multiple interpreters
were the #1 request from Tcl-land, I'm sure he was talking about pre-thread
Tcl-land.

> Tcl and Perl both share a model that the Perl guys call "Ithreads".
> (for interpreter threads or independent threads, I guess) where each
> interpreter is firewalled from the other unless you ask to share
> information. Tcl's threads were driven by the AOL guys' need to get
> massive scalability.

You can't get massive scalability using OS threads, unless the OS is an
oddball specifically designed for that.  Worming around the bloat of Windows
and Linux (etc) OS threads is Stackless Python's natural domain (simulated
threads).  The catch, of course, is that they don't run in parallel!  Or
maybe by "massive" you mean a few dozen?  I worked in the "massively
parallel supercomputer" biz, so I'm thinking more on the order of a few
thousands of dozens.  I bet you're not.

> ...
> Why do people use mod_python, fcgi, mod_snake, mod_php and all of those
> other things instead of pure CGI?

Beats me why people bother with web programming at all <wink>.

> The usual claim is that they dislike the cost of forking a process
> and loading the interpreter code.  Now imagine a user with a 20
> processor machine. She isn't going to be happy with the price of
> forking processes and using IPC for information sharing either. On
> the other hand, she isn't going to be happy with a shared GIL.

The cost of creating 20 processes is trivial "even on Windows" if you only
need to do that once.  Process-based (as opposed to thread-based) solutions
are unnatural on Windows, though, and part of this argument seems cultural
in related ways.

> Free threading helps, but if sharing data has a performance cost (e.g.
> by requiring reference count operations to be locked, or requiring
> mutexes on dictionaries) then you might not want to pay that cost
> either. The Perl guys convinced me of that much.

IIRC, Greg's fabled free-threading version of Python took a speed hit of
about a factor of 2 (for a program using only 1 thread, compared to that
same program without the free-threading patches).
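
To make the shape of that cost concrete, here is a toy sketch.  It is an
analogy only, not Greg's patches or anything CPython does internally: the
names PlainCount, LockedCount, bump and compare are all hypothetical.  A
single thread bumps a counter many times, once bare and once taking a lock
around every increment, the way a locked reference count would have to.

```python
import threading
import timeit


class PlainCount:
    """Hypothetical counter that relies on some outer lock held elsewhere."""

    def __init__(self):
        self.refs = 0

    def incref(self):
        self.refs += 1


class LockedCount:
    """Hypothetical counter that takes its own lock on every change."""

    def __init__(self):
        self.refs = 0
        self._lock = threading.Lock()

    def incref(self):
        # Acquire and release happen even when no other thread exists.
        with self._lock:
            self.refs += 1


def bump(counter, times):
    """Increment the counter `times` times from a single thread."""
    for _ in range(times):
        counter.incref()
    return counter.refs


def compare(times=100_000):
    """Return (plain_seconds, locked_seconds) for one single-threaded run."""
    plain = timeit.timeit(lambda: bump(PlainCount(), times), number=1)
    locked = timeit.timeit(lambda: bump(LockedCount(), times), number=1)
    return plain, locked
```

Both counters end at the same value; only the time differs, and the ratio
will depend on the machine and Python version, so no figure is claimed here.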

> The most popular "embedded scripting languages" (PHP and VBScript/ASP)
> use this totally independent thread model. As far as I know, neither has
> a concept of sharing information between threads. To a PHP programmer,
> that's what SQL and browser cookies are for. :-)

Well, the idea that threads don't share information is foreign to every
intense belief about the world *I've* ever been paid to adopt <wink>, and
I'm not enough of a Windows geek to believe "threads are always the answer"
even so.  A pool of worker processes using OS-specific IPC as needed works
great in my real-life experience, and if information sharing is rare, works
especially great because it's not *fighting* the OS and C libraries tooth
and nail.
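
A minimal sketch of such a worker pool, under stated assumptions: the
workers are plain child processes started once, the IPC is ordinary
stdin/stdout pipes, and the "job" (squaring an integer) is a stand-in.
WORKER_SRC, start_pool, square_all and stop_pool are hypothetical names
made up for illustration.

```python
import subprocess
import sys

# Hypothetical worker program: reads one integer per line, replies with
# its square, and exits when its stdin is closed.
WORKER_SRC = """
import sys
for line in sys.stdin:
    print(int(line) ** 2, flush=True)
"""


def start_pool(size):
    """Create the worker processes once; startup cost is paid up front."""
    return [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SRC],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(size)
    ]


def square_all(pool, numbers):
    """Deal jobs out round-robin, then collect replies in the same order.

    Assumes a small batch: all jobs are written before any reply is read,
    so a very large batch could fill the pipe buffers and stall.
    """
    owners = []
    for i, n in enumerate(numbers):
        worker = pool[i % len(pool)]
        worker.stdin.write(f"{n}\n")
        worker.stdin.flush()
        owners.append(worker)
    # Each worker answers its own jobs in order, so reading from the
    # owners in submission order lines replies up with requests.
    return [int(w.stdout.readline()) for w in owners]


def stop_pool(pool):
    """Close each worker's stdin so its loop ends, then reap it."""
    for w in pool:
        w.stdin.close()
    for w in pool:
        w.wait()
        w.stdout.close()
```

Usage would look like pool = start_pool(4), then any number of
square_all(pool, ...) calls, then stop_pool(pool).  The processes share
nothing unless a job is explicitly sent over a pipe.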

What of your hypothetical user earlier, who "isn't going to be happy with
the price of ... using IPC for information sharing either"?  That is, in
what sense do isolated threads leave her happy about her information sharing
needs?  If she's happy to communicate via database transactions and queries,
fear of IPC being too expensive wouldn't be rational.