Help me use my Dual Core CPU!

Paul Rubin
Sat Sep 23 01:05:05 EDT 2006


"Michael Sparks" <sparks.m at gmail.com> writes:
> > Kamaelia doesn't attempt concurrency at all.  Its main idea is to use
> > generators to simulate microthreads.
> 
> Regarding Kamaelia, that's not been the case for over a year now.
> 
> We've had threaded components as well as generator based ones since
> around last July, however their API stabilised properly about 4 months
> back. If you use C extensions that release the GIL and are using an OS
> that puts threads on different CPUs then you have genuine concurrency.
> (those are albeit some big caveats, but not uncommon ones in python).
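The quoted point above can be sketched concretely. This is not Kamaelia code, just a generic illustration of the caveat: CPython threads only run truly in parallel when the work happens inside a C extension call that releases the GIL, such as `zlib.compress` on a large buffer. (The payloads and worker names here are made up for the example.)

```python
import threading
import zlib

# zlib.compress is a C-level call that releases the GIL while it runs,
# so on an OS that schedules threads onto different CPUs, these two
# workers can genuinely execute concurrently.
data = b"x" * 10_000_000
results = {}

def worker(name):
    # Compress a large buffer; the GIL is released during the C call.
    results[name] = len(zlib.compress(data))

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

Pure-Python loops, by contrast, would serialize on the GIL even with two CPUs available, which is why the "C extensions that release the GIL" caveat matters.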

Oh neat, this is good to hear.

> Personally, I'm very much in the camp that says "shared data is
> invariably a bad idea unless you really know what you're doing"
> (largely because it's the most common source of bugs for people where
> they're trying to do more than one thing at a time). People also
> generally appear to find writing threadsafe code very hard. (not
> everyone, just the people who aren't at the top end of the bell curve
> for writing code that does more than one thing at a time)

I don't think it's that bad.  Yes, free-threaded programs synchronized
by seat-of-the-pants locking turn into a mess pretty quickly.  But
ordinary programmers write real-world applications with shared data
all the time, namely database apps.  The database server provides the
synchronization, and a good abstraction that lets the programmer
maintain the shared data without going crazy over the fine points.
The cost is all the query construction, the marshalling and
demarshalling of the data from Python object formats into byte
strings, and then the copying of those byte strings around through
OS-supplied IPC mechanisms, with multiple context switches and
sometimes trips up and down network protocol stacks even when the
client and server are on the same machine.  This is just silly, and
wasteful of the efforts of the hardworking chip designers who put that
nice cache coherence circuitry into our CPUs to mediate shared-data
access at the sub-instruction level, precisely so we don't need all
that IPC hair.
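To make the overhead concrete, here's a minimal sketch using the stdlib `sqlite3` module (an in-memory database stands in for a real server; a real app would add IPC or a network round trip on top). Even incrementing one shared integer goes through SQL text construction and row marshalling rather than a plain in-memory add:

```python
import sqlite3

# In-memory DB as a stand-in for a database server holding shared data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (value INTEGER)")
conn.execute("INSERT INTO counter VALUES (0)")

def increment():
    # Query construction, parsing, and marshalling -- all to add 1
    # to a single piece of shared data.
    conn.execute("UPDATE counter SET value = value + 1")
    conn.commit()

for _ in range(3):
    increment()

# Reading it back demarshals a row into a Python tuple.
(value,) = conn.execute("SELECT value FROM counter").fetchone()
print(value)  # 3
```

The database gives you correct, synchronized shared state, but every access pays the serialization toll the paragraph above describes.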

Basically, if the hardware gods have blessed us with concurrent CPUs
sharing memory, it's our nerdly duty to figure out how to use them.
We need better abstractions than raw locks, but this is hardly new.
Assembly language programmers made integer-vs-pointer aliasing and
similar type errors all the time, so we got compiled languages with
type consistency enforcement.  Goto statements turned ancient Fortran
code into spaghetti, so we got languages with better control
structures.  We found memory allocation bookkeeping too error-prone to
do by hand in complex programs, so we use garbage collection now.  And
to deal with shared data, transactional databases have been a very
effective tool despite all the inefficiency mentioned above.

Lately I've been reading about "software transactional memory" (STM),
a scheme for treating shared memory as if it were a database, without
using locks except during updates.  In some versions, STM transactions
are composable, so they're nowhere near as bug-prone as fine-grained
locks; and because readers don't need locks (they instead abort and
restart the transaction in the rare event of a simultaneous update),
STM can actually perform -faster- than traditional locking.  I posted
a couple of URLs in another thread and will try writing a more
detailed post sometime.  It's pretty neat stuff.  There are some C
libraries for it that it might be possible to port to Python.
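A toy version of the read-without-locks idea can be sketched in pure Python. This is my own simplified illustration, not any real STM library's API: a transactional variable carries a version number, readers take no lock, and a transaction commits only if the version it read is still current, retrying otherwise.

```python
import threading

class TVar:
    """A transactional variable: lock-free reads; a short lock is
    held only during the actual commit, as described above."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # held only while committing

    def read(self):
        # No lock: just snapshot the version and the value.
        return self.version, self.value

def atomically(tvar, fn):
    """Optimistic transaction: compute the new value outside any lock,
    then commit only if no one else updated tvar in the meantime;
    on a conflict, abort and restart the whole transaction."""
    while True:
        version, value = tvar.read()
        new_value = fn(value)
        with tvar._lock:
            if tvar.version == version:     # validate: still current?
                tvar.value = new_value
                tvar.version = version + 1  # publish the update
                return new_value
        # Simultaneous update detected: retry from the top.

counter = TVar(0)

def bump_many():
    for _ in range(1000):
        atomically(counter, lambda v: v + 1)

threads = [threading.Thread(target=bump_many) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 4000 -- no lost updates despite the races
```

Real STM systems track read/write sets across many variables and make transactions composable; this sketch only shows the abort-and-restart mechanism for a single cell.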
