Parallelization on multi-CPU hardware?

Tue Oct 12 12:57:50 EDT 2004

"Alex Martelli" <aleaxit at yahoo.com> wrote in message news:1glk0w7.wac0yl4laurwN%aleaxit at yahoo.com...
> Nicolas Lehuen <nicolas.lehuen at thecrmcompany.com> wrote:
>    ...
> > > > > Sorry, I don't get your point.  Sure, Windows makes process creation
> > > > > hideously expensive and has no forking.  But all kinds of BSD, including
> > > > > MacOSX, are just great at forking.  Why is a preference for multiple
> > > > > processes over threads "forgetting about BSD, MacOSX", or any other
> > > > > flavour of Unix for that matter?
>    ...
> > mod_python's market (pure guess). So the multithreading issues have not
> > many chances of being fixed soon. That is why I said that a
> > "Linux-centered mindset forgetting [other OSes, none in particular]" is
> > hindering the bugfix process.
>    ...
> > I do hope the point above is cleared.
> 
> It's basically "retracted", as I see things, except that whatever's left
> still doesn't make any sense to me.  Why should a Linux user need to
> "forget" another system, that's just as perfect for multiprocessing as
> Linux, before deciding he's got better things to do with his or her time
> than work on a problem which multiprocessing finesses?

Because a Linux user might well be interested in sharing data among different workers, be they processes or threads. Things that are taken for granted in the Java world, like a good connection pool, are still not being used, even though they are really valuable from a performance and management point of view.

> > Now, you propose to share objects between Python VMs using shared memory.
> 
> Not necessarily -- as you say, if you need synchronization the problem
> is just about as hard (maybe even worse).  I was just trying to
> understand the focus on multithreading vs multiprocessing, it now
> appears it's more of a focus on the shared-memory paradigm of
> multiprocessing in general.

Exactly. I don't care about multithreading or multiprocessing; I want to share data between my workers. As of today, that is just easier to do with threads. The connection pool example is a good one: how can you easily share TCP connections to a DBMS server between processes? With threads it's a matter of a few lines of Python code (I'll publish my connection pool code on the Python Cookbook soon).
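
A minimal sketch of what such a thread-shared pool can look like. This is not the connection pool code mentioned above (which is not included in this message), only an illustration built on the standard library's queue.Queue. The names ConnectionPool, connect and demo are made up for the example, and the "connection" is a placeholder object rather than a real DBMS connection.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of connections shared by the threads of one process."""

    def __init__(self, connect, size=4):
        # `connect` is any zero-argument callable returning a new connection
        # (hypothetical here; in real code it would open a DBMS connection).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)


def demo(workers=8, size=2):
    """Run `workers` threads against a pool of `size` placeholder connections."""
    created = []

    def fake_connect():
        conn = object()  # stand-in for a real connection
        created.append(conn)
        return conn

    pool = ConnectionPool(fake_connect, size=size)
    used = []
    used_lock = threading.Lock()

    def work():
        with pool.connection() as conn:
            with used_lock:
                used.append(conn)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Number of connections opened, number of times one was borrowed.
    return len(created), len(used)


if __name__ == "__main__":
    print(demo())
```

Because all the threads live in one address space, they see the same pool object and simply take turns on its connections. Separate processes would each need their own connections, or some explicit mechanism for handing sockets from one process to another.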

> > > > Obviously they won't be any worse. Well, to be precise, it still depends
> > > > on the OS, because the scheduler must know the difference between 2
> > > > processors and a 2-core processor to efficiently balance the work, but
>    ...
> > I am way beyond my competence here, but I've read some articles about
> > hyperthreading-aware schedulers (in WinXP, Win2003, and patches for
> > Linux). The idea is that on multi-core CPUs, threads from the same
> > process should be run on the same core for maximum cache efficiency,
> > whereas different processes can freely run on different cores. I've read
> 
> And how is this a difference between two processors and two cores within
> the same processor, which is what I quoted you above as saying?  If two
> CPUs do not share caches, the CPU-affinity issues (of processing units
> that share address spaces vs ones that don't) would appear to be the
> same.  If two CPUs share some level of cache (as some multi-CPU designs
> do), that's different from the case where the CPUs share no cache but do
> share RAM.
>
> > But apart from this caveat, yes, multi-threaded and multi-process
> > applications benefit equally from multi-core CPUs.
> 
> So it would seem to me, yes -- except that if CPUs share caches this may
> help (perhaps) uses of shared memory (if the cache design is optimized
> for that and the scheduler actively helps), even though even in that
> case it doesn't seem to me you can reach the same bandwidth that
> hypertransport promises for a more streaming/message passing approach.
> 
> 
> Alex

The trick is that there are many levels of internal or external cache, and that, IIRC from my reading, the two cores of a Pentium IV processor with HT share some level of internal cache.

But again, I'm pretty much out of my depth here, so I'll call it quits on hyperthreading and its relative impact on the threaded model vs the fork model :)

Best regards,

Nicolas