From pfein at pobox.com Thu Jun 4 00:45:23 2009 From: pfein at pobox.com (Pete) Date: Wed, 3 Jun 2009 18:45:23 -0400 Subject: [concurrency] Change of List Location Message-ID: <9C8346B5-D6F9-4EEF-9066-B3213FC4E446@pobox.com> Hi all- Per requests, I've moved the concurrency-sig list to concurrency-sig at python.org from its present home at Google Groups. I'll be shutting down the Google list in the next few days. Sorry for any inconvenience this may have caused. --Pete From paul at boddie.org.uk Mon Jun 8 23:10:11 2009 From: paul at boddie.org.uk (Paul Boddie) Date: Mon, 8 Jun 2009 23:10:11 +0200 Subject: [concurrency] Common Concurrent Problems Message-ID: <200906082310.11586.paul@boddie.org.uk> Hello, I noticed recently that there's an effort to show solutions to common concurrent problems on the python.org Wiki, and I've been using this effort as an excuse to look over my own work and to improve it in various ways. I even made a remark on the "99 Concurrent Bottles of Beer" page which appears to have been wide of the mark - that one would surely try and use operating system features in the given example in order to provide a more optimal implementation - and I note that Glyph appears to regard the stated problem as not really being concurrent. http://wiki.python.org/moin/Concurrency/99Bottles In the next few days, I intend to release a new version of my pprocess library in order to more conveniently support problems like the one on the Wiki, but previous experience has shown that more compelling problems are required. Some time ago, the "Wide Finder" project attempted to document concurrency solutions for a log parsing problem which was regarded by some as being I/O-dominated, and for others led to optimised serial solutions that could outperform parallel solutions due to the scope for optimisation in most people's naive code. 
With pprocess, I bundle the PyGmy raytracer in order to demonstrate that multiple processes really do get used and can benefit programs on multiple cores, but I imagine that many people don't regard this as being "real world"-enough. How should we extend the problem on the Wiki to be something that doesn't have a workable serial solution? Does anyone have any suggestions of more realistic problems, or are we back at the level of Wide Finder? Paul From pfein at pobox.com Tue Jun 9 16:39:48 2009 From: pfein at pobox.com (Pete) Date: Tue, 9 Jun 2009 10:39:48 -0400 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <200906082310.11586.paul@boddie.org.uk> References: <200906082310.11586.paul@boddie.org.uk> Message-ID: <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> On Jun 8, 2009, at 5:10 PM, Paul Boddie wrote: On http://wiki.python.org/moin/Concurrency/99Bottles > even made a remark on the "99 Concurrent Bottles of Beer" page which > appears > to have been wide of the mark - that one would surely try and use > operating > system features in the given example in order to provide a more > optimal To clarify: my objective with this page is to give potential users a general sense of what code using a variety of toolkits/libraries/ techniques looks like. The technical term for such a collection is a chrestomathy: "a collection of similar programs written in various programming languages, for the purpose of demonstrating differences in syntax, semantics and idioms for each language" http://en.wikipedia.org/wiki/Program_Chrestomathy AFAIC, such a collection is *not* the place for optimal solutions - that would be appropriate for a benchmark (something that's probably worth doing as well). 
Accordingly, I'd encourage submitters to minimize dependencies on external libraries (other than the toolkit being demonstrated, obviously) and focus on clarity & comprehensibility for new users.[0] > implementation - and I note that Glyph appears to regard the stated > problem > as not really being concurrent. > > How should we extend the problem on the Wiki to be something that > doesn't have > a workable serial solution? The particular problem (tail | grep) came out of Beaz's class and was incredibly helpful for comparing generators vs. coroutines. We *should* find a problem that is actually concurrent - how about tail|grep'ing multiple input files? > Does anyone have any suggestions of more realistic problems, or are > we back at the level of Wide Finder? I don't see realism as the primary goal here - we could just use tail & grep after all. ;-) That said, ideas for reasonable benchmarks would be helpful - thoughts? --Pete [0] I'm -0 on the use of time.sleep() & assuming input comes in full lines. From aahz at pythoncraft.com Tue Jun 9 22:23:45 2009 From: aahz at pythoncraft.com (Aahz) Date: Tue, 9 Jun 2009 13:23:45 -0700 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> References: <200906082310.11586.paul@boddie.org.uk> <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> Message-ID: <20090609202345.GA7885@panix.com> On Tue, Jun 09, 2009, Pete wrote: > > The particular problem (tail | grep) came out of Beaz's class and was > incredibly helpful for comparing generators vs. coroutines. We *should* > find a problem that is actually concurrent - how about tail|grep'ing > multiple input files? What about a spider? 
Feel free to rip this off and rewrite as multiple processes (preferably with credit but I don't really care): http://pythoncraft.com/OSCON2001/ThreadPoolSpider.py -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra From g at rrett.us.com Tue Jun 9 22:49:25 2009 From: g at rrett.us.com (Garrett Smith) Date: Tue, 9 Jun 2009 15:49:25 -0500 (CDT) Subject: [concurrency] Common Concurrent Problems In-Reply-To: <2107601994.565351244580472505.JavaMail.root@mail-3.01.com> Message-ID: ----- "Aahz" wrote: > On Tue, Jun 09, 2009, Pete wrote: >> The particular problem (tail | grep) came out of Beaz's class and was >> incredibly helpful for comparing generators vs. coroutines. We >> *should* find a problem that is actually concurrent - how about >> tail|grep'ing multiple input files? > > What about a spider? Feel free to rip this off and rewrite as > multiple processes (preferably with credit but I don't really care): The generator/coroutines example of the pipeline is a good example of lazy evaluation in Python. That's pretty cool coming from Python! In Pete's defense, the ideas *were* presented in a workshop entitled "Concurrency" :) What about parallel quick sort? This seems to me almost a "hello world" version of a parallel algorithm. No IO though. Garrett From pfein at pobox.com Wed Jun 10 05:16:32 2009 From: pfein at pobox.com (Pete) Date: Tue, 9 Jun 2009 23:16:32 -0400 Subject: [concurrency] Common Concurrent Problems In-Reply-To: References: Message-ID: <63742584-0616-47D2-87C2-3846E0632168@pobox.com> On Jun 9, 2009, at 4:49 PM, Garrett Smith wrote: > ----- "Aahz" wrote: >> What about a spider? Feel free to rip this off and rewrite as >> multiple processes (preferably with credit but I don't really care): I worry that adding sockets into the mix complicates things (for this purpose anyway - sockets aren't all that complicated once you understand them). 
Also, this needs to be a problem that can be solved using arbitrary techniques, not necessarily multiple threads/processes (gotta make room for Twisted & friends, after all). I think the multi-file tail|grep works well here: * actually concurrent * everyone understands files * amenable to a variety of solutions * allows simplification while still being mostly correct (the incomplete line issue) > What about parallel quick sort? This seems to me almost a "hello > world" > version of a parallel algorithm. No IO though. Hmm, it seems we've got two types of concurrency: * using multiple CPUs simultaneously * "concurrent" I/O - let's just interpret this as "not grinding to a halt while waiting for input" Do we need two different problems (ugh)? At a minimum, a better definition of terms for use on the wiki/brain seems desirable. I'd rather be the consensus taker than the decision maker, so speak up. ;-) --Pete From prologic at shortcircuit.net.au Wed Jun 10 13:08:48 2009 From: prologic at shortcircuit.net.au (James Mills) Date: Wed, 10 Jun 2009 21:08:48 +1000 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <200906082310.11586.paul@boddie.org.uk> References: <200906082310.11586.paul@boddie.org.uk> Message-ID: On Tue, Jun 9, 2009 at 7:10 AM, Paul Boddie wrote: > Hello, > > I noticed recently that there's an effort to show solutions to common > concurrent problems on the python.org Wiki, and I've been using this effort > as an excuse to look over my own work and to improve it in various ways. I > even made a remark on the "99 Concurrent Bottles of Beer" page which appears > to have been wide of the mark - that one would surely try and use operating > system features in the given example in order to provide a more optimal > implementation - and I note that Glyph appears to regard the stated problem > as not really being concurrent. 
> > http://wiki.python.org/moin/Concurrency/99Bottles Hi all, I joined this list as concurrency and distributed systems interest me, and after reading the current thread of conversation I thought I'd share my implementation of the problem with a library/framework I maintain called circuits (1) http://codepad.org/6VVLVZ6g This consists of a threaded Follow Component and 2 other components responsible for line-buffering and pattern-matching. Before I post my solution to the wiki, I'll extend my implementation a bit to follow multiple files simultaneously. Best regards, cheers James 1. http://trac.softcircuit.com.au/circuits/ From prologic at shortcircuit.net.au Wed Jun 10 13:25:12 2009 From: prologic at shortcircuit.net.au (James Mills) Date: Wed, 10 Jun 2009 21:25:12 +1000 Subject: [concurrency] Common Concurrent Problems In-Reply-To: References: <200906082310.11586.paul@boddie.org.uk> Message-ID: On Wed, Jun 10, 2009 at 9:08 PM, James Mills wrote: > I joined this list as concurrency and distributed systems interest me > and after reading the current thread of conversation I thought I'd > share my implementation of the problem with a library/framework > I maintain called circuits (1) > > http://codepad.org/6VVLVZ6g > > This consists of a threaded Follow Component and 2 other components > responsible for line-buffering and pattern-matching. Before I post > my solution to the wiki, I'll extend my implementation a bit to > follow multiple files simultaneously. Here is my 2nd implementation which relies on the non-blocking/asynchronous File Component in circuits.io (which unfortunately is not cross-platform afaik) but makes the implementation much simpler. 
http://codepad.org/aPkqDFjA cheers James From jeremy.mcmillan at gmail.com Wed Jun 10 17:29:03 2009 From: jeremy.mcmillan at gmail.com (Jeremy McMillan) Date: Wed, 10 Jun 2009 10:29:03 -0500 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> References: <200906082310.11586.paul@boddie.org.uk> <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> Message-ID: <845F120C-F9F0-4815-A364-806015015A6F@gmail.com> I think we should sample a range of parallel problems so we can benchmark implementations based on some objective criteria. Concurrency problems exist on a scale that ranges from "embarrassingly parallel" to "inherently sequential" and can be benchmarked by Amdahl's Law: some tasks cannot be computed in parallel, and the minimum time to compute a parallel task depends on the maximal serially computed component. This is an ideal that will allow objective scoring of the "concurrency potential" of a given problem with Big-O notation. http://en.wikipedia.org/wiki/Embarrassingly_parallel http://en.wikipedia.org/wiki/Amdahl%27s%5Flaw [notice the urlencoded title in the URL looks like Amdahl's Flaw] In reality, doing computing in parallel comes with two overhead factors. First you have to distribute the workload. That means setting up each parallel execution context and dealing out the workload. Second, you have to coalesce (some of) the computed products of those parallel operations. Besides benchmarking performance, any given approach/tool/API may require different levels of effort to implement/read/debug those two chunks. So I guess what I'm suggesting is a three-dimensional matrix of benchmarks: x Frameworks/APIs/Tools, y representative sample problems on the scale of concurrency potential, and z performance measurements. 
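[Editor's note: the Amdahl's Law bound described above is easy to state concretely. A minimal sketch follows; the function name and sample fractions are illustrative, not from any post in this thread.]

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only `parallel_fraction` of the work
    can be spread across `workers` execution contexts (Amdahl's Law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# A task that is 95% parallelizable never beats 1/0.05 = 20x,
# no matter how many workers you throw at it.
for n in (2, 8, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

This ignores the two overhead factors mentioned above (distribution and coalescing), which only pull real speedups further below the bound.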
On Jun 9, 2009, at 9:39 AM, Pete wrote: > On Jun 8, 2009, at 5:10 PM, Paul Boddie wrote: > > On http://wiki.python.org/moin/Concurrency/99Bottles > >> even made a remark on the "99 Concurrent Bottles of Beer" page >> which appears >> to have been wide of the mark - that one would surely try and use >> operating >> system features in the given example in order to provide a more >> optimal > > To clarify: my objective with this page is to give potential users > a general sense of what code using a variety of toolkits/libraries/ > techniques looks like. The technical term for such a collection is > a chrestomathy: "a collection of similar programs written in > various programming languages, for the purpose of demonstrating > differences in syntax, semantics and idioms for each language" > http://en.wikipedia.org/wiki/Program_Chrestomathy > > AFAIC, such a collection is *not* the place for optimal solutions - > that would be appropriate a benchmark (something that's probably > worth doing as well). Accordingly, I'd encourage submitters to > minimize dependencies on external libraries (other than the toolkit > being demonstrated, obviously) and focus on clarity & > comprehensibility for new users.[0] > >> implementation - and I note that Glyph appears to regard the >> stated problem >> as not really being concurrent. >> >> How should we extend the problem on the Wiki to be something that >> doesn't have >> a workable serial solution? > > The particular problem (tail | grep) came out of Beaz's class and > was incredibly helpful for comparing generators vs. coroutines. We > *should* find a problem that is actually concurrent - how about > tail|grep'ing multiple input files? > >> Does anyone have any suggestions of more realistic problems, or >> are we back at the level of Wide Finder? > > > I don't see realism as the primary goal here - we could just use > tail & grep after all. ;-) That said, ideas for reasonable > benchmarks would be helpful - thoughts? 
> > --Pete > > [0] I'm -0 on the use of time.sleep() & assuming input comes in > full lines._______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig From Larry at Riedel.org Wed Jun 10 18:21:24 2009 From: Larry at Riedel.org (Larry Riedel) Date: Wed, 10 Jun 2009 09:21:24 -0700 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> References: <200906082310.11586.paul@boddie.org.uk> <0F1691A6-E522-41D2-BCEF-254020C53312@pobox.com> Message-ID: <7c64c2920906100921v57f424hadb3a210fe44522e@mail.gmail.com> > The technical term for such a collection is a > chrestomathy: "a collection of similar programs > written in various programming languages, for > the purpose of demonstrating differences in > syntax, semantics and idioms for each language" > http://en.wikipedia.org/wiki/Program_Chrestomathy For that I think naively I would expect to see something like a list of, for example, how to have a shared object protected using a condition variable with a mutex, to gain exclusive access to the shared object, change its state, and signal threads/processes waiting on the condition variable that the object state has changed, and to see how to do these things using Python, java.util.concurrent, pthreads, etc. And likewise how to perform other common fundamental concurrent operations in various languages. I realize there is a separate orthogonal aspect, which is how to solve a particular problem using either a shared object approach, a message passing or parallel approach, or ..., but to me that aspect is something which could be treated independently of showing how to implement each style of solution using Python. 
In other words, for me, there are already lots of sources for information about what are the different concurrent approaches to solving a particular problem, and what I would be interested in is seeing how to idiomatically implement a particular approach using Python. And I would especially like to see where certain approaches that are not likely to work well with CPython might work fine using Jython or IronPython. Larry From simonwittber at gmail.com Thu Jun 11 06:11:15 2009 From: simonwittber at gmail.com (Simon Wittber) Date: Thu, 11 Jun 2009 12:11:15 +0800 Subject: [concurrency] Common Concurrent Problems In-Reply-To: <200906082310.11586.paul@boddie.org.uk> References: <200906082310.11586.paul@boddie.org.uk> Message-ID: <4e4a11f80906102111i4be2bcbeyd8482d2c4b656bf6@mail.gmail.com> On Tue, Jun 9, 2009 at 5:10 AM, Paul Boddie wrote: > Hello, > > I noticed recently that there's an effort to show solutions to common > concurrent problems on the python.org Wiki, and I've been using this effort > as an excuse to look over my own work and to improve it in various ways. I > even made a remark on the "99 Concurrent Bottles of Beer" page which appears > to have been wide of the mark - that one would surely try and use operating > system features in the given example in order to provide a more optimal > implementation - and I note that Glyph appears to regard the stated problem > as not really being concurrent. I agree with your comments; however, the examples do provide an entry point towards understanding the different toolkits. They also show how message passing is handled, terseness and clarity, as well as how much boilerplate code is needed to set things up. I've just added an example for Stackless, and an example for Fibra. 
-- :: Simon Wittber :: http://www.linkedin.com/in/simonwittber :: phone: +61.4.0135.0685 :: jabber/msn: simonwittber at gmail.com From pfein at pobox.com Fri Jun 12 17:16:12 2009 From: pfein at pobox.com (Pete) Date: Fri, 12 Jun 2009 11:16:12 -0400 Subject: [concurrency] Inside the Python GIL References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> Message-ID: <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> I didn't attend last night's UG, but I saw Dave give a version of this talk about a month ago. I'll second Carl's opinion - this talk is of critical importance to anyone using threads in Python. Begin forwarded message: > From: Carl Karsten > Date: June 12, 2009 10:51:33 AM EDT > To: The Chicago Python Users Group > Subject: Re: [Chicago] Posted : Video > > * David Beazley: mind-blowing presentation about how the Python GIL > actually works and why it's even worse than most people even imagine. > http://blip.tv/file/2232410 http://www.dabeaz.com/python/GIL.pdf From jnoller at gmail.com Fri Jun 12 17:45:45 2009 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 12 Jun 2009 11:45:45 -0400 Subject: [concurrency] Inside the Python GIL In-Reply-To: <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> Message-ID: <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> Really? Is this the worst thing ever? How many of us building heavily threaded I/O bound applications are truly hampered by this? Yes; this sucks for CPU bound applications, that's been known since the earth cooled. I, and many others, have been using threads in python w/o issue; now that multiprocessing is in core, when I do run into a limitation, I simply swap out the imports in many cases. On Fri, Jun 12, 2009 at 11:16 AM, Pete wrote: > I didn't attend last night's UG, but I saw Dave give a version of this talk > about a month ago. 
I'll second Carl's opinion - this talk is of critical > importance to anyone using threads in Python. > > Begin forwarded message: > >> From: Carl Karsten >> Date: June 12, 2009 10:51:33 AM EDT >> To: The Chicago Python Users Group >> Subject: Re: [Chicago] Posted : Video >> >> * David Beazley: mind-blowing presentation about how the Python GIL >> actually works and why it's even worse than most people even imagine. >> http://blip.tv/file/2232410 http://www.dabeaz.com/python/GIL.pdf > > _______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig > From jyasskin at gmail.com Fri Jun 12 17:52:37 2009 From: jyasskin at gmail.com (Jeffrey Yasskin) Date: Fri, 12 Jun 2009 08:52:37 -0700 Subject: [concurrency] Inside the Python GIL In-Reply-To: <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> Message-ID: <5d44f72f0906120852i36fbe33di156f5455496d9ab4@mail.gmail.com> The slides make perfect sense. As he says, the open question is what to do about it. If someone can write a relatively simple patch to improve the behavior, with a test to make sure it stays improved, I think it would have a very good chance of getting accepted into CPython. A complex patch would have less chance because of Jesse's answer. :) Unladen Swallow (http://code.google.com/p/unladen-swallow/source/browse/tests/perf.py) would accept a benchmark just measuring the problem even without a suggestion to improve it. Here's a discussion that may illustrate why fixing this is tough: * On a multicore machine, a waiting thread has to do some amount of work to wake up. A reasonable ballpark is ~1us. It makes sense to let the foreground thread continue making progress while the background thread is waking up, especially since the OS may not choose to wake up a Python thread first. 
So we let the foreground thread re-acquire the GIL immediately after releasing it, in the hope that it can get a couple more checks in before the background thread actually wakes up. BUT, we don't really want to let it continue running after the waiting thread does wake up, so perhaps we should have the waiting thread set a flag when it does wake up which forces the foreground thread to sleep asap. Then the waiting thread has to wait for the GIL again, but we DON'T want it to hand control back to the OS or we would have wasted that waking-up time. So maybe we have it spin-wait. But what happens if the OS has actually swapped out the foreground thread for another process? Then we waste lots of time. I don't know of any OSes that give us a way to do something when a thread gets swapped out. They don't even let another thread check whether a given thread is currently running. * On a single core, any time the foreground thread spends executing after signaling a waiting thread is time the waiting thread can't use to wake up. So it makes sense to force a context switch to a particular waiting thread. This is actually pretty easy: instead of a GIL, we have a binary semaphore per thread that gets upped to instruct a particular thread to run, and then the previously-running thread immediately waits on its own semaphore. The issue here is just the time it takes to switch threads: ~1us. The GIL checks are currently every 100 ticks (every couple opcodes), which means that in arithmetic-heavy code those checks occur on the order of every microsecond too. You don't want to spend half of your time switching threads. On the other hand, as Dave pointed out, sometimes even 100 ticks isn't soon enough. I think we could solve this by checking the elapsed time on each "check" rather than unconditionally switching threads, but we might want to do something to give I/O-bound threads higher priority. 
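[Editor's note: the single-core handoff idea above - one binary semaphore per thread, upped to wake one specific successor - can be sketched as a toy in pure Python. This is only an illustration of the scheduling discipline, not a GIL patch; all names are invented.]

```python
import threading

def make_round_robin(num_threads, steps, log):
    # One binary semaphore per thread; thread 0 holds the "token" first.
    sems = [threading.Semaphore(1 if i == 0 else 0) for i in range(num_threads)]

    def worker(i):
        for step in range(steps):
            sems[i].acquire()                      # block until handed the token
            log.append((i, step))                  # "run a few bytecodes"
            sems[(i + 1) % num_threads].release()  # hand off to a *specific* thread

    return [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]

log = []
threads = make_round_robin(3, 4, log)
for t in threads:
    t.start()
for t in threads:
    t.join()
print(log)  # strict round-robin order, regardless of what the OS scheduler does
```

Unlike one shared lock that any waiter may grab, releasing a chosen thread's semaphore forces a deterministic switch - exactly the property wanted on one core, and exactly the ~1us cost that hurts if switches happen every few opcodes.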
Anyway, I'm not likely to work on this any time soon, but I'm happy to review any patches someone else produces. :) On Fri, Jun 12, 2009 at 8:16 AM, Pete wrote: > I didn't attend last night's UG, but I saw Dave give a version of this talk > about a month ago. I'll second Carl's opinion - this talk is of critical > importance to anyone using threads in Python. > > Begin forwarded message: > >> From: Carl Karsten >> Date: June 12, 2009 10:51:33 AM EDT >> To: The Chicago Python Users Group >> Subject: Re: [Chicago] Posted : Video >> >> * David Beazley: mind-blowing presentation about how the Python GIL >> actually works and why it's even worse than most people even imagine. >> http://blip.tv/file/2232410 http://www.dabeaz.com/python/GIL.pdf > From jeremy at alum.mit.edu Fri Jun 12 18:08:27 2009 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 12 Jun 2009 12:08:27 -0400 Subject: [concurrency] Inside the Python GIL In-Reply-To: <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> Message-ID: On Fri, Jun 12, 2009 at 11:45 AM, Jesse Noller wrote: > Really? Is this the worst thing ever? How many of us building heavily > threaded I/O bound applications are truly hampered by this? Yes; this > sucks for CPU bound applications, that's been known since the earth > cooled. I'm not sure I understand how to distinguish between I/O bound threads and CPU bound threads. If you've got a relatively simple multi-threaded application like an HTTP fetcher with a thread pool fetching a lot of urls, you're probably going to end up having more than one thread with input to process at any instant. There's a ton of Python code that executes when that happens. You've got a urllib addinfourl wrapper, a httplib HTTPResponse (with read & _safe_read) and a socket _fileobject. 
Heaven help you if you are using readline. So I could imagine even this trivial I/O bound program having lots of CPU contention. Jeremy > I, and many others, have been using threads in python w/o issue, now > that multiprocessing is in core, when I do run into a limitation, I > simply swap out the imports in many cases. > > On Fri, Jun 12, 2009 at 11:16 AM, Pete wrote: >> I didn't attend last night's UG, but I saw Dave give a version of this talk >> about a month ago. I'll second Carl's opinion - this talk is of critical >> importance to anyone using threads in Python. >> >> Begin forwarded message: >> >>> From: Carl Karsten >>> Date: June 12, 2009 10:51:33 AM EDT >>> To: The Chicago Python Users Group >>> Subject: Re: [Chicago] Posted : Video >>> >>> * David Beazley: mind-blowing presentation about how the Python GIL >>> actually works and why it's even worse than most people even imagine. >>> http://blip.tv/file/2232410 http://www.dabeaz.com/python/GIL.pdf >> >> _______________________________________________ >> concurrency-sig mailing list >> concurrency-sig at python.org >> http://mail.python.org/mailman/listinfo/concurrency-sig >> > _______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig > From jnoller at gmail.com Fri Jun 12 18:18:12 2009 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 12 Jun 2009 12:18:12 -0400 Subject: [concurrency] Inside the Python GIL In-Reply-To: References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> Message-ID: <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com> On Fri, Jun 12, 2009 at 12:08 PM, Jeremy Hylton wrote: > On Fri, Jun 12, 2009 at 11:45 AM, Jesse Noller wrote: >> Really? Is this the worst thing ever? 
How many of us building heavily >> threaded I/O bound applications are truly hampered by this? Yes; this >> sucks for CPU bound applications, that's been known since the earth >> cooled. > > I'm not sure I understand how to distinguish between I/O bound threads > and CPU bound threads. If you've got a relatively simple > multi-threaded application like an HTTP fetcher with a thread pool > fetching a lot of urls, you're probably going to end up having more > than one thread with input to process at any instant. There's a ton > of Python code that executes when that happens. You've got a urllib > addinfourl wrapper, a httplib HTTPResponse (with read & _safe_read) > and a socket _fileobject. Heaven help you if you are using readline. > So I could imagine even this trivial I/O bound program having lots of > CPU contention. > > Jeremy > Speaking as someone who does have lots of apps doing heavily threaded URL fetching (puts, gets, deletes) - the GIL ends up not bothering me, and threading does speed things up (but not as much as I'd like). I tend to push heavier data parsing off via multiprocessing, and stick to just threads for the GET/PUT/POSTS. I had one benchmark in PEP 371 which did url fetching (http://www.python.org/dev/peps/pep-0371/):

cmd: python run_benchmarks.py url_get.py
Importing url_get
Starting tests ...
non_threaded (1 iters) 0.124774 seconds
threaded (1 threads) 0.120478 seconds
processes (1 procs) 0.121404 seconds
non_threaded (2 iters) 0.239574 seconds
threaded (2 threads) 0.146138 seconds
processes (2 procs) 0.138366 seconds
non_threaded (4 iters) 0.479159 seconds
threaded (4 threads) 0.200985 seconds
processes (4 procs) 0.188847 seconds
non_threaded (8 iters) 0.960621 seconds
threaded (8 threads) 0.659298 seconds
processes (8 procs) 0.298625 seconds

For heavy http handling though, I rapidly move to using pycurl, rather than httplib, which of course brings a C module into play and allows me to sidestep some of the issues even more. 
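[Editor's note: the "swap out the imports" trick mentioned above works because multiprocessing deliberately mirrors the threading API. A minimal sketch, in modern spelling, with a stand-in task rather than real HTTP:]

```python
USE_PROCESSES = False  # flip to True to move CPU-bound work off the GIL

if USE_PROCESSES:
    # multiprocessing.Process and .Queue mirror their threading counterparts.
    from multiprocessing import Process as Worker, Queue
else:
    from threading import Thread as Worker
    from queue import Queue

def fetch(url, results):
    # Stand-in for real work (e.g. an HTTP GET); just records a fake "response".
    results.put((url, len(url)))

results = Queue()
workers = [Worker(target=fetch, args=(u, results)) for u in ("a", "bb", "ccc")]
for w in workers:
    w.start()
for w in workers:
    w.join()
out = sorted(results.get() for _ in workers)
print(out)  # identical code path whichever Worker was imported
```

(With processes on platforms that spawn rather than fork, the launching code would also need an `if __name__ == "__main__":` guard.)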
Note that I'm not advocating/saying "things are fine as is" - I'm a pretty squeaky wheel when it comes to making this space (threads, the GIL, etc) better. Right now, my biggest thing to watch is unladen-swallow in this regard, as I don't see a lot of movement for this in core today. However, that being said; I think people get hung up on the GIL before even knowing if it does affect their application, and are too quick to discount python threads as a whole before figuring it out for themselves. jesse From skip at pobox.com Fri Jun 12 19:39:34 2009 From: skip at pobox.com (skip at pobox.com) Date: Fri, 12 Jun 2009 12:39:34 -0500 Subject: [concurrency] Inside the Python GIL In-Reply-To: References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> Message-ID: <18994.37590.103665.53874@montanaro.dyndns.org> Jeremy> I'm not sure I understand how to distinguish between I/O bound Jeremy> threads and CPU bound threads. I don't know that we can (people writing bits of Python which operate on threads). I suspect a useful distinction though is that an I/O bound thread mostly gives up the CPU to wait on an I/O device, while a CPU bound thread is mostly "evicted" from the CPU by the OS scheduler. (Though I sort of suspect you already understand this textbook definition.) Is that what you're referring to by "not sure I understand"? 
Skip From aahz at pythoncraft.com Fri Jun 12 19:50:00 2009 From: aahz at pythoncraft.com (Aahz) Date: Fri, 12 Jun 2009 10:50:00 -0700 Subject: [concurrency] Inside the Python GIL In-Reply-To: References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> Message-ID: <20090612175000.GA26972@panix.com> On Fri, Jun 12, 2009, Jeremy Hylton wrote: > > I'm not sure I understand how to distinguish between I/O bound threads > and CPU bound threads. If you've got a relatively simple > multi-threaded application like an HTTP fetcher with a thread pool > fetching a lot of urls, you're probably going to end up having more > than one thread with input to process at any instant. There's a ton > of Python code that executes when that happens. You've got a urllib > addinfourl wrapper, a httplib HTTPResponse (with read & _safe_read) > and a socket _fileobject. Heaven help you if you are using readline. > So I could image even this trivial I/O bound program having lots of > CPU contention. You could imagine, but have you tested it? ;-) Back in the 1.5.2 days, I helped write a web crawler where the sweet spot was around twenty or thirty threads. That clearly indicates a significant I/O bottleneck. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." 
--Henry Spencer

From dave at dabeaz.com Fri Jun 12 19:27:13 2009
From: dave at dabeaz.com (David Beazley)
Date: Fri, 12 Jun 2009 12:27:13 -0500
Subject: [concurrency] Inside the Python GIL
In-Reply-To: <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com>
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com>
Message-ID:

> However, that being said, I think people get hung up on the GIL before
> even knowing if it does affect their application, and are too quick to
> discount python threads as a whole before figuring it out for
> themselves.

I agree. I'd even go so far as to say that more people should probably go pick up an operating systems text and look at it. In the big picture, the GIL doesn't really matter if everything stays I/O bound. It's only when programs start to drift away from I/O processing that things start to get fuzzy. Obviously, the material I presented in the talk is at the opposite extreme (where there is heavy CPU processing). The real question is what is happening for programs that sit somewhere in the middle of that space. I honestly don't know.

Cheers,
Dave

From jeremy.mcmillan at gmail.com Fri Jun 12 21:45:43 2009
From: jeremy.mcmillan at gmail.com (Jeremy McMillan)
Date: Fri, 12 Jun 2009 14:45:43 -0500
Subject: [concurrency] Inside the Python GIL
In-Reply-To: <18994.37590.103665.53874@montanaro.dyndns.org>
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <18994.37590.103665.53874@montanaro.dyndns.org>
Message-ID:

Proof is in the pudding. If you can cook up a load test, under what conditions can you soak up all of your CPU's cycles?
On Jun 12, 2009, at 12:39 PM, skip at pobox.com wrote:
>
>     Jeremy> I'm not sure I understand how to distinguish between I/O bound
>     Jeremy> threads and CPU bound threads.
>
> I don't know that we can (people writing bits of Python which operate on
> threads). I suspect a useful distinction though is that an I/O bound thread
> mostly gives up the CPU to wait on an I/O device, while a CPU bound thread
> is mostly "evicted" from the CPU by the OS scheduler. (Though I sort of
> suspect you already understand this textbook definition.) Is that what
> you're referring to by "not sure I understand"?
>
> Skip
> _______________________________________________
> concurrency-sig mailing list
> concurrency-sig at python.org
> http://mail.python.org/mailman/listinfo/concurrency-sig

From jeremy at alum.mit.edu Fri Jun 12 21:58:21 2009
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 12 Jun 2009 15:58:21 -0400
Subject: [concurrency] Inside the Python GIL
In-Reply-To:
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <18994.37590.103665.53874@montanaro.dyndns.org>
Message-ID:

Yeah, by "not sure I understand" I meant that I'd be interested to see how the GIL behaves for some other workloads. I have some angst about the number of abstraction layers in the std library for I/O, so I'm curious about how the interpreter would do as the number of I/O threads increases.

Jeremy

On Fri, Jun 12, 2009 at 3:45 PM, Jeremy McMillan wrote:
> Proof is in the pudding. If you can cook up a load test, under what
> conditions can you soak up all of your CPU's cycles?
>
> On Jun 12, 2009, at 12:39 PM, skip at pobox.com wrote:
>
>>
>>     Jeremy> I'm not sure I understand how to distinguish between I/O bound
>>     Jeremy> threads and CPU bound threads.
>>
>> I don't know that we can (people writing bits of Python which operate on
>> threads).
>> I suspect a useful distinction though is that an I/O bound thread
>> mostly gives up the CPU to wait on an I/O device, while a CPU bound thread
>> is mostly "evicted" from the CPU by the OS scheduler. (Though I sort of
>> suspect you already understand this textbook definition.) Is that what
>> you're referring to by "not sure I understand"?
>>
>> Skip
>> _______________________________________________
>> concurrency-sig mailing list
>> concurrency-sig at python.org
>> http://mail.python.org/mailman/listinfo/concurrency-sig
>
> _______________________________________________
> concurrency-sig mailing list
> concurrency-sig at python.org
> http://mail.python.org/mailman/listinfo/concurrency-sig
>

From ats at offog.org Sat Jun 13 01:44:42 2009
From: ats at offog.org (Adam Sampson)
Date: Sat, 13 Jun 2009 00:44:42 +0100
Subject: [concurrency] Inside the Python GIL
In-Reply-To: <5d44f72f0906120852i36fbe33di156f5455496d9ab4@mail.gmail.com> (Jeffrey Yasskin's message of "Fri, 12 Jun 2009 08:52:37 -0700")
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <5d44f72f0906120852i36fbe33di156f5455496d9ab4@mail.gmail.com>
Message-ID:

Jeffrey Yasskin writes:

> I think we could solve this by checking the elapsed time on each
> "check" rather than unconditionally switching threads,

This is a reasonable strategy, and one that's used elsewhere. One problem is that reading the clock can be quite expensive, depending on the OS and hardware -- although you may be able to get away with a cheap-but-inaccurate clock (e.g. the IA32 TSC). This is a fairly common problem for lightweight threading frameworks, so it's probably worth looking at how (for example) the GHC runtime solves it.

I think your suggestion that each thread should have its own signal is entirely sensible.
It'd be less efficient than a proper lightweight threading implementation (which is hard to do portably), but it'd at least avoid the expensive contention on the lock in the current approach, allow smarter scheduling for handling I/O and signals, and make it easier to reason about Python's scheduling behaviour. Managing the queues of processes to be run can be done in a wait-free way, if atomics are available, so locking can go away entirely.

It's worth noting that a thread can decide to reschedule for two reasons: either because it's hit one of the periodic checks, in which case it can be woken up again immediately and can thus just wait, or because it wants to go away and do something that doesn't involve the Python runtime state, in which case it'll need to explicitly requeue itself before waiting. This also gives you a fairly crude way to distinguish CPU-bound (the first case) and I/O-bound (the second case) threads; in the first case, you might want to prefer immediately rescheduling the same thread most of the time if there are no higher-priority threads waiting, to reduce cache thrashing.

Somewhat apropos of this, I did an extremely grotty hack a while ago to build a Python threading implementation on top of CCSP, our multicore lightweight threads runtime:

Patch: http://offog.org/stuff/python-2.5-ccsp-v1.diff
CCSP is part of KRoC: http://projects.cs.kent.ac.uk/projects/kroc/

(It's grotty because CCSP's lock semantics don't match Python's, and because there's no way of getting a "thread identifier" from CCSP directly. Both would be fixable with changes to CCSP.)

The benchmark results for pure Python programs were largely unimpressive (as I expected), since the cheap communication was swamped by the amount of time spent claiming and releasing the GIL...
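[Editor's note: the "check the elapsed time on each check" idea can be modelled in a few lines. This is a toy sketch, not CPython's actual implementation; the quantum value and check spacing are arbitrary. A periodic check consults the clock and only triggers a switch once the running thread has held the interpreter for a full quantum.]

```python
import time

class TimedCheck(object):
    """Toy model of time-based switching: each periodic 'check' reads a
    clock and only releases the (notional) GIL after a full quantum,
    instead of unconditionally offering a switch every N instructions."""

    def __init__(self, quantum):
        self.quantum = quantum           # seconds a thread may keep running
        self.last_switch = time.monotonic()
        self.switches = 0

    def check(self):
        # Called every N "instructions", like sys.setcheckinterval's checks.
        # This clock read is itself the cost discussed above.
        now = time.monotonic()
        if now - self.last_switch >= self.quantum:
            self.switches += 1           # a real interpreter would release the GIL here
            self.last_switch = now
            return True
        return False
```

The clock read on every check is exactly the expense noted above, which is why a cheap-but-inaccurate source such as the TSC may be preferable to a full system call.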
-- Adam Sampson

From john at szakmeister.net Sat Jun 13 13:30:45 2009
From: john at szakmeister.net (John Szakmeister)
Date: Sat, 13 Jun 2009 07:30:45 -0400
Subject: [concurrency] Inside the Python GIL
In-Reply-To:
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com>
Message-ID:

On Fri, Jun 12, 2009 at 1:27 PM, David Beazley wrote:
[snip]
> I agree. I'd even go so far as to say that more people should probably go
> pick up an operating systems text and look at it. In the big picture, the
> GIL doesn't really matter if everything stays I/O bound. It's only when
> programs start to drift away from I/O processing that things start to get
> fuzzy. Obviously, the material I presented in the talk is at the opposite
> extreme (where there is heavy CPU processing). The real question is what
> is happening for programs that sit somewhere in the middle of that space.
> I honestly don't know.

FWIW, I just patched my py3k branch to use native Mach semaphores instead of the mutex/condition variable combo, and it had a fairly substantial savings in terms of system calls. I'll see if I can get that into some form that's acceptable for inclusion into the core. It obviously doesn't fix the greater problem, but at least makes things more well behaved on Mac.
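[Editor's note: John's patch swaps a mutex-plus-condition-variable pair for a single native semaphore. The shape of that substitution can be sketched at the Python level. This is illustrative only: threading.Semaphore is itself built on a condition variable in pure Python, so it shows the interface change, not the syscall savings a native Mach semaphore_wait/semaphore_signal pair provides.]

```python
import threading

class SemaphoreLock(object):
    """A binary semaphore used as a GIL-style lock: handing the lock to a
    waiter is a single signal/wait pair, rather than a lock-the-mutex,
    signal-the-condvar, unlock-the-mutex sequence."""

    def __init__(self):
        self._sem = threading.Semaphore(1)

    def acquire(self):
        self._sem.acquire()    # wait: blocks until a holder signals

    def release(self):
        self._sem.release()    # signal: lets one waiter proceed

# The interpreter-side usage pattern stays the same:
gil = SemaphoreLock()
gil.acquire()
# ... run a batch of bytecodes ...
gil.release()
```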
-John

From jeremy.mcmillan at gmail.com Sat Jun 13 18:54:49 2009
From: jeremy.mcmillan at gmail.com (Jeremy McMillan)
Date: Sat, 13 Jun 2009 11:54:49 -0500
Subject: [concurrency] Inside the Python GIL
In-Reply-To:
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com>
Message-ID: <0A4D5002-D3CA-4DDF-BC35-093B48BF9C4D@gmail.com>

I'd like to help test that patch!

On Jun 13, 2009, at 6:30 AM, John Szakmeister wrote:
> On Fri, Jun 12, 2009 at 1:27 PM, David Beazley wrote:
> [snip]
>> I agree. I'd even go so far as to say that more people should probably go
>> pick up an operating systems text and look at it. In the big picture, the
>> GIL doesn't really matter if everything stays I/O bound. It's only when
>> programs start to drift away from I/O processing that things start to get
>> fuzzy. Obviously, the material I presented in the talk is at the opposite
>> extreme (where there is heavy CPU processing). The real question is what
>> is happening for programs that sit somewhere in the middle of that space.
>> I honestly don't know.
>
> FWIW, I just patched my py3k branch to use native Mach semaphores
> instead of the mutex/condition variable combo, and it had a fairly
> substantial savings in terms of system calls. I'll see if I can get
> that into some form that's acceptable for inclusion into the core. It
> obviously doesn't fix the greater problem, but at least makes things
> more well behaved on Mac.
>
> -John
> _______________________________________________
> concurrency-sig mailing list
> concurrency-sig at python.org
> http://mail.python.org/mailman/listinfo/concurrency-sig

From piet at cs.uu.nl Mon Jun 15 21:57:51 2009
From: piet at cs.uu.nl (Piet van Oostrum)
Date: Mon, 15 Jun 2009 21:57:51 +0200
Subject: [concurrency] Inside the Python GIL
In-Reply-To: <20090612175000.GA26972@panix.com>
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <20090612175000.GA26972@panix.com>
Message-ID: <18998.42943.38536.869443@cochabamba.local>

>>>>> Aahz (A) wrote:

>A> On Fri, Jun 12, 2009, Jeremy Hylton wrote:
>>>
>>> I'm not sure I understand how to distinguish between I/O bound threads
>>> and CPU bound threads. If you've got a relatively simple
>>> multi-threaded application like an HTTP fetcher with a thread pool
>>> fetching a lot of urls, you're probably going to end up having more
>>> than one thread with input to process at any instant. There's a ton
>>> of Python code that executes when that happens. You've got a urllib
>>> addinfourl wrapper, a httplib HTTPResponse (with read & _safe_read)
>>> and a socket _fileobject. Heaven help you if you are using readline.
>>> So I could imagine even this trivial I/O bound program having lots of
>>> CPU contention.

>A> You could imagine, but have you tested it? ;-) Back in the 1.5.2 days,
>A> I helped write a web crawler where the sweet spot was around twenty or
>A> thirty threads. That clearly indicates a significant I/O bottleneck.

I have written a small script to test this. It fires up a couple of threads (or does everything unthreaded) that each fetch a couple of web pages (random Google searches, to be precise). It then measures some things like the CPU percentage (using the psutil module, but you could also do it with the ps command of course).
You can also choose to do some CPU processing, such as HTML parsing or hash calculation, and to write something to a file.

I noticed some 5-15% CPU utilisation on my 2-core MacBook, when at home on a 4 Mb/s ADSL line. So it is apparently I/O bound. I guess on the high-speed university network the CPU load may be a bit higher; I'll test that tomorrow at work.

And with respect to readline, I don't think there are problems with that in newer Python versions. My program has an option to use readline instead of read and I see no significant differences.

Anyway, here is the program.

-------------- next part --------------
#!/usr/bin/env python
# Author: Piet van Oostrum
# This software is free (no rights reserved).
"""
This program tries to test the speed of fetching web pages and doing some
processing on them in a multithreaded environment. The main purpose is to
see how much CPU time it uses so that we might draw some conclusions about
the effectiveness of using threads in Python. Normally O.S. threads should
help to get greater throughput, but Python's GIL may hinder this.

The web pages will be the results of some Google searches.

You call this program with the following command line args:
- number of pages to be fetched
- number of threads to be used.
    0 means do everything in main thread
  > 0 means start that many threads
- flags: r   use readline instead of read
         h   calculate SHA1 and MD5 hashes of the pages
         p   do some HTML parsing on the pages
         w   write some information to logfile (length and/or calculated hash)
"""
import sys
import os
from random import random
import urllib2
import hashlib
import psutil

process = psutil.Process(os.getpid())

import time
start_time = time.time()

def usage(help):
    progname = sys.argv[0]
    if help:
        print __doc__
    else:
        print >> sys.stderr, """Usage: %s npages nthreads flags
For more help: %s help
""" % (progname, progname)
    sys.exit(1)

from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.ntags = 0
        self.depth = 0
        self.maxdepth = 0

    def handle_starttag(self, tag, attrs):
        self.ntags += 1
        self.depth += 1
        if self.depth > self.maxdepth:
            self.maxdepth = self.depth

    def handle_endtag(self, tag):
        self.depth -= 1

class DummyLock(object):
    '''Dummy Lock class only used as context handler
    (therefore no acquire and release necessary)
    '''
    def __enter__(self):
        pass
    def __exit__(self, et, ev, tb):
        pass

# get some search terms
words = """acutely alarmclock anaesthesia antitypical arteries autochthones
bargain bestowal blondes brazen butterfingers buttermilk captions cedarwood
cherries circumference codification compliments contagious cotangent
crucified daiquiri defence deplete diagrams discontinue dixieland ducts
elastomers endodontist epistemic evaporator extravert fertilizer flicker
fortuitous futurology geometry godzilla grovel handwriter hemlock hologram
hydrologic ikebana incite ingrowth internally islamization jungle kurdish
leftmost lipstick lymphocyte manufactory melancholia nests nonharmonic
obscene opus overabundant pagesize partaker percolator philosophy pirouette
policy preacher primogenital protuberance pyrite rangers reconvert reindeer
reroute rhapsody rudeness saturday scurry servant sidewalk slurry soul
sprawl still subentry supersede temper thorny tortilla trichome twine
undercover unload unwed velcro vocation wheel wrong zoologic""".split()
nwords = len(words)

google = "http://www.google.nl/search?q="
logfile = "testthreads.log"
BUFSIZE = 1024

try:
    if sys.argv[1].strip().lower() == 'help':
        usage(True)
    npages = int(sys.argv[1])
    nthreads = int(sys.argv[2])
    if len(sys.argv) < 4:
        flags = ''
    else:
        flags = sys.argv[3]
except (ValueError, IndexError):
    usage(False)

use_readline = 'r' in flags
do_hash = 'h' in flags
do_parse = 'p' in flags
do_write = 'w' in flags

user_agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_4_11; en) AppleWebKit/525.28.3 (KHTML, like Gecko)"
headers = { 'User-Agent' : user_agent }

def doit(np, lock):
    '''Fetch np web pages. lock will be used for exclusive access to the
    log file. Global variables do_hash and do_write will determine the
    behaviour.
    '''
    for i in range(np):
        url = google + "+".join((words[int(nwords * random())]
                                 for w in range(3)))
        req = urllib2.Request(url, None, headers)
        doc = urllib2.urlopen(req)
        docsize = 0
        if do_hash:
            h1 = hashlib.sha1()
            h2 = hashlib.md5()
        if do_parse:
            parser = MyHTMLParser()
        while True:
            if use_readline:
                data = doc.readline()
            else:
                data = doc.read(BUFSIZE)
            if not data:
                break
            docsize += len(data)
            if do_hash:
                h1.update(data)
                h2.update(data)
            if do_parse:
                parser.feed(data)
        if do_parse:
            parser.close()
        if do_write:
            with lock:
                log = open(logfile, 'a')
                print >>log, "URL: %s, size: %d" % (url, docsize)
                if do_hash:
                    print >>log, "sha1:", h1.hexdigest()
                    print >>log, "md5:", h2.hexdigest()
                if do_parse:
                    print >>log, "Read %d tags, max depth: %d" % \
                          (parser.ntags, parser.maxdepth)
                log.close()

def start_thread(np, lock):
    '''Start a new thread fetching np pages, using lock for exclusive
    access to the logfile. The thread is put in the running_threads list.
    '''
    thr = threading.Thread(target = doit, args = (np, lock))
    thr.start()
    running_threads.append(thr)

running_threads = []
lock = DummyLock()

if nthreads == 0:
    doit(npages, lock)
else:
    import threading
    np = npages//nthreads
    np1 = npages - np*(nthreads - 1)
    if do_write:
        lock = threading.Lock()
    start_thread(np1, lock)
    for i in range(1, nthreads):
        start_thread(np, lock)
    # Wait for all threads to finish
    for thr in running_threads:
        thr.join()

print "CPU time (system): %.2f, (user): %.2f secs." % process.get_cpu_times()
print "Elapsed time: %.2f secs." % (time.time() - start_time)
print "CPU utilisation: %.2f %%" % process.get_cpu_percent()
-------------- next part --------------
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org

From john at szakmeister.net Sat Jun 20 12:34:53 2009
From: john at szakmeister.net (John Szakmeister)
Date: Sat, 20 Jun 2009 06:34:53 -0400
Subject: [concurrency] Inside the Python GIL
In-Reply-To: <0A4D5002-D3CA-4DDF-BC35-093B48BF9C4D@gmail.com>
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com> <0A4D5002-D3CA-4DDF-BC35-093B48BF9C4D@gmail.com>
Message-ID:

On Sat, Jun 13, 2009 at 12:54 PM, Jeremy McMillan wrote:
>
> I'd like to help test that patch!

Sorry for the late response; I've been trying to clean it up (it was *awfully* ugly before). I'll definitely pass it your way... I'm going to be out of town over the weekend, so hopefully some time next week.

I'm also trying to make sure my results are correct. I had a fair amount going on at the time, and perhaps too many things mixed together. I may have had some I/O that was skewing the results. I'm still looking at it all.
-John

From jeremy.mcmillan at gmail.com Sat Jun 20 20:11:32 2009
From: jeremy.mcmillan at gmail.com (Jeremy McMillan)
Date: Sat, 20 Jun 2009 13:11:32 -0500
Subject: [concurrency] Inside the Python GIL
In-Reply-To:
References: <549053140906120751u15b2e494r9ce1c3201a04215a@mail.gmail.com> <79E6039F-2F2C-4BBE-9F48-FFF0457CB999@pobox.com> <4222a8490906120845u7015fa81ne3b2d02b108c44be@mail.gmail.com> <4222a8490906120918v1a210f99i69149c89e1f2f5f8@mail.gmail.com> <0A4D5002-D3CA-4DDF-BC35-093B48BF9C4D@gmail.com>
Message-ID: <73F681A9-0177-4DE8-9D21-9F8E91F2A1A1@gmail.com>

Happy Father's Day (to you and/or the fathers in your family).

I'd like to take a crack at it without peeking under the hood. BTW: I'm getting a new 8-core Mac Pro soon, and I'm very interested in lubricating my Python works so I can take advantage of it. ---TIA.

On Jun 20, 2009, at 5:34 AM, John Szakmeister wrote:
> On Sat, Jun 13, 2009 at 12:54 PM, Jeremy McMillan
> wrote:
>>
>> I'd like to help test that patch!
>
> Sorry for the late response, I've been trying to clean it up (it was
> *awfully* ugly before). I'll definitely pass it your way... I'm going
> to be out of town over the weekend, so hopefully some time next week.
>
> I'm also trying to make sure my results are correct. I had a fair
> amount going on at the time, and perhaps too many things mixed
> together. I may have had some I/O that was skewing the results. I'm
> still looking at it all.
>
> -John
> _______________________________________________
> concurrency-sig mailing list
> concurrency-sig at python.org
> http://mail.python.org/mailman/listinfo/concurrency-sig