Please help with Threading

Carlos Nepomuceno carlosnepomuceno at outlook.com
Sat May 18 20:02:29 EDT 2013


----------------------------------------
> To: python-list at python.org
> From: wlfraed at ix.netcom.com
> Subject: Re: Please help with Threading
> Date: Sat, 18 May 2013 15:28:56 -0400
>
> On Sat, 18 May 2013 01:58:13 -0700 (PDT), Jurgens de Bruin
> <debruinjj at gmail.com> declaimed the following in
> gmane.comp.python.general:
>
>> This is my first script where I want to use the Python threading module. I have a large dataset which is a list of dicts; there can be as many as 200 dictionaries in the list. The final goal is a histogram for each dict, 16 histograms on a page (4x4) - this already works.
>> What I currently do is create a nested list [ [ {} ], [ {} ] ]; each inner list contains 16 dictionaries, thus each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long, so I would like multiple inner lists to be processed simultaneously, creating the graphs in "parallel".
>> I am trying to use Python threading for this. I create 4 threads, loop over the outer list and send an inner list to each thread. This seems to work if my nested list only contains 2 elements - thus fewer elements than threads. Currently the script runs and then seems to hang. I monitor the resources on my Mac and Python starts off well, using 80% CPU, but when the 4th thread is created the CPU usage drops to 0%.
>>
>
> The odds are good that this is just going to run slower...

I've just been told that the GIL doesn't make things slower, but since I didn't even know such a thing existed, I went looking for more info and found this document: http://www.dabeaz.com/python/UnderstandingGIL.pdf
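
For reference, the experiment in that PDF boils down to something like the sketch below: the same CPU-bound loop run sequentially and then split across two threads. Because only one thread can hold the GIL at a time, the threaded version is expected to take at least as long as the sequential one (exact timings will vary by machine and Python version).

import threading
import time

N = 10000000  # enough iterations to make the timing visible

def count(n):
    # pure-Python CPU-bound loop; holds the GIL while it runs
    while n > 0:
        n -= 1

# run the work twice sequentially
start = time.time()
count(N)
count(N)
print('sequential :', time.time() - start)

# run the same total work in two threads
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
print('two threads:', time.time() - start)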

Is it current? I didn't know Python threads aren't preemptive. That seems really dated considering the state of the art in parallel execution on multi-core processors.

What's the catch in making Python threads preemptive? Are there any ongoing projects to do that?

> One: The common Python implementation uses a global interpreter lock
> to prevent interpreted code from interfering with itself in multiple
> threads. So "number cruncher" applications don't gain any speed from
> being partitioned into threads -- even on a multicore processor, only one
> thread can have the GIL at a time. On top of that, you have the overhead
> of the interpreter switching between threads (GIL release on one thread,
> GIL acquire for the next thread).
>
> Python threads work fine if the threads either rely on intelligent
> DLLs for number crunching (instead of doing nested Python loops to
> process a numeric array you pass it to something like NumPy which
> releases the GIL while crunching a copy of the array) or they do lots of
> I/O and have to wait for I/O devices (while one thread is waiting for
> the write/read operation to complete, another thread can do some number
> crunching).
>
> If you really need to do this type of number crunching in Python
> level code, you'll want to look into the multiprocessing library
> instead. That will create actual OS processes (each with a copy of the
> interpreter, and not sharing memory) and each of those can run on a core
> without conflicting on the GIL.

Which library do you suggest?
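
Would something like the following be the idea? Just a rough sketch using the standard multiprocessing.Pool; make_page() and the dummy pages list stand in for the OP's actual plotting code and data.

from multiprocessing import Pool

def make_page(page):
    # 'page' is one inner list of up to 16 dicts; this is where the
    # 4x4 histogram figure for that page would be built and saved
    return len(page)

if __name__ == '__main__':
    # stand-in for the real outer list of inner lists of dicts
    pages = [[{'value': i}] * 16 for i in range(10)]
    pool = Pool(processes=4)              # one worker process per core
    results = pool.map(make_page, pages)  # each page handled in a separate process
    pool.close()
    pool.join()
    print(results)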

> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfraed at ix.netcom.com HTTP://wlfraed.home.netcom.com/
>
> --
> http://mail.python.org/mailman/listinfo/python-list

