[Chicago] threading is slow

Daniel Griffin dgriff1 at gmail.com
Thu Mar 7 00:35:59 CET 2013


What sort of speed are you looking for here? Does the ordering matter? If
not, you can just use a multiprocessing Pool and call map over chunks of
the million int pairs.
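
Something along these lines might do as a starting point. It is only a
sketch in Python 3: the one-pair-per-line "a b" format and the process_pair
function are my assumptions standing in for your real file and your real
per-pair work, and the in-memory sample stands in for reading the file.

```python
# Sketch only: process_pair and the "a b" line format are assumptions,
# not something taken from the original post.
from multiprocessing import Pool


def parse_line(line):
    """Turn a text line such as '3 4' into the int pair (3, 4)."""
    a, b = line.split()
    return int(a), int(b)


def process_pair(pair):
    """Placeholder for the real per-pair work; here it just adds."""
    a, b = pair
    return a + b


def process_lines(lines, processes=4, chunksize=10000):
    """Map process_pair over all parsed lines using a process pool."""
    pairs = [parse_line(line) for line in lines if line.strip()]
    with Pool(processes) as pool:
        # map() keeps results in input order; chunksize batches the pairs
        # so each worker receives many pairs per round trip.
        return pool.map(process_pair, pairs, chunksize)


if __name__ == "__main__":
    # Stand-in for reading the real file line by line.
    sample = ["%d %d" % (i, 2 * i) for i in range(1000)]
    results = process_lines(sample, chunksize=100)
    print(len(results), results[:3])
```

Pool.map returns results in input order. If ordering really does not
matter, pool.imap_unordered(process_pair, pairs, chunksize) yields results
as the workers finish them.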


On Wed, Mar 6, 2013 at 4:31 PM, Oren Livne <livne at uchicago.edu> wrote:

>  Thanks so much for all your answers!
>
> I have a text file with a million int pairs, each of which can be
> processed immediately. I would like to set up a queue to read lines from
> the file and feed a thread pool that will process them in parallel and
> write the results to (say) another queue, to be processed by another
> thread that prints the results.
>
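
The pipeline described above (input queue, pool of workers, output queue,
one consumer of results) could be sketched roughly like this with plain
threads in Python 3. handle_pair and the "a b" line format are placeholders
I made up. Because of the GIL this shape will not speed up CPU-bound work;
replacing the worker threads with multiprocessing.Process and the queues
with multiprocessing.Queue is the usual way to get the same shape onto
several cores.

```python
# Sketch only: handle_pair and the "a b" line format are assumptions.
import queue
import threading

STOP = None  # sentinel that tells a consumer to exit


def handle_pair(a, b):
    """Placeholder for the real per-pair work."""
    return a * b


def worker(in_q, out_q):
    """Take lines from in_q until STOP, put results on out_q."""
    while True:
        line = in_q.get()
        if line is STOP:
            break
        a, b = map(int, line.split())
        out_q.put((a, b, handle_pair(a, b)))


def collector(out_q, sink):
    """Single consumer of results; a real one might print instead."""
    while True:
        item = out_q.get()
        if item is STOP:
            break
        sink.append(item)


def run_pipeline(lines, num_workers=4):
    """Feed lines through the worker threads and return all results."""
    in_q, out_q, sink = queue.Queue(maxsize=1000), queue.Queue(), []
    workers = [threading.Thread(target=worker, args=(in_q, out_q))
               for _ in range(num_workers)]
    printer = threading.Thread(target=collector, args=(out_q, sink))
    for t in workers + [printer]:
        t.start()
    for line in lines:      # producer: feed the input queue
        in_q.put(line)
    for _ in workers:       # one STOP per worker
        in_q.put(STOP)
    for t in workers:
        t.join()
    out_q.put(STOP)         # all workers are done, so stop the collector
    printer.join()
    return sink


if __name__ == "__main__":
    print(sorted(run_pipeline(["1 2", "3 4", "5 6"])))
```

Results arrive in whatever order the workers finish, which is why the
usage line sorts them before printing.
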
>
> On 3/6/2013 5:19 PM, Brantley Harris wrote:
>
> Whoa, back up.  What are you trying to do with threads?
>
>
> On Wed, Mar 6, 2013 at 5:05 PM, Daniel Griffin <dgriff1 at gmail.com> wrote:
>
>> Python has a GIL, so threads mostly sort of suck for CPU-bound work. Use
>> multiprocessing, Twisted, or Celery.
>>
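
For the CPU-bound sum benchmark quoted further down, a process-based
version might look like the following Python 3 sketch. The helper names are
mine, and n is kept small here; it illustrates splitting the range across a
Pool and is not a measured result.

```python
# Sketch only: the helper names are illustrative, not from the thread.
from multiprocessing import Pool


def split_range(n, parts):
    """Return parts + 1 boundaries cutting range(n) into near-equal pieces."""
    return [n * i // parts for i in range(parts + 1)]


def partial_sum(bounds):
    """Sum the integers in [lo, hi); runs inside a worker process."""
    lo, hi = bounds
    return sum(range(lo, hi))


def parallel_sum(n, processes):
    """Sum range(n) by giving each process one contiguous sub-range."""
    edges = split_range(n, processes)
    with Pool(processes) as pool:
        return sum(pool.map(partial_sum, zip(edges[:-1], edges[1:])))


if __name__ == "__main__":
    n = 10 ** 6
    total = parallel_sum(n, 4)
    assert total == n * (n - 1) // 2  # closed form for 0 + 1 + ... + (n - 1)
    print(total)
```

Each worker process has its own interpreter and its own GIL, so the partial
sums can run on separate cores.
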
>>
>> On Wed, Mar 6, 2013 at 3:29 PM, Oren Livne <livne at uchicago.edu> wrote:
>>
>>> Dear All,
>>>
>>> I am new to Python multithreading. It seems that using threading causes
>>> a slowdown with more threads rather than a speedup. Should I be using the
>>> multiprocessing module instead? Any good examples for threads reading from
>>> a queue with multiprocessing?
>>>
>>> Thanks so much,
>>> Oren
>>>
>>> #!/usr/bin/env python
>>> '''Sum up the first 100000000 numbers. Time the speed-up of using
>>> multithreading.'''
>>> import threading, time, numpy as np
>>>
>>> class SumThread(threading.Thread):
>>>     def __init__(self, a, b):
>>>         threading.Thread.__init__(self)
>>>         self.a = a
>>>         self.b = b
>>>         self.s = 0
>>>
>>>     def run(self):
>>>         self.s = sum(i for i in xrange(self.a, self.b))
>>>
>>> def main(num_threads):
>>>     start = time.time()
>>>     a = map(int, np.linspace(0, 100000000, num_threads + 1,
>>>                              endpoint=True))
>>>     # Spawn one thread per sub-range [a[i], a[i + 1])
>>>     threads = []
>>>     for i in xrange(num_threads):
>>>         t = SumThread(a[i], a[i + 1])
>>>         t.setDaemon(True)
>>>         t.start()
>>>         threads.append(t)
>>>
>>>     # Wait for all threads to complete
>>>     for t in threads:
>>>         t.join()
>>>
>>>     # Fetch results
>>>     s = sum(t.s for t in threads)
>>>     print('#threads = %d, result = %10d, elapsed Time: %s'
>>>           % (num_threads, s, time.time() - start))
>>>
>>> for n in 2 ** np.arange(4):
>>>     main(n)
>>>
>>> Output:
>>> #threads = 1, result = 4999999950000000, elapsed Time: 12.3320000172
>>> #threads = 2, result = 4999999950000000, elapsed Time: 16.5600001812  ???
>>> #threads = 4, result = 4999999950000000, elapsed Time: 16.7489998341  ???
>>> #threads = 8, result = 4999999950000000, elapsed Time: 16.6720001698  ???
>>>
>>> _______________________________________________
>>> Chicago mailing list
>>> Chicago at python.org
>>> http://mail.python.org/mailman/listinfo/chicago
>>>
>>
>
> --
> A person is just about as big as the things that make him angry.