[Chicago] threading is slow
Brantley Harris
deadwisdom at gmail.com
Thu Mar 7 01:00:17 CET 2013
I agree with Mr. Griffin here.
On Wed, Mar 6, 2013 at 5:35 PM, Daniel Griffin <dgriff1 at gmail.com> wrote:
> What sort of speed are you looking for here? Does the ordering matter? If
> not then you can just do a multiprocessing Pool and call map with a chunk
> of the million int pairs.
>
>
> On Wed, Mar 6, 2013 at 4:31 PM, Oren Livne <livne at uchicago.edu> wrote:
>
>> Thanks so much for all your answers!
>>
>> I have a text file with a million int pairs, each of which can be
>> processed immediately. I would like to set up a queue to read lines from
>> the file and feed a thread pool that will process it in parallel and output
>> into (say) another queue, to be processed by another thread that prints the
>> results.
>>
>>
>> On 3/6/2013 5:19 PM, Brantley Harris wrote:
>>
>> Whoa, back up. What are you trying to do with threads?
>>
>>
>> On Wed, Mar 6, 2013 at 5:05 PM, Daniel Griffin <dgriff1 at gmail.com> wrote:
>>
>>> Python has a GIL, so threads mostly sort of suck for CPU-bound work. Use
>>> multiprocessing, Twisted, or Celery.
>>>
>>>
>>> On Wed, Mar 6, 2013 at 3:29 PM, Oren Livne <livne at uchicago.edu> wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am new to Python multithreading. It seems that using more threads
>>>> causes a slowdown rather than a speedup. Should I be using the
>>>> multiprocessing module instead? Any good examples of workers reading
>>>> from a queue with multiprocessing?
>>>>
>>>> Thanks so much,
>>>> Oren
>>>>
>>>> #!/usr/bin/env python
>>>> '''Sum up the first 100000000 numbers. Time the speed-up of using
>>>> multithreading.'''
>>>> import threading, time, numpy as np
>>>>
>>>> class SumThread(threading.Thread):
>>>>     def __init__(self, a, b):
>>>>         threading.Thread.__init__(self)
>>>>         self.a = a
>>>>         self.b = b
>>>>         self.s = 0
>>>>
>>>>     def run(self):
>>>>         self.s = sum(i for i in xrange(self.a, self.b))
>>>>
>>>> def main(num_threads):
>>>>     start = time.time()
>>>>     # split [0, 100000000] into num_threads equal ranges
>>>>     a = map(int, np.linspace(0, 100000000, num_threads + 1, True))
>>>>     # spawn one thread per range
>>>>     threads = []
>>>>     for i in xrange(num_threads):
>>>>         t = SumThread(a[i], a[i + 1])
>>>>         t.setDaemon(True)
>>>>         t.start()
>>>>         threads.append(t)
>>>>
>>>>     # wait for all threads to complete
>>>>     for t in threads:
>>>>         t.join()
>>>>
>>>>     # fetch and combine the partial results
>>>>     s = sum(t.s for t in threads)
>>>>     print '#threads = %d, result = %10d, elapsed Time: %s' % \
>>>>         (num_threads, s, time.time() - start)
>>>>
>>>> for n in 2 ** np.arange(4):
>>>>     main(n)
>>>>
>>>> Output:
>>>> #threads = 1, result = 4999999950000000, elapsed Time: 12.3320000172
>>>> #threads = 2, result = 4999999950000000, elapsed Time: 16.5600001812 ???
>>>> #threads = 4, result = 4999999950000000, elapsed Time: 16.7489998341 ???
>>>> #threads = 8, result = 4999999950000000, elapsed Time: 16.6720001698 ???
>>>>
>>>> _______________________________________________
>>>> Chicago mailing list
>>>> Chicago at python.org
>>>> http://mail.python.org/mailman/listinfo/chicago
>>>>
>>>
>>>
>>
>>
>>
>>
>>
>> --
>> A person is just about as big as the things that make him angry.
>>
>>
>