Is there a more efficient threading lock?

Thomas Passin list1 at tompassin.net
Sat Feb 25 17:20:20 EST 2023


On 2/25/2023 4:41 PM, Skip Montanaro wrote:
> Thanks for the responses.
> 
> Peter wrote:
> 
>> Which OS is this?
> 
> MacOS Ventura 13.1, M1 MacBook Pro (eight cores).
> 
> Thomas wrote:
> 
>  > I'm no expert on locks, but you don't usually want to keep a lock while
>  > some long-running computation goes on.  You want the computation to be
>  > done by a separate thread, put its results somewhere, and then notify
>  > the choreographing thread that the result is ready.
> 
> In this case I'm extracting the noun phrases from the body of an email 
> message (returned as a list). I have a collection of email messages 
> organized by month (typically 1000 to 3000 messages per month). I'm using 
> concurrent.futures.ThreadPoolExecutor() with the default number of 
> workers (os.cpu_count() * 1.5, or 12 threads on my system) to process 
> each month, so 12 active threads at a time. Given that the process is 
> pretty much CPU bound, maybe reducing the number of workers to the CPU 
> count would make sense. Processing of each email message enters that 
> with block once. That's about as minimal as I can make it. I thought for 
> a bit about pushing the textblob stuff into a separate worker thread, 
> but it wasn't obvious how to set up queues to handle the communication 
> between the threads created by ThreadPoolExecutor() and the worker 
> thread. Maybe I'll think about it harder. (I have a related problem with 
> SQLite, since an open database can't be manipulated from multiple 
> threads. That makes much of the program's end-of-run processing 
> single-threaded.)

If the noun extractor is single-threaded (which I think you mentioned), 
no amount of parallel access is going to help.  The best you can do is 
to queue up requests so that as soon as the noun extractor returns from 
one call, it gets handed another blob.  The CPU will be busy all the 
time running the noun-extraction code.
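One low-effort way to get that hand-off is a second executor with a single worker: its internal queue feeds the extractor a new blob as soon as the previous call returns. A sketch, with a dummy extractor standing in for the TextBlob/ConllExtractor call so it runs on its own:

```python
from concurrent.futures import ThreadPoolExecutor

# A one-worker executor serializes the calls: its internal queue hands
# the extractor a new blob as soon as the previous call returns.
extractor_pool = ThreadPoolExecutor(max_workers=1)


def extract_noun_phrases(text):
    # Stand-in for the real TextBlob/ConllExtractor call; here we just
    # treat capitalized words as "phrases" so the sketch is runnable.
    return [w for w in text.split() if w[:1].isupper()]


def extract(text):
    # Safe to call from any of the pool threads; the actual work is
    # funneled through the single worker above.
    return extractor_pool.submit(extract_noun_phrases, text).result()
```

The pool threads block in result() while they wait, which is no worse than blocking on the lock, and no hand-rolled queue plumbing is needed.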

If that's the case, you might as well eliminate all the threads and do it 
sequentially in the most obvious and simple manner.

It might be worthwhile to try this approach and see what happens to the 
CPU usage and overall computation time.
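A sequential baseline is easy to measure. Something like this (the extract argument stands in for whatever per-message function is being timed):

```python
import time


def process_sequentially(messages, extract):
    # Baseline: no threads, one message at a time, wall-clock timed.
    t0 = time.perf_counter()
    results = [extract(m) for m in messages]
    return results, time.perf_counter() - t0
```

If the threaded version isn't clearly faster than this, the threads are only adding complexity.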

>  > This link may be helpful -
>  >
>  > https://anandology.com/blog/using-iterators-and-generators/
> 
> I don't think that's where my problem is. The lock protects the 
> generation of the noun phrases. My loop which does the yielding operates 
> outside of that lock's control. The version of the code is my latest, in 
> which I tossed out a bunch of phrase-processing code (effectively dead 
> end ideas for processing the phrases). Replacing the for loop with a 
> simple return seems not to have any effect. In any case, the caller 
> which uses the phrases does a fair amount of extra work with the 
> phrases, populating a SQLite database, so I don't think the amount of 
> time it takes to process a single email message is dominated by the 
> phrase generation.
> 
> Here's timeit output for the noun_phrases code:
> 
> % python -m timeit -s 'text = """`python -m timeit --help`""" ; from 
> textblob import TextBlob ; from textblob.np_extractors import 
> ConllExtractor ; ext = ConllExtractor() ; phrases = TextBlob(text, 
> np_extractor=ext).noun_phrases' 'phrases = TextBlob(text, 
> np_extractor=ext).noun_phrases'
> 5000 loops, best of 5: 98.7 usec per loop
> 
> I process the output of timeit's help message which looks to be about 
> the same length as a typical email message, certainly the same order of 
> magnitude. Also, note that I call it once in the setup to eliminate the 
> initial training of the ConllExtractor instance. I don't know if ~100us 
> qualifies as long running or not.
> 
> I'll keep messing with it.
> 
> Skip



More information about the Python-list mailing list