[melbourne-pug] Joblib question
Mike Dewhirst
miked at dewhirst.com.au
Sat Mar 10 01:03:59 EST 2018
On 10/03/2018 12:33 PM, paul sorenson wrote:
>
> Mike,
>
> Are there unique features of joblib that you need to use?
>
I was seduced by "Parallel". On reading the docs a little more
diligently, it seems best suited to heavy compute-bound work such as
scientific number crunching, plus caching results on disk to avoid
re-computing them.
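
For the archive, here is a minimal sketch of how the commented-out call
in my quoted message below could be written. The TypeError appears to
come from handing the whole generator expression to delayed(); the
documented pattern is delayed(function)(arguments) inside the generator.
The helper bodies, the make_link_and_scrape wrapper, the example urls
and the substance name are hypothetical stand-ins (the real helpers also
take the Link class), and prefer="threads" is my assumption, since
scraping mostly waits on the network.

```python
from joblib import Parallel, delayed

# Hypothetical url templates; "{name}" marks the gap to be filled.
databases = [
    "https://example.org/a?q={name}",
    "https://example.org/no-gap",
    "https://example.org/b?q={name}",
]


def create_useful_link(substance, db):
    # Stand-in: fill the gap, or return None when there is no gap to fill.
    return db.format(name=substance) if "{name}" in db else None


def scrape_db(substance, link, db):
    # Stand-in: the real function creates records and returns nothing.
    # Returning the link here just makes the result visible.
    return link


def make_link_and_scrape(substance, db):
    # One unit of work per database, so it can be wrapped by delayed().
    link = create_useful_link(substance, db)
    if link:
        return scrape_db(substance, link, db)
    return None


def make_links(substance):
    # delayed() wraps the function itself; the arguments go in the
    # second pair of parentheses, once per item in the generator.
    return Parallel(n_jobs=2, prefer="threads")(
        delayed(make_link_and_scrape)(substance, db) for db in databases
    )


results = make_links("benzene")
```

Dropping prefer="threads" falls back to joblib's default process-based
workers, in which case everything passed to the workers has to be
picklable.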
> Scraping web pages is often a good candidate for asyncio based models.
>
I think I'm being seduced by the "io" in the name. I do judge books by
their cover, so I think I'll read up on asyncio.
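
A rough sketch of the asyncio shape, as I understand it so far. It runs
in a single thread and overlaps the waiting rather than using separate
cores. The urls are made up, and asyncio.sleep() stands in for the real
network fetch, which would need an async HTTP client.

```python
import asyncio

# Hypothetical urls standing in for the real databases list.
databases = [
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c",
]


async def scrape_db(db):
    # Stand-in for the network round trip. Awaiting here yields control,
    # so the other coroutines can make progress in the meantime.
    await asyncio.sleep(0.1)
    return f"scraped {db}"


async def make_links():
    # gather() runs the coroutines concurrently and returns their
    # results in the same order as the inputs.
    return await asyncio.gather(*(scrape_db(db) for db in databases))


results = asyncio.run(make_links())
```

The three simulated fetches overlap, so the whole run takes roughly as
long as one of them rather than the sum of all three.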
Thanks Paul
Mike
>
> cheers
>
>
> On 03/08/2018 11:41 PM, Mike Dewhirst wrote:
>> https://media.readthedocs.org/pdf/joblib/latest/joblib.pdf
>>
>> I'm trying to make the following code run in parallel on separate CPU
>> cores but haven't had any success.
>>
>>     def make_links(self):
>>         for db in databases:
>>             link = create_useful_link(self, Link, db)
>>             if link:
>>                 scrape_db(self, link, db)
>>
>> This is a web scraper which is working nicely in a leisurely
>> sequential manner. databases is a list of urls with gaps to be
>> filled by create_useful_link(), which makes a link record from the
>> Link class. The self instance is a source of attributes for filling
>> the url gaps.
>>
>> self is a chemical substance, and the link record's url field, when
>> clicked in a browser, brings up that external website with the
>> chemical substance selected for researching by the viewer. If
>> successful, we then fetch the external page, scrape a bunch of
>> interesting data from it and turn that into substance notes.
>> scrape_db() doesn't return anything but it does create up to nine
>> other records.
>>
>> from joblib import Parallel, delayed
>>
>> class Substance( etc ..
>>     ...
>>     def make_links(self):
>>         #Parallel(n_jobs=-2)(delayed(
>>         #    scrape_db(self, create_useful_link(self, Link, db), db) for db in databases
>>         #))
>>
>> I'm getting a TypeError from Parallel delayed() - can't pickle
>> generator objects.
>>
>> So my question is how to write the commented code properly? I suspect
>> I haven't done enough comprehension.
>>
>> Thanks for any help
>>
>> Mike