BeautifulSoup doesn't work with a threaded input queue?
Paul Rubin
no.email at nospam.invalid
Sun Aug 27 17:23:58 EDT 2017
Christopher Reimer <christopher_reimer at yahoo.com> writes:
> I have 20 read_threads requesting and putting pages into the output
> queue that is the input_queue for the parser.
Given how slow parsing is, you probably want to scrape the pages into
disk files, and then run the parser in parallel processes that read from
the disk. You could also use something like Redis (redis.io) as a queue.
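
A rough sketch of the disk-file approach, assuming the fetcher threads have
already saved each page as a .html file in one directory. The names
parse_file and parse_all are made up for illustration, and the stdlib
HTMLParser stands in for BeautifulSoup so the example runs without extra
packages; in practice the body of parse_file would be whatever BeautifulSoup
extraction you need.

```python
from html.parser import HTMLParser
from multiprocessing import Pool
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_file(path):
    """Parse one saved page and return (file name, title).

    Hypothetical helper: swap the TitleParser lines for your own
    BeautifulSoup code.
    """
    parser = TitleParser()
    parser.feed(Path(path).read_text(encoding="utf-8", errors="replace"))
    return Path(path).name, parser.title.strip()


def parse_all(page_dir, workers=4):
    """Parse every *.html file in page_dir with a pool of worker processes."""
    paths = sorted(str(p) for p in Path(page_dir).glob("*.html"))
    with Pool(processes=workers) as pool:
        # Each path goes to a separate process, so the CPU-bound
        # parsing is not serialized by the GIL.
        return pool.map(parse_file, paths)
```

Call parse_all("pages") from under an if __name__ == "__main__": guard, since
multiprocessing re-imports the main module on platforms that spawn rather
than fork. The directory name "pages" and the worker count are placeholders.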