Real-world use of concurrent.futures

Terry Reedy tjreedy at udel.edu
Thu May 8 16:27:03 EDT 2014


On 5/8/2014 2:55 PM, Andrew McLean wrote:
> I have a problem that would benefit from a multithreaded implementation
> and I am having trouble understanding how to approach it using
> concurrent.futures.
>
> The details don't really matter, but it will probably help to be
> explicit. I have a large CSV file that contains a lot of fields, amongst
> them one containing email addresses. I want to write a program that
> validates the email addresses by checking that the domain names have a
> valid MX record. The output will be a copy of the file with any invalid
> email addresses removed. Because of latency in the DNS lookup this could
> benefit from multithreading.
>
> I have written similar code in the past using explicit threads
> communicating via queues. For this example, I could have a thread that
> read the file using csv.DictReader, putting dicts containing records
> from the input file into a (finite length) queue. Then I would have a
> number of worker threads reading the queue, performing the validation
> and putting validated results in a second queue. A final thread would
> read from the second queue writing the results to the output file.
>
> So far so good. However, I thought this would be an opportunity to
> explore concurrent.futures and to see whether it offered any benefits
> over the more explicit approach discussed above. The problem I am having
> is that all the discussions I can find of the use of concurrent.futures
> show use with toy problems involving just a few tasks.
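
For what it is worth, a minimal sketch of the validation step with
concurrent.futures might look like the following. The names has_valid_mx,
is_valid, filter_rows and copy_valid, and the "email" column name, are
my own assumptions for illustration; has_valid_mx is only a stand-in
that does no DNS query, so a real MX lookup would have to replace it.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def has_valid_mx(domain):
    """Hypothetical stand-in for a real MX lookup; performs no DNS query."""
    return domain.lower() in {"example.com", "example.org"}


def is_valid(row, email_field="email"):
    """Return True if the row's address has a domain that passes the check."""
    # rpartition returns an empty separator when there is no "@" at all.
    _, sep, domain = row[email_field].rpartition("@")
    return bool(sep) and has_valid_mx(domain)


def filter_rows(rows, max_workers=8):
    """Return the rows that pass is_valid, in their original order."""
    rows = list(rows)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Executor.map yields results in input order, so they line up with rows.
        verdicts = executor.map(is_valid, rows)
        return [row for row, ok in zip(rows, verdicts) if ok]


def copy_valid(infile, outfile):
    """Copy a CSV file, dropping rows whose email address fails validation."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(filter_rows(reader))
```

Note that Executor.map submits every task up front and this sketch also
reads all rows into a list, so for a very large file you would probably
want to feed the executor in chunks to bound memory use.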

You might look at the new asyncio module in 3.4 (a backport is available
on PyPI, I believe). Among other things, it uses its own variation on the
futures in concurrent.futures, and it supports timeouts.
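
As a rough illustration of the timeout support, here is a sketch using
asyncio.wait_for. It is written in the async/await syntax of later Python
versions (asyncio.run needs 3.7 or newer); in 3.4 the same idea would be
spelled with coroutine generators and "yield from". The functions
lookup_mx, check_domain, check_all and run_checks are hypothetical names,
and lookup_mx just sleeps instead of querying DNS.

```python
import asyncio


async def lookup_mx(domain):
    """Hypothetical stand-in for an async MX lookup; sleeps, no DNS query."""
    delay = 10.0 if domain == "slow.example" else 0.01
    await asyncio.sleep(delay)
    return domain.endswith(".com")


async def check_domain(domain, timeout=0.2):
    """Return the lookup result, or False if it exceeds timeout seconds."""
    try:
        return await asyncio.wait_for(lookup_mx(domain), timeout)
    except asyncio.TimeoutError:
        # Treat a lookup that takes too long as a failed validation.
        return False


async def check_all(domains):
    """Run all checks concurrently and map each domain to its result."""
    results = await asyncio.gather(*(check_domain(d) for d in domains))
    return dict(zip(domains, results))


def run_checks(domains):
    """Synchronous entry point that drives the event loop."""
    return asyncio.run(check_all(domains))
```

Because the lookups run concurrently, the slow one only costs its
timeout rather than holding up the rest.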

-- 
Terry Jan Reedy



