asyncio application hangs

Yaşar Arabacı yasar11732 at gmail.com
Mon Jul 21 18:19:42 EDT 2014


I am trying to grasp how asyncio works today. Based on the examples
I found in the docs, I wrote a simple program like this:

import asyncio
import urllib.request
import urllib.parse

@asyncio.coroutine
def print_status_code(url_q):
    while True:
        url = yield from url_q.get()
        print('URL received from q:', url)
        if url is None:
            return

        url = urllib.parse.urlsplit(url)

        reader, writer = yield from asyncio.open_connection(url.hostname, 80)

        query = ('GET {url.path} HTTP/1.0\r\n'
                 'Host: {url.hostname}\r\n'
                 '\r\n').format(url=url)

        writer.write(query.encode('latin-1'))
        line = yield from reader.readline()
        code = line.decode('latin-1').split()[1]  # status code is the second field of the status line
        print("(%s) %s" % (code, url.path))

if __name__ == "__main__":
    from bs4 import BeautifulSoup as bs
    sitemap = urllib.request.urlopen('http://ysar.net/sitemap.xml').read()
    soup = bs(sitemap)
    print('soup created')
    tasks = []

    num_coroutines = 10

    q = asyncio.Queue()

    for i in range(num_coroutines):  # start 10 tasks
        tasks.append(asyncio.Task(print_status_code(q)))

    for loc in soup.find_all('loc'):
        q.put(loc.string)

    for i in range(num_coroutines):  # Put poison pill.
        q.put(None)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(tasks))

This program is supposed to print the status codes of the web pages
listed in my sitemap.xml file. But the program hangs as the Tasks wait
to get something out of the Queue. I think it has something to do with
how I am using asyncio.Queue, but I couldn't figure out what I am
doing wrong here. Can anyone help me with that?
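For reference, here is a minimal, self-contained sketch of the Queue
calls involved (the URL is just a placeholder): `Queue.put()` is a
coroutine, so calling it from plain synchronous code only creates a
coroutine object and never actually enqueues anything, whereas
`Queue.put_nowait()` is an ordinary method that enqueues immediately.

```python
import asyncio

q = asyncio.Queue()

# put() is a coroutine: calling it outside the event loop only creates
# a coroutine object; nothing is enqueued unless the loop drives it.
q.put('http://example.com/page')  # no effect on the queue (placeholder URL)

# put_nowait() is a plain method: the item lands in the queue immediately.
q.put_nowait('http://example.com/page')
q.put_nowait(None)  # poison pill

print(q.qsize())  # only the two put_nowait() items are in the queue
```

(The un-awaited `q.put(...)` call also triggers a "coroutine was never
awaited" RuntimeWarning, which is a hint that it never ran.)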
-- 
http://ysar.net/



More information about the Python-list mailing list