urllib's performance

Mikhail Sobolev mss at transas.com
Thu May 17 15:17:03 EDT 2001


In article <9e0f3h$scf$1 at panix3.panix.com>, Aahz Maruch <aahz at panix.com> wrote:
>In article <9dmkbn$gm2$1 at harvester.transas.com>,
>Mikhail Sobolev <mss at transas.com> wrote:
>>
>>Just to give a little bit more background.  I have a server program that can
>>create a number of working threads.  What I tried to look at is how the number
>>of working threads and the number of concurrent clients correlate.  So my
>>script looked like:
>>
>>    for thread_no in range (1, 21): # as 20 seems to be a reasonable limit as
>>                                    # each working thread on server uses a
>>                                    # significant amount of memory
>>        for client_no in range (1, 21): # maybe 20 is not sufficient, but let's
>>                                        # have a look on that many concurrent
>>                                        # clients
>>            for client in range (1, client_no+1):
>>                thread.start_new (client_proc, args)
>>
>>            wait_for_threads_to_finish ()
>>
>>And the problem is that the request rate of each client_proc is not sufficient.
>
>You've got two problems.  First of all, you're not spawning a consistent
>number of threads; you keep rejiggering the client load.  Secondly, it
>takes time to create/destroy threads; you're better off with a thread
>pool.  As I said earlier, take a look at
>http://starship.python.net/crew/aahz/
>for some examples of building a web client.

Sorry, I still do not quite understand.  The client_proc function looks like:
    def client_proc(args):
        # Each client thread just fetches its list of URLs sequentially.
        for arg in args:
            urllib.urlopen(base_url + arg).read()
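
To see where the time actually goes, I can time a single fetch in isolation.
A rough sketch of what I mean (base_url and the request path below are just
placeholders, not my real setup):

    import time
    import urllib

    base_url = 'http://localhost:8080/'    # placeholder; the real server differs
    path = 'some/request'                  # placeholder request path

    start = time.time()
    data = urllib.urlopen(base_url + path).read()
    elapsed = time.time() - start
    print 'fetched %d bytes in %.3f seconds' % (len(data), elapsed)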

And I would say that the problem is in those urlopen() calls themselves, not in
the way the threads are created.  Maybe I am just missing something.
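
Even if I restructure this around a fixed pool of worker threads pulling URLs
off a Queue, as you suggest, each worker still does nothing but
urlopen().read(), so I do not see how that changes the per-request rate.
Roughly what I have in mind (worker, run_pool and the queue names are just
placeholders of mine, not code I am actually running):

    import thread
    import urllib
    import Queue

    def worker(url_queue, done_queue):
        # Each worker pulls URLs until it sees the None sentinel, then reports done.
        while 1:
            url = url_queue.get()
            if url is None:
                break
            urllib.urlopen(url).read()
        done_queue.put(1)

    def run_pool(urls, num_workers):
        url_queue = Queue.Queue()
        done_queue = Queue.Queue()
        for i in range(num_workers):
            thread.start_new_thread(worker, (url_queue, done_queue))
        for url in urls:
            url_queue.put(url)
        for i in range(num_workers):
            url_queue.put(None)         # one sentinel per worker
        for i in range(num_workers):
            done_queue.get()            # wait until every worker has finished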

Thanks,

--
Misha


