urllib sleeping?

Kragen Sitaker kragen at dnaco.net
Sat Sep 16 17:08:58 EDT 2000


I had a bit of Perl for hammering on Web servers.  I translated it to
Python, and I'm getting strange performance differences --- Python
seems much kinder to my CPU, but seems to be gratuitously sleeping.

Here's what I'm seeing:

[kragen at kragen devel]$ time ./http_client.py http://localhost/ 60 | tail -1
0.14user 0.00system 1:00.04elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (362major+12086minor)pagefaults 0swaps
20670 969130592.858 total conns in 60.0079900026 seconds: 5969
[kragen at kragen devel]$ time ./http_client.plx http://localhost/ 60 | tail -1
26.22user 6.84system 0:59.70elapsed 55%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (308major+105385minor)pagefaults 0swaps
20733 969130659 total conns in 60 seconds: 21018

So the Python version takes roughly 1/200 the CPU time that the Perl
version does, but gets in a bit less than a third of the HTTP
requests.  What is my computer doing for all that extra time?
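
To put that in per-request terms, here is a small back-of-the-envelope calculation. It uses only the figures from the two runs above; the helper name per_request_ms is made up for this sketch, and "off-CPU" just means time not charged to the client process (waiting on the server, the kernel, or the network).

```python
# Figures copied from the two runs quoted above.
python_run = {"cpu": 0.14 + 0.00, "elapsed": 60.0079900026, "conns": 5969}
perl_run = {"cpu": 26.22 + 6.84, "elapsed": 60.0, "conns": 21018}


def per_request_ms(seconds, conns):
    """Average time per request, in milliseconds."""
    return seconds / conns * 1000.0


for name, run in (("python", python_run), ("perl", perl_run)):
    wall = per_request_ms(run["elapsed"], run["conns"])
    cpu = per_request_ms(run["cpu"], run["conns"])
    # Whatever is not client CPU time was spent off-CPU from the
    # client's point of view.
    print("%-6s %6.2f ms wall, %6.3f ms CPU, %6.2f ms off-CPU per request"
          % (name, wall, cpu, wall - cpu))
```

That works out to about 10 ms of wall-clock time per request for the Python client against just under 3 ms for the Perl one, even though the Python client spends far less of that time on the CPU.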

Increasing the number of concurrent Perl clients up to 6 does not
change the total number of HTTP transactions, either positively or
negatively.  I'd have expected it to hurt performance, since a single
Perl client is already using about half the CPU.

Increasing the number of concurrent Python clients up to 4 increases
performance linearly, up to about 10% less than the Perl version; going
up to 5 or 6 concurrent Python clients doesn't increase performance any
further.

My machine is Linux kragen.internal.knownow.com 2.2.12-32 #1 Mon Oct 25
19:56:23 EDT 1999 i686 unknown, running Red Hat 6.2 or something.

Looking at strace doesn't show anything the Python client is doing that
I'd expect to cause it to spend more time sleeping than a theoretical
ideal client --- except that it does perform several send() calls to
send its HTTP request, while the LWP function performs a single
write().
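
One way to test whether the split send() calls matter would be a client that builds the whole request first and hands it to the kernel in one call. The following is only a sketch under assumptions: it is written for a current Python 3 rather than the interpreter used above, it bypasses urllib entirely, and build_request and fetch_once are hypothetical names invented here. For a request this small, sendall() would normally result in a single send on the socket, but that is not guaranteed.

```python
import socket


def build_request(host, path="/"):
    """Return a complete HTTP/1.0 GET request as one bytes object."""
    lines = [
        "GET %s HTTP/1.0" % path,
        "Host: %s" % host,
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_once(host, port=80, path="/"):
    """Fetch one page, passing the whole request to the socket at once."""
    request = build_request(host, path)
    chunks = []
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)  # request line and headers together
        while True:
            data = sock.recv(65536)
            if not data:  # server closed the connection: response done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch_once("localhost") in place of get(url) in the loop and comparing the connection counts would show whether the number of send() calls accounts for any of the difference.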

Here's the Perl code:

#!/usr/bin/perl -w
# Copyright 2000 KnowNow, Inc.  All rights reserved.
use strict;
use LWP::Simple;

sub logMsg
{
    print "$$ ", time, " ", @_, "\n";
}

my ($url, $seconds) = @ARGV;
if (not $url) 
{
    die "Usage: $0 url seconds\n" .
        " to hit 'url' repeatedly for 'seconds' seconds.\n";
}

my $start = time;
my $conns = 0;
while (time < $start + $seconds)
{
    logMsg "getting $url";
    get $url;
    logMsg "got $url";
    $conns++;
}

logMsg "total conns in $seconds seconds: $conns";

And here's the Python version; it's about 50% longer, mostly to add a
get() like the one in LWP::Simple, and it puts everything in functions
so I could try running it under the profiler:

#!/usr/bin/env python
# Copyright 2000 KnowNow, Inc.  All rights reserved.
import urllib
import sys
import posix
import time

def logMsg(msg):
    print("%s %s %s" % (posix.getpid(), time.time(), msg))

def get(url):
    conn = urllib.urlopen(url)
    connopen = 1
    text = ""
    while connopen:
        newtext = conn.read()
        if newtext == "":
            connopen = 0
            conn.close()
        else:
            text = text + newtext
    return text

def fetchfor(url, seconds):
    seconds = float(seconds)
    start = time.time()
    conns = 0
    while time.time() < start + seconds:
        logMsg("getting %s" % url)
        get(url)
        logMsg("got %s" % url)
        conns = conns + 1
        
    logMsg("total conns in %s seconds: %s" % (time.time() - start, conns))

def run():
    if len(sys.argv) != 3:
        sys.stderr.write(("Usage: %s url seconds\n" +
                          " to fetch 'url' repeatedly for 'seconds' seconds.\n")
                         % sys.argv[0])
        sys.exit(1)
    fetchfor(sys.argv[1], sys.argv[2])

run()
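
Since the question is about elapsed time rather than CPU time, timing each request individually may say more than the total count does. Below is a minimal sketch, assuming a current Python 3; time_calls and summarize are hypothetical helpers, not part of the script above, and fetch stands for any zero-argument callable that performs one request.

```python
import time


def time_calls(fetch, count):
    """Call fetch() count times; return each call's duration in seconds."""
    durations = []
    for _ in range(count):
        started = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - started)
    return durations


def summarize(durations):
    """Return (fastest, middle, slowest) durations in milliseconds.

    "Middle" is the element at index len // 2 of the sorted list, which
    is the upper median when the count is even.
    """
    ordered = sorted(durations)
    middle = ordered[len(ordered) // 2]
    return (ordered[0] * 1000.0, middle * 1000.0, ordered[-1] * 1000.0)
```

With a Python 3 port of the script, something like summarize(time_calls(lambda: get(url), 1000)) would show whether every request takes about the same time or whether a few slow ones dominate.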


imitatively y'rs - kragen
-- 
<kragen at pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
Perilous to all of us are the devices of an art deeper than we ourselves
possess.
                -- Gandalf the Grey [J.R.R. Tolkien, "Lord of the Rings"]


